DNA matching has become a standard tool in forensic science and personal identification, offering a method to connect individuals to biological evidence. The technique compares genetic material recovered from a sample with an individual’s DNA at specific locations, or markers, across the human genome. Because the vast majority of human DNA is identical from person to person, scientists focus on small, highly variable segments to build a distinctive genetic profile, often called a genetic fingerprint. The strength of an identification depends on how many markers are analyzed and how rare the resulting combination is in the general population.
Understanding Short Tandem Repeats
The specific regions of DNA used for identification are called Short Tandem Repeats (STRs). These are short sequences of two to seven base pairs that are repeated multiple times in a row at a particular location on a chromosome. For example, the sequence “AGAT” might be repeated 10 times in one person and 12 times in another person at the same location.
STRs are well suited to identification because the loci chosen for testing lie in non-coding regions of DNA, so variation in the number of repeats is not expected to affect a person’s traits or health. At each location, every individual inherits two copies, one from each parent, and the two copies may carry different repeat counts. By measuring the length of the STR sequence at a given locus, scientists determine an individual’s genotype for that specific marker, that is, the pair of repeat counts the person carries there. Analyzing multiple STR locations generates a highly distinctive genetic profile.
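The idea of reading a genotype as a pair of repeat counts can be sketched in a few lines of Python. This is a toy illustration only: the function names and the flanking sequences are hypothetical, and the sketch counts repeats directly in a text string, whereas laboratories commonly infer the repeat count from the measured length of the DNA fragment.

```python
import re


def longest_repeat_run(sequence: str, motif: str) -> int:
    """Return the largest number of back-to-back copies of `motif` in `sequence`."""
    if not motif:
        raise ValueError("motif must be a non-empty string")
    # The lookahead lets a run start at any position, so motifs that overlap
    # themselves are still counted correctly.
    pattern = re.compile(f"(?=((?:{re.escape(motif)})+))")
    runs = [len(m.group(1)) // len(motif) for m in pattern.finditer(sequence)]
    return max(runs, default=0)


def genotype(copy_one: str, copy_two: str, motif: str) -> tuple[int, int]:
    """Pair the repeat counts from the two inherited copies, smaller first."""
    counts = sorted(
        (longest_repeat_run(copy_one, motif), longest_repeat_run(copy_two, motif))
    )
    return counts[0], counts[1]


# Hypothetical flanking sequence around a made-up AGAT repeat region.
maternal_copy = "TTC" + "AGAT" * 10 + "GGA"
paternal_copy = "TTC" + "AGAT" * 12 + "GGA"
print(genotype(maternal_copy, paternal_copy, "AGAT"))  # (10, 12)
```

The genotype is stored as a sorted pair because a profile records which two repeat counts are present at a marker, not which parent each one came from.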
The Standard Number of Markers in Identification Testing
The number of markers required for a reliable match is determined by the testing context and the governing standards of the scientific community. In the United States, forensic identification relies on standards set for the Combined DNA Index System (CODIS), managed by the Federal Bureau of Investigation. This system defines a set of core STR locations that must be analyzed to generate a profile for entry into the national database.
The CODIS standard has evolved over time. The original core set, established in 1997, required the analysis of 13 specific STR loci. As the number of stored profiles grew, so did the chance that a database search would link two unrelated individuals whose profiles happen to coincide, a situation known as an adventitious match.
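A rough way to see why database growth matters is to count how many pairs of profiles a database contains, since every pair is an opportunity for a coincidental match. The sketch below assumes unrelated individuals and a single, uniform match probability for every pair. Both are simplifications, and the probability used is a hypothetical placeholder, not a published CODIS figure.

```python
def expected_adventitious_pairs(
    database_size: int, pairwise_match_probability: float
) -> float:
    """Expected number of coincidentally matching pairs among unrelated profiles."""
    pairs = database_size * (database_size - 1) // 2  # number of distinct pairs
    return pairs * pairwise_match_probability


# Hypothetical figure chosen only for illustration, not a real CODIS statistic.
ASSUMED_MATCH_PROBABILITY = 1e-14

for size in (100_000, 1_000_000, 10_000_000):
    expected = expected_adventitious_pairs(size, ASSUMED_MATCH_PROBABILITY)
    print(f"{size:>10,} profiles -> {expected:.3g} expected coincidental pairs")
```

Because the number of pairs grows with roughly the square of the database size, a tenfold increase in stored profiles raises the expected number of coincidental pairs about a hundredfold under these assumptions.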
To address this, the FBI expanded the CODIS core set to 20 autosomal STR loci, a change that took effect in January 2017. The seven additional loci make each profile considerably rarer, which lowers the chance of a coincidental match, and they also improve compatibility with international DNA databases. Consequently, the current requirement for entering a forensic profile into the US national database is based on these 20 core markers.
Relationship Testing Standards
For relationship testing, such as paternity analysis, the standards may vary slightly but still demand a high number of markers for conclusive results. Many accredited laboratories use 16 to 21 STR markers for routine paternity tests to achieve the necessary high certainty. Some laboratories may analyze up to 45 or more markers, particularly in cases involving extended family relationships or when the available DNA sample is limited.
Beyond the Count: Calculating the Probability of a Match
Simply reporting that two DNA profiles match at 20 markers does not fully convey the reliability of the result; the true measure of certainty is statistical. The critical figure is the Random Match Probability (RMP), which represents the likelihood that a randomly selected person in a population would happen to have the exact same DNA profile as the sample. The RMP is the actual metric used in courtrooms to express the rarity of a genetic profile.
Scientists calculate the RMP by multiplying together the population frequencies of the STR pattern observed at each analyzed marker, a process called the product rule. The rule assumes that the markers are inherited independently of one another. For example, if a specific pattern at one marker occurs in 10% of the population, and a pattern at a second marker occurs in 5%, the chance of having both patterns is 0.10 multiplied by 0.05, which equals 0.005, or 0.5%. This multiplication is repeated across every marker in the profile.
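The product rule is simple enough to express directly in code. The sketch below reproduces the two-marker example and adds one step the text does not spell out: deriving a pattern’s frequency from allele frequencies under the Hardy-Weinberg assumption (p squared for two identical repeat counts, 2pq for two different ones). That step, the function names, and the allele frequencies are illustrative assumptions, not a description of any particular laboratory’s procedure.

```python
from math import prod


def genotype_frequency(
    allele_freqs: dict[int, float], genotype: tuple[int, int]
) -> float:
    """Genotype frequency under Hardy-Weinberg: p*p if homozygous, else 2*p*q."""
    first, second = genotype
    p, q = allele_freqs[first], allele_freqs[second]
    return p * p if first == second else 2 * p * q


def random_match_probability(genotype_frequencies: list[float]) -> float:
    """Product rule: multiply the per-marker genotype frequencies together."""
    return prod(genotype_frequencies)


# The two-marker example from the text: 10% at one marker, 5% at another.
rmp = random_match_probability([0.10, 0.05])
print(f"{rmp:.3%}")  # 0.500%

# Hypothetical allele frequencies for one marker (repeat count -> frequency).
marker_alleles = {10: 0.20, 12: 0.10}
print(genotype_frequency(marker_alleles, (10, 12)))
```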
Each additional independent marker multiplies the running probability by another fraction smaller than one, so the profile becomes rarer with every marker and the RMP shrinks roughly exponentially as markers are added. Analyzing 20 or more loci reduces the RMP to an astronomically small figure, often less than one in a quadrillion. This statistical rarity is what makes DNA evidence so powerful.
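To make the shrinking effect concrete, the following sketch tracks the running probability as markers are added one at a time. It assumes, purely for illustration, that the observed pattern at every marker has the same 10% population frequency. Real frequencies differ from marker to marker and between populations, so the specific numbers printed here carry no forensic meaning.

```python
from itertools import accumulate
from operator import mul

# Hypothetical: every marker's observed pattern has a 10% population frequency.
ASSUMED_FREQUENCY = 0.10
ONE_IN_A_QUADRILLION = 1e-15

frequencies = [ASSUMED_FREQUENCY] * 20
# running_rmp[k - 1] is the combined probability after k markers.
running_rmp = list(accumulate(frequencies, mul))

for markers in (1, 5, 10, 20):
    value = running_rmp[markers - 1]
    print(f"{markers:>2} markers: about 1 in {1 / value:.3g}")

print(running_rmp[-1] < ONE_IN_A_QUADRILLION)  # True
```

Each step multiplies by the same factor here, which is why the odds lengthen by a constant factor per marker. With real, unequal frequencies the curve is less regular but falls in the same multiplicative way.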