What Are Short Tandem Repeats (STRs) in DNA Analysis?

Short Tandem Repeats, or STRs, are short DNA sequences that are repeated back to back at specific locations in a person’s genetic material. The repeated unit is typically two to seven base pairs long, and the number of repetitions at a given location varies from one person to the next. Examined across many locations, this variability provides a genetic signature that is effectively unique to each individual, with the exception of identical twins.

The power of STR analysis to create a distinctive profile has made it a primary tool for human identification. Its efficiency and the small amount of DNA required have made it a standard practice in labs worldwide, replacing older DNA typing methods.

How STRs Create a Unique Genetic Fingerprint

Part of the utility of STRs for identification lies in their location. The STRs used for identification are found in non-coding DNA, meaning they do not contain instructions for making proteins. This placement allows analysis to establish a person’s identity while revealing little about sensitive medical or trait information. The number of repeats at any given STR location, or locus, varies widely across the population, making these regions effective for distinguishing between individuals.

For each STR locus, an individual inherits two versions, known as alleles, one from each biological parent. An allele represents the number of times the short DNA sequence is repeated at that locus. For instance, at a locus designated CSF1PO, where the repeating sequence is “AGAT,” a person might inherit an allele with seven repeats from one parent and an allele with eleven from the other. This results in a profile of “7, 11” for that specific locus.
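As a minimal sketch of how an allele number relates to the underlying sequence, the following Python snippet counts back-to-back copies of the “AGAT” motif in two made-up sequences. The flanking sequences and the function name are hypothetical and chosen only for illustration; they are not real CSF1PO sequence data.

```python
import re

def longest_tandem_run(sequence: str, motif: str) -> int:
    """Return the largest number of back-to-back copies of motif in sequence."""
    pattern = re.compile(f"(?:{re.escape(motif)})+")
    runs = [len(m.group(0)) // len(motif) for m in pattern.finditer(sequence)]
    return max(runs, default=0)

# Hypothetical flanking sequences, invented for this example.
LEFT_FLANK = "CCTGTGTC"
RIGHT_FLANK = "GGAAGTGC"

# One copy of the locus inherited from each parent.
maternal = LEFT_FLANK + "AGAT" * 7 + RIGHT_FLANK
paternal = LEFT_FLANK + "AGAT" * 11 + RIGHT_FLANK

# The genotype at the locus is the pair of repeat counts.
genotype = tuple(sorted(longest_tandem_run(s, "AGAT") for s in (maternal, paternal)))
print(genotype)  # (7, 11)
```

In practice, laboratories usually infer the repeat count from fragment length rather than by reading the sequence directly, so this is a conceptual illustration only.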

Sharing the same number of repeats with someone else at a single locus is common, with individual alleles often found in 5 to 20 percent of the population. The statistical power of STR analysis comes from examining multiple loci simultaneously. Analyzing a standard set of 13 to 20 or more different STR regions generates a combined profile. The probability of two unrelated individuals having the same number of repeats at all of these locations is astronomically low, making the profile effectively unique.
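The effect of combining loci can be illustrated with a rough calculation. The sketch below assumes, purely for illustration, that the genotype at every locus is shared by 10 percent of the population and that the loci are inherited independently; real per-locus figures differ from locus to locus.

```python
from math import prod

def combined_match_probability(per_locus_probabilities):
    """Multiply per-locus match probabilities, assuming independent loci."""
    return prod(per_locus_probabilities)

# Assumed value: each locus genotype is shared by 10% of the population.
ASSUMED_PER_LOCUS = 0.10

for n_loci in (1, 13, 20):
    p = combined_match_probability([ASSUMED_PER_LOCUS] * n_loci)
    print(f"{n_loci:>2} loci: about 1 in {1 / p:.0e}")
```

Even with this modest per-locus figure, the combined probability shrinks by a factor of ten with every additional locus.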

The Laboratory Analysis Process

The laboratory process transforms a biological sample into a DNA profile in several steps. It begins with the extraction of DNA from its source, which can be a blood sample, cheek swab, or trace cells left at a crime scene. The goal is to isolate a pure sample of DNA containing the STR regions.

Next, the small amount of extracted DNA is amplified using Polymerase Chain Reaction (PCR). PCR acts like a genetic photocopier, making millions of copies of the specific STR loci to ensure there is enough DNA for detection. The process uses sequence-specific primers, which are synthetic DNA strands that bind to the areas flanking the STRs. This ensures only the targeted variable segments are copied.

The final step involves separating and detecting the amplified STR fragments to determine their length, which is accomplished using a method called capillary electrophoresis. The copied DNA fragments, which carry fluorescent dye labels, are passed through a thin tube containing a gel-like matrix. An applied voltage causes the DNA to move through the tube. Shorter fragments travel faster than longer ones, allowing scientists to measure the length of each STR allele and determine its number of repeats.
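The conversion from fragment length to repeat number is simple arithmetic once the length of the non-repeating flanking DNA inside the amplified fragment is known. The sketch below uses a hypothetical flanking length of 280 base pairs and a four-base motif; both values, and the function name, are assumptions for illustration rather than figures for any real kit or locus.

```python
def repeats_from_fragment_length(fragment_bp: int, flanking_bp: int, motif_bp: int = 4) -> int:
    """Convert a measured fragment length into a whole number of repeats.

    fragment_bp : total length of the amplified fragment, in base pairs
    flanking_bp : combined length of the non-repeating DNA on both sides
    motif_bp    : length of one repeat unit
    """
    repeat_region = fragment_bp - flanking_bp
    # This simplified sketch does not handle partial repeats (microvariants),
    # which do occur in real samples.
    if repeat_region < 0 or repeat_region % motif_bp != 0:
        raise ValueError("fragment length does not fit a whole number of repeats")
    return repeat_region // motif_bp

# Hypothetical flanking length, chosen only for this example.
FLANKING_BP = 280

alleles = [repeats_from_fragment_length(length, FLANKING_BP) for length in (308, 324)]
print(alleles)  # [7, 11]
```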

Major Applications of STR Analysis

STR analysis has become a foundational tool with several major applications in different fields.

In forensic casework, STR analysis is used to compare DNA evidence from crime scenes with DNA from suspects or victims. A match can provide strong evidence linking an individual to a crime, while a mismatch can exclude a person from an investigation. This capability helps secure convictions and exonerate individuals who have been wrongly accused.

Forensic investigations are aided by large-scale DNA databases like the Combined DNA Index System (CODIS). This system stores STR profiles from offenders, arrestees, and crime scene evidence. Searching the database for matches can generate new investigative leads and help identify repeat offenders.

STR analysis is also the standard for paternity and relationship testing. By comparing a child’s STR profile with that of the potential parents, parentage can be determined with a high degree of certainty. Since a child’s profile is a combination of their parents’ alleles, a test can confirm or exclude a biological relationship.

The analysis also extends into the medical field. While most STRs are in non-coding regions, some are located within or near genes and can be associated with genetic disorders. An expansion in the number of repeats within certain genes can cause disease, such as Huntington’s disease, which is caused by excessive “CAG” repeats. Analyzing this STR can diagnose the condition or determine an individual’s risk.
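The parentage comparison described above can be sketched in a few lines of Python. The profiles below are invented for illustration, and the function names are hypothetical. The check asks only whether one of the child’s alleles at each locus could have come from the mother and the other from the tested man; it ignores mutations and the statistical weighting that an actual relationship test would involve.

```python
def consistent_with_parents(child, mother, father):
    """True if one child allele can come from the mother and the other from the father."""
    a, b = child
    return (a in mother and b in father) or (b in mother and a in father)

def excluded_loci(child_profile, mother_profile, father_profile):
    """List the loci where the child's alleles cannot be explained by this pair."""
    return [
        locus
        for locus, child in child_profile.items()
        if not consistent_with_parents(child, mother_profile[locus], father_profile[locus])
    ]

# Made-up profiles for illustration only.
child = {"CSF1PO": (7, 11), "TPOX": (8, 8), "D5S818": (11, 12)}
mother = {"CSF1PO": (7, 10), "TPOX": (8, 9), "D5S818": (11, 13)}
tested_man_a = {"CSF1PO": (11, 12), "TPOX": (8, 11), "D5S818": (12, 12)}
tested_man_b = {"CSF1PO": (9, 12), "TPOX": (8, 8), "D5S818": (10, 13)}

print(excluded_loci(child, mother, tested_man_a))  # []
print(excluded_loci(child, mother, tested_man_b))  # ['CSF1PO', 'D5S818']
```

An empty list means the tested man cannot be excluded at the loci examined; inconsistencies at multiple loci point toward exclusion.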

Interpreting a DNA Profile

The final output of an STR analysis is a DNA profile, which is a string of numbers representing an individual’s unique genetic signature. The report lists the standardized genetic loci tested, such as D5S818 or TPOX. Next to each locus, it shows two numbers representing the two alleles detected, corresponding to the number of repeats for each. For example, a partial profile might read: D5S818 (11, 12) and TPOX (8, 8).
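A profile of this kind maps naturally onto a simple data structure: each locus name paired with its two allele numbers. The following sketch, with hypothetical function names and made-up reference profiles, shows one way such profiles might be compared locus by locus. Allele order within a locus is treated as irrelevant.

```python
def normalize(profile):
    """Sort the two alleles at each locus so that (12, 11) equals (11, 12)."""
    return {locus: tuple(sorted(alleles)) for locus, alleles in profile.items()}

def compare_profiles(evidence, reference):
    """Compare two profiles at the loci they have in common."""
    evidence, reference = normalize(evidence), normalize(reference)
    shared = sorted(evidence.keys() & reference.keys())
    mismatches = [locus for locus in shared if evidence[locus] != reference[locus]]
    return {
        "loci_compared": shared,
        "mismatches": mismatches,
        "match": bool(shared) and not mismatches,
    }

# The partial profile from the text, plus two invented reference profiles.
evidence = {"D5S818": (11, 12), "TPOX": (8, 8)}
person_a = {"D5S818": (12, 11), "TPOX": (8, 8)}
person_b = {"D5S818": (11, 13), "TPOX": (8, 8)}

print(compare_profiles(evidence, person_a)["match"])       # True
print(compare_profiles(evidence, person_b)["mismatches"])  # ['D5S818']
```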

When a DNA profile from evidence matches the profile of a suspect, the report includes a statistic known as the random match probability. This number estimates how often that specific STR profile would be expected to occur among unrelated individuals in the general population. For a full profile, this probability can be incredibly small, such as one in a quadrillion or less, which provides a quantitative measure of the strength of the evidence.
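One common textbook way to estimate such a figure is to compute a genotype frequency at each locus from population allele frequencies and multiply the results across loci. The sketch below assumes Hardy-Weinberg proportions (p squared for two identical alleles, 2pq for two different alleles) and independence between loci. The allele frequencies are invented for illustration, and actual casework may apply further corrections that are not shown here.

```python
from math import prod

def genotype_frequency(alleles, allele_freqs):
    """Expected genotype frequency under Hardy-Weinberg proportions."""
    a, b = alleles
    p, q = allele_freqs[a], allele_freqs[b]
    return p * p if a == b else 2 * p * q

def random_match_probability(profile, freq_table):
    """Multiply per-locus genotype frequencies, assuming independent loci."""
    return prod(genotype_frequency(profile[locus], freq_table[locus]) for locus in profile)

# Hypothetical allele frequencies, not taken from any real population database.
ASSUMED_FREQS = {
    "D5S818": {11: 0.40, 12: 0.25},
    "TPOX": {8: 0.50},
}

profile = {"D5S818": (11, 12), "TPOX": (8, 8)}
rmp = random_match_probability(profile, ASSUMED_FREQS)
print(f"about 1 in {1 / rmp:.0f}")
```

With only two loci the result is unremarkable; the very small figures quoted for full profiles arise from multiplying across many more loci.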