DNA, the genetic material of living organisms, is a code written in building blocks called nucleotides. Changes to this code are known as mutations. Among the most consequential are insertions and deletions, collectively termed “indels.” An insertion adds one or more nucleotides to the DNA sequence, while a deletion removes them. Indels are a major source of genetic variation and, by some estimates, account for up to 25% of all natural genomic variation in humans. They can alter how genes are read, with outcomes that range from harmless to life-altering. Understanding their origins and consequences provides insight into disease, adaptation, and evolution.
How Insertions and Deletions Occur
Insertions and deletions arise from several natural processes. A common source is DNA polymerase slippage, which happens during DNA replication. As the replication machinery copies the genetic code, it can slip on the template strand, particularly in areas with short, repeating DNA sequences. This slippage can cause the new strand to contain an extra copy of the repeated unit, creating an insertion, or miss one, creating a deletion.
This mechanism is especially frequent in genomic regions known as microsatellites, which are tracts of short motifs repeated in tandem. Because every repeat looks the same, the new strand can re-pair with the template one or more units out of register. As a result, short indels are common in these regions. Slippage also tends to produce deletions more often than insertions.
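The effect of slippage on a repeat tract can be sketched in a few lines of Python. This is a toy model, not a simulation of the replication machinery: the function name `slip`, the example sequence, and the idea of passing the change as +1 or -1 are all assumptions made for illustration.

```python
def slip(sequence: str, unit: str, change: int) -> str:
    """Lengthen or shorten the first tandem run of `unit` in `sequence`.

    change = +1 models an extra copy of the repeat unit (an insertion);
    change = -1 models a skipped copy (a deletion).
    """
    start = sequence.find(unit)
    if start == -1:
        raise ValueError("repeat unit not found in sequence")

    # Count how many back-to-back copies of the unit follow `start`.
    copies = 0
    pos = start
    while sequence.startswith(unit, pos):
        copies += 1
        pos += len(unit)

    new_copies = max(copies + change, 0)
    return sequence[:start] + unit * new_copies + sequence[pos:]


# Hypothetical template: a (CA)5 microsatellite between two short flanks.
template = "GGT" + "CA" * 5 + "TTC"
inserted = slip(template, "CA", +1)   # repeat tract grows to (CA)6
deleted = slip(template, "CA", -1)    # repeat tract shrinks to (CA)4

print(template, len(template))
print(inserted, len(inserted))
print(deleted, len(deleted))
```

The flanking sequence is untouched in both cases; only the number of repeat units changes, which is why slippage indels in microsatellites come in multiples of the unit length.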
Another mechanism that generates indels is unequal crossing over during meiosis, the process that produces sperm and egg cells. When corresponding chromosomes from each parent pair up to exchange genetic material, they can misalign in regions with repetitive sequences. This can cause an unequal exchange where one chromosome receives an extra DNA segment (an insertion) while the other loses that segment (a deletion).
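The reciprocal nature of unequal crossing over, where one chromosome gains what the other loses, can be illustrated with a small Python sketch. Chromosomes are represented here as lists of labeled segments, and the function name `unequal_crossover` and the segment labels are hypothetical simplifications chosen for this example.

```python
def unequal_crossover(chrom_a, chrom_b, break_a, break_b):
    """Exchange the tails of two chromosomes at the given break points.

    When break_a == break_b the exchange is equal. When the break points
    differ (a misalignment), one product gains segments and the other
    loses the same number.
    """
    recombinant_1 = chrom_a[:break_a] + chrom_b[break_b:]
    recombinant_2 = chrom_b[:break_b] + chrom_a[break_a:]
    return recombinant_1, recombinant_2


# Each chromosome: left flank "L", three copies of a repeat "R", right flank "X".
chrom_a = ["L", "R", "R", "R", "X"]
chrom_b = ["L", "R", "R", "R", "X"]

# Misaligned pairing: chrom_a breaks after its second repeat,
# chrom_b breaks after its first repeat.
gain, loss = unequal_crossover(chrom_a, chrom_b, break_a=3, break_b=2)

print(gain)  # four repeats: an insertion relative to the parents
print(loss)  # two repeats: a deletion relative to the parents
```

The total number of repeats across the two products is unchanged; the misalignment only redistributes them.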
Mobile genetic elements, or transposons, can also cause indels. These are segments of DNA that can move from one location in the genome to another. When a transposon inserts itself into a new spot within a gene, it disrupts the gene’s sequence, causing a large insertion. The process by which these elements are removed can also be imprecise, leading to the deletion of adjacent DNA.
Altering the Genetic Code
The impact of an insertion or deletion depends on its size and location. The genetic code is read in three-letter “words” called codons, with each codon specifying a particular amino acid. When an indel occurs that is not a multiple of three nucleotides, it causes a frameshift mutation. This type of mutation shifts the entire reading frame of the gene from the point of the indel onward.
A frameshift mutation has profound consequences because it changes how every downstream codon is read. From the point of the indel onward, the protein is built from a completely different sequence of amino acids. The shifted frame also often contains a premature stop codon that truncates the protein, resulting in a nonfunctional or severely altered product.
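A short Python sketch can make the frameshift concrete. It builds the standard genetic code table, translates a made-up coding sequence, and then translates it again after a single nucleotide is deleted. The gene sequence and the helper name `translate` are invented for illustration.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate DNA from position 0, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":      # stop codon ends translation
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical gene: ATG AAT GAC CGT TTA GGC TAA
gene = "ATGAATGACCGTTTAGGCTAA"

# Delete the single nucleotide at index 3 (the first A of the second codon).
frameshifted = gene[:3] + gene[4:]

print(translate(gene))          # MNDRLG
print(translate(frameshifted))  # MMTV
```

After the deletion the codons are read as ATG ATG ACC GTT TAG, so every amino acid after the first differs and a stop codon (TAG) appears two codons earlier than in the original.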
In contrast, if the number of inserted or deleted nucleotides is a multiple of three, the reading frame remains intact. This is known as an in-frame indel. In this case, one or more amino acids are either added to or removed from the protein chain. While this can still affect the protein’s function, the consequences are often less severe than those of a frameshift mutation.
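The multiple-of-three rule reduces to a remainder check, sketched below in Python. The function names `classify_indel` and `codons` and the example sequence are assumptions made for illustration. The sketch also ignores location effects, such as indels that fall outside coding sequence or that span a codon boundary.

```python
def classify_indel(length: int) -> str:
    """Classify a coding-sequence indel by its length in nucleotides."""
    if length <= 0:
        raise ValueError("indel length must be a positive number of nucleotides")
    return "in-frame" if length % 3 == 0 else "frameshift"


def codons(dna: str) -> list:
    """Split a sequence into complete three-letter codons from position 0."""
    usable = len(dna) - len(dna) % 3
    return [dna[i:i + 3] for i in range(0, usable, 3)]


for size in (1, 3, 32):
    print(size, classify_indel(size))

# Hypothetical gene: ATG AAT GAC CGT TTA GGC TAA
gene = "ATGAATGACCGTTTAGGCTAA"

# Delete exactly one codon's worth of nucleotides (indices 3 to 5).
in_frame_deletion = gene[:3] + gene[6:]

print(codons(gene))
print(codons(in_frame_deletion))
```

In the in-frame case the second list is the first list with one codon removed; every downstream codon, including the stop codon, is read exactly as before.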
The effect of an indel on the protein can range from a complete loss of function to a subtle change. A frameshift mutation usually results in a nonfunctional protein, although one near the very end of the coding sequence may leave most of the protein intact. An in-frame indel might remove or add amino acids in a non-critical region of the protein with little effect. However, if the change occurs in an important part of the protein, such as its active site, it could still significantly alter its function.
Effects on Health and Evolution
Insertions and deletions cause genetic disorders when the changes they introduce lead to faulty proteins that disrupt normal cellular processes. For instance, cystic fibrosis is most often caused by a specific in-frame deletion of three nucleotides, known as ΔF508. This removes a single amino acid, a phenylalanine, from the CFTR protein and prevents it from functioning correctly.
Similarly, Duchenne muscular dystrophy is commonly caused by large deletions within the dystrophin gene that cause frameshifts. The result is a nonfunctional protein that cannot maintain muscle cell integrity. Tay-Sachs disease is another disorder that can be caused by frameshift mutations resulting from small insertions.
However, not all indels are detrimental. They are also a major source of the genetic variation that fuels evolution. By creating new versions of genes, indels can introduce new traits into a population. While many of these changes may be neutral or harmful, an indel can occasionally provide a selective advantage. A specific 32-base-pair deletion in the CCR5 gene, for example, provides strong resistance to HIV infection in individuals who carry two copies of it.
The impact of an indel is highly dependent on its context, including its location in the genome, its size, and the environmental pressures on the organism. Over evolutionary time, the accumulation of indels contributes to the divergence of species and the development of new biological functions. They are a constant source of novelty, shaping the genetic landscape of populations.
Identifying Indels in the Genome
Scientists use several techniques to detect indels in DNA, with the most comprehensive being DNA sequencing. Next-Generation Sequencing (NGS) technologies are effective because they can rapidly sequence large portions of the genome. This allows for a detailed comparison between a sample’s DNA and a reference sequence.
During this comparison, bioinformatic tools align the sequences to pinpoint differences. An insertion appears as an extra sequence that is absent in the reference, while a deletion is a segment of the reference missing from the sample. These methods can identify indels ranging from a single nucleotide to thousands of base pairs.
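Once a sample has been aligned to the reference, finding indels amounts to scanning the alignment for gaps. The Python sketch below assumes the alignment is already given as two equal-length strings with "-" marking gaps, and that no column has a gap in both strings. The function name `find_indels` and the example sequences are hypothetical, and real variant callers do considerably more, such as weighing read quality and coverage.

```python
def find_indels(ref_aligned: str, sample_aligned: str) -> list:
    """Report indels in a gapped pairwise alignment.

    Returns (kind, ref_position, sequence) tuples, where ref_position is
    the number of reference bases that precede the event.
    """
    if len(ref_aligned) != len(sample_aligned):
        raise ValueError("aligned sequences must have the same length")

    indels = []
    ref_pos = 0                    # reference bases consumed so far
    i = 0
    n = len(ref_aligned)
    while i < n:
        r, s = ref_aligned[i], sample_aligned[i]
        if r == "-" and s != "-":
            # Gap in the reference: the sample carries extra sequence.
            j = i
            while j < n and ref_aligned[j] == "-":
                j += 1
            indels.append(("insertion", ref_pos, sample_aligned[i:j]))
            i = j
        elif s == "-" and r != "-":
            # Gap in the sample: reference sequence is missing.
            j = i
            while j < n and sample_aligned[j] == "-":
                j += 1
            indels.append(("deletion", ref_pos, ref_aligned[i:j]))
            ref_pos += j - i
            i = j
        else:
            if r != "-":
                ref_pos += 1
            i += 1
    return indels


# Hypothetical alignment with one 2-nt insertion and one 1-nt deletion.
reference = "ACGT--ACGTACGT"
sample    = "ACGTTTACG-ACGT"

for event in find_indels(reference, sample):
    print(event)
```

Here the sample has two extra T nucleotides after the fourth reference base and is missing the T that follows the seventh reference base.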
For detecting known indels in specific gene regions, researchers use the Polymerase Chain Reaction (PCR). This method amplifies the DNA region where an indel is suspected. By analyzing the size of the resulting PCR product, scientists can determine if a deletion has made the fragment shorter or an insertion has made it longer than expected.
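The size comparison can be written as a small Python function. The fragment lengths and the `tolerance_bp` parameter, which stands in for the sizing resolution of whatever method measures the fragment, are illustrative assumptions rather than values from a real assay.

```python
def interpret_pcr_product(expected_bp: int, observed_bp: int,
                          tolerance_bp: int = 0) -> str:
    """Compare an observed PCR fragment size with the expected size."""
    difference = observed_bp - expected_bp
    if abs(difference) <= tolerance_bp:
        return "no indel detected"
    kind = "insertion" if difference > 0 else "deletion"
    return f"{kind} of about {abs(difference)} bp"


# Hypothetical assay: primers flank a region that is 200 bp in the reference.
print(interpret_pcr_product(200, 200))                  # no indel detected
print(interpret_pcr_product(200, 168))                  # deletion of about 32 bp
print(interpret_pcr_product(200, 212))                  # insertion of about 12 bp
print(interpret_pcr_product(200, 203, tolerance_bp=5))  # no indel detected
```

Size alone shows that the fragment length changed, not which nucleotides were gained or lost; confirming the exact sequence still requires sequencing the product.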
Sanger sequencing can also find indels, especially when analyzing a single gene or a smaller region. Although slower than NGS, it provides a high-quality sequence that can reliably confirm an indel’s presence and exact size. These techniques allow researchers to build a comprehensive picture of how indels contribute to genetic variation and disease.