In genetics, an insertion is a mutation in which one or more nucleotide bases are added to a DNA sequence. The number of nucleotides added can range from a single base to an entire piece of a chromosome. For example, imagine the sentence “THE BIG DOG RAN,” read in three-letter words. If an extra letter, ‘X’, is added after the ‘B’, the sentence becomes “THE BXI GDO GRA N,” which changes the structure and meaning of the original message. This illustrates how adding genetic material can disrupt the DNA sequence. Unless it is repaired, the change is permanent: the inserted bases are copied along with the rest of the sequence and passed to daughter cells at each cell division.
The Mechanics of Genetic Insertion
One way insertions happen is through errors during DNA replication. “Strand slippage” can occur when the new strand of DNA being synthesized misaligns with the template strand. This misalignment can create a small loop of unpaired bases in either the new or the template strand. If the loop forms on the new strand, the replication machinery copies part of the template a second time, adding extra nucleotides and producing an insertion. A loop on the template strand has the opposite effect: bases are skipped, producing a deletion. Slippage is more common in areas of the genome with repetitive DNA sequences, where the misaligned strands can still pair with each other.
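The effect of slippage on a repeat tract can be sketched in a few lines of Python. This is a toy model under loud assumptions: the new strand is treated as a same-sense copy of the template (base pairing is ignored), slippage always adds whole repeat units, and the function names `find_repeat_tract` and `copy_with_slippage` are made up for illustration.

```python
def find_repeat_tract(seq: str, unit: str):
    """Return (start, end) of the first run of two or more copies of `unit`, or None."""
    start = seq.find(unit * 2)
    if start == -1:
        return None
    end = start
    # Extend the run for as long as another copy of the unit follows.
    while seq.startswith(unit, end):
        end += len(unit)
    return start, end


def copy_with_slippage(template: str, unit: str, extra_units: int = 1) -> str:
    """Copy `template`, re-copying `extra_units` repeat units at the end of the tract.

    Assumption: this stands in for a loop forming on the new strand, so the
    copy ends up longer than the template. Without a repeat tract, the copy
    is returned unchanged.
    """
    tract = find_repeat_tract(template, unit)
    if tract is None:
        return template
    _, end = tract
    return template[:end] + unit * extra_units + template[end:]


template = "GATCACACAGGT"  # contains the repeat tract CACACA
print(copy_with_slippage(template, "CA"))  # GATCACACACAGGT
```

The copy carries one extra “CA” unit, which is the kind of small insertion that repetitive regions are prone to.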
Another mechanism involves mobile genetic elements, sometimes called “jumping genes” or transposons. These are segments of DNA that can move from one location in the genome to another. Some transposons use a “cut-and-paste” method, where the element is moved from its original spot to a new one. Others use a “copy-and-paste” mechanism, creating a duplicate that then inserts elsewhere. Viruses can also cause insertions by integrating their own genetic material into the host cell’s DNA.
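The difference between the two transposition styles can be illustrated with plain string operations. This is only a sketch: real transposons rely on enzymes and leave sequence signatures that are ignored here, and the names `cut_and_paste` and `copy_and_paste`, as well as the example sequences, are hypothetical.

```python
def cut_and_paste(genome: str, element: str, target: int) -> str:
    """Remove the first copy of `element` and reinsert it at index `target`.

    `target` is an index into the sequence that remains after excision.
    """
    start = genome.find(element)
    if start == -1:
        raise ValueError("element not found in genome")
    remaining = genome[:start] + genome[start + len(element):]
    return remaining[:target] + element + remaining[target:]


def copy_and_paste(genome: str, element: str, target: int) -> str:
    """Leave the original element in place and insert a duplicate at `target`."""
    if element not in genome:
        raise ValueError("element not found in genome")
    return genome[:target] + element + genome[target:]


genome = "AAAATTTTGGGGCCCC"
element = "TTTT"  # stand-in for a transposon

moved = cut_and_paste(genome, element, 8)
copied = copy_and_paste(genome, element, 12)
print(moved)   # AAAAGGGGTTTTCCCC (same length, element relocated)
print(copied)  # AAAATTTTGGGGTTTTCCCC (four bases longer, two copies)
```

In this toy model, cut-and-paste keeps the genome the same length while copy-and-paste makes it grow by the length of the element.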
Consequences of an Insertion
The impact of an insertion depends on its location and the number of bases added. DNA is read in three-base groups called codons. Each codon specifies one amino acid, the building block of proteins, or signals the end of the protein. If the number of inserted bases is not a multiple of three, it causes a frameshift mutation. This alters the grouping of bases into codons from the insertion point onward.
A frameshift scrambles the genetic message downstream of the mutation. For example, the phrase “THE FAT CAT ATE THE RAT” read in three-letter words makes sense. If an ‘F’ is inserted after the first word, the reading frame shifts, resulting in “THE FFA TCA TAT ETH ERA T,” which is nonsensical. This often leads to the creation of a premature stop codon, signaling the cell to stop building the protein too early. The resulting protein is typically truncated and nonfunctional.
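The regrouping described above can be reproduced with a short Python sketch. It first applies the sentence analogy, then shows a made-up DNA sequence in which a single inserted base brings a stop codon into frame early. The helper names and the example gene are illustrative assumptions; TAA, TAG, and TGA are the three stop codons of the standard genetic code.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # stop codons in the standard genetic code


def split_codons(seq: str) -> list[str]:
    """Group a sequence into three-letter units, left to right."""
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def insert_at(seq: str, pos: int, bases: str) -> str:
    """Return `seq` with `bases` inserted before index `pos`."""
    return seq[:pos] + bases + seq[pos:]


def first_stop_index(seq: str):
    """Return the codon index of the first in-frame stop codon, or None."""
    for idx, codon in enumerate(split_codons(seq)):
        if codon in STOP_CODONS:
            return idx
    return None


# Sentence analogy: insert 'F' after the first word.
sentence = "THEFATCATATETHERAT"
print(" ".join(split_codons(insert_at(sentence, 3, "F"))))
# THE FFA TCA TAT ETH ERA T

# Hypothetical gene: ATG GCT AAA GGC TAA (stop at codon index 4).
gene = "ATGGCTAAAGGCTAA"
mutant = insert_at(gene, 3, "T")  # one extra base, so the frame shifts
print(" ".join(split_codons(mutant)))  # ATG TGC TAA AGG CTA A
print(first_stop_index(gene), first_stop_index(mutant))  # 4 2
```

In the mutant, the stop codon appears at codon index 2 instead of 4, so a cell reading this sequence would end the protein after two amino acids.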
Conversely, if the number of inserted nucleotides is a multiple of three, it is an in-frame insertion. This does not shift the reading frame for the rest of the gene. Instead, it adds one or more new amino acids to the protein chain. This is generally less severe than a frameshift mutation because the majority of the protein sequence remains unchanged. The resulting protein might still be partially or even fully functional.
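An in-frame insertion can be illustrated by translating a short sequence before and after three bases are added. The sketch below builds the standard genetic code table from its usual compact form (bases in the order T, C, A, G); the example gene and the `translate` helper are assumptions made for illustration, not a substitute for a real bioinformatics library.

```python
from itertools import product

# Standard genetic code, listed for codons TTT, TTC, TTA, TTG, TCT, ... GGG.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate complete codons into one-letter amino acids, stopping at '*'."""
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":  # stop codon
            break
        protein.append(amino_acid)
    return "".join(protein)


gene = "ATGGCTAAAGGCTAA"                  # hypothetical: ATG GCT AAA GGC TAA
in_frame = gene[:6] + "GAA" + gene[6:]    # three bases inserted after codon 2

print(translate(gene))      # MAKG
print(translate(in_frame))  # MAEKG
```

The mutant protein gains one amino acid (E, glutamate) while every other position stays the same, which is why in-frame insertions tend to be milder than frameshifts.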
Insertions and Human Health
Insertion mutations are a cause of several human genetic disorders. An example is Huntington’s disease, a progressive neurodegenerative disorder. This condition is caused by a trinucleotide repeat expansion in the HTT gene. In this gene, a three-base sequence, “CAG,” is repeated many times in a row. People without the disease typically carry about 26 or fewer repeats, whereas alleles with 40 or more repeats reliably cause Huntington’s. Because CAG codes for the amino acid glutamine, the expanded gene produces an abnormally long and toxic protein that damages nerve cells.
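Counting consecutive CAG repeats in a sequence is a simple pattern-matching task. The sketch below uses Python’s standard `re` module on a made-up sequence; the function name `longest_cag_run` is hypothetical, and the code is an illustration of the counting idea rather than a clinical test, which would require real sequencing data and validated methods.

```python
import re


def longest_cag_run(seq: str) -> int:
    """Return the number of CAG units in the longest uninterrupted run."""
    runs = re.findall(r"(?:CAG)+", seq)
    return max((len(run) // 3 for run in runs), default=0)


# Hypothetical sequence: a run of 5 CAGs, an interruption, then 2 more.
seq = "ATG" + "CAG" * 5 + "CCG" + "CAG" * 2 + "TAA"
print(longest_cag_run(seq))  # 5
```

Only the longest uninterrupted run is reported, so the two CAG units after the “CCG” interruption are not added to the count.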
Cystic fibrosis, a disease affecting the lungs and digestive system, can also result from insertion mutations. The disease is caused by mutations in the CFTR gene, which provides instructions for making a protein that transports chloride ions across cell membranes. While the most common mutation is a three-base deletion, various insertion mutations also disrupt the gene. These insertions can prevent the CFTR protein from functioning properly, leading to the thick, sticky mucus characteristic of the disease.