Retrotransposons, often called “jumping genes,” are mobile DNA segments that move and multiply within the host genome. They move through an RNA intermediate, which sets them apart from DNA transposons. Retrotransposons are highly prevalent, constituting approximately 42% of the human genome and even higher proportions in some plant species, such as maize. They are fundamental components of genetic architecture, influencing genome size and organization. Understanding these elements is important because their movement can alter gene function and contribute to both disease and evolutionary change.
Defining Retrotransposons and Their Key Classes
Retrotransposons are classified as Class I transposable elements because their movement requires transcription into RNA before conversion back into DNA for insertion. This mechanism relies on reverse transcription and contrasts with Class II DNA transposons, which move directly as DNA segments in a “cut-and-paste” manner. This replicative nature allows retrotransposons to maintain the original copy while creating a new copy elsewhere, rapidly amplifying their numbers throughout the genome.
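The difference in copy number between the two modes can be illustrated with a toy model. This is only a sketch: the genome is a hypothetical list of labels, insertion sites are chosen at random, and the function names are invented for illustration.

```python
import random

def cut_and_paste(genome, element, rng):
    """Class II style: excise the element and reinsert it at a random site."""
    genome = list(genome)
    genome.pop(genome.index(element))
    genome.insert(rng.randrange(len(genome) + 1), element)
    return genome

def copy_and_paste(genome, element, rng):
    """Class I style: the source copy stays in place and a new copy is added."""
    genome = list(genome)
    genome.insert(rng.randrange(len(genome) + 1), element)
    return genome

rng = random.Random(0)
genome = ["geneA", "TE", "geneB", "geneC"]  # hypothetical toy genome

dna_style = genome
rna_style = genome
for _ in range(5):
    dna_style = cut_and_paste(dna_style, "TE", rng)
    rna_style = copy_and_paste(rna_style, "TE", rng)

print(dna_style.count("TE"))  # 1: the element moved but did not multiply
print(rna_style.count("TE"))  # 6: one new copy per round
```

After five rounds the cut-and-paste element is still present once, while the copy-and-paste element has gone from one copy to six.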
These elements are broadly divided into two major groups based on their structure. LTR (Long Terminal Repeat) retrotransposons are flanked by long, identical sequences at both ends and structurally resemble retroviruses. These LTRs contain the regulatory sequences necessary for transcription and integration. The second group, more abundant in humans, is the Non-LTR retrotransposons, which lack these terminal repeats.
Non-LTR retrotransposons are categorized into two main families: LINEs and SINEs. LINEs (Long Interspersed Nuclear Elements) are autonomous, encoding the proteins necessary for their own transposition, including reverse transcriptase and an endonuclease. LINE-1 (L1), the only LINE family that remains autonomously active in humans, makes up around 17% of the human genome. SINEs (Short Interspersed Nuclear Elements) are shorter, non-autonomous elements. SINEs, such as the common Alu element, rely on the enzymatic machinery provided by LINEs to mobilize.
The Copy-and-Paste Mechanism
The mobility of retrotransposons is a “copy-and-paste” mechanism: the original element remains in place while a new copy is generated and inserted elsewhere. The process begins when the retrotransposon DNA is transcribed into an RNA intermediate (mRNA) by the host cell’s machinery. This RNA leaves the nucleus to be translated into specialized proteins, primarily reverse transcriptase and endonuclease.
The retrotransposon RNA and its newly synthesized proteins form a ribonucleoprotein complex (RNP) that re-enters the nucleus. Inside the nucleus, the RNP’s endonuclease nicks one strand of the DNA at a new genomic location. The exposed 3′-hydroxyl end of the cleaved host DNA then acts as a primer for the reverse transcriptase, which synthesizes a complementary DNA strand directly from the retrotransposon RNA template. This process, known as target-primed reverse transcription (TPRT), is characteristic of Non-LTR retrotransposons; LTR retrotransposons instead reverse-transcribe their RNA in the cytoplasm and insert the resulting DNA with an integrase.
As the new DNA copy is synthesized, the RNA template is degraded. The second DNA strand is then synthesized, a step thought to rely on the host cell’s DNA repair and replication enzymes. The result is a new, double-stranded DNA copy of the retrotransposon integrated into the host genome.
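The sequence logic of these steps can be sketched in a few lines of Python. This is a deliberately simplified illustration: the genome is represented as a single strand, the nick position is supplied by hand, target-site duplications and 5′ truncation are ignored, and all sequences and function names are hypothetical.

```python
RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}
DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_transcribe(rna):
    """First-strand cDNA, written 5'->3', complementary to the RNA template."""
    return "".join(RNA_TO_DNA[base] for base in reversed(rna))

def second_strand(cdna):
    """Strand complementary to the cDNA, written 5'->3'."""
    return "".join(DNA_COMPLEMENT[base] for base in reversed(cdna))

def insert_copy(genome_strand, nick_index, rna):
    """Place a DNA copy of the RNA at the nick (single-strand view of the genome)."""
    cdna = reverse_transcribe(rna)
    sense_copy = second_strand(cdna)  # same sequence as the RNA, with T replacing U
    return genome_strand[:nick_index] + sense_copy + genome_strand[nick_index:]

element_rna = "GGAUCCAAAA"   # made-up element RNA with a short poly(A) tail
genome = "ACGTTTAACG"        # made-up target region

print(reverse_transcribe(element_rna))       # TTTTGGATCC
print(insert_copy(genome, 6, element_rna))   # ACGTTTGGATCCAAAAAACG
```

The original element is never touched in this sketch; only its RNA is used as the template, which is what makes the mechanism replicative.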
Disruptive Impacts on Gene Function
Retrotransposon activity is often disruptive, leading to genetic instability and disease. The most direct damage is insertional mutagenesis, which occurs when a new copy integrates directly into a functional gene. If the insertion lands within a protein-coding region, it can introduce a premature stop signal or cause a frameshift, resulting in a non-functional or truncated protein. These de novo insertions are responsible for a range of single-gene disorders in humans, with over 120 cases linked to retrotransposon insertions.
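Both outcomes can be seen by translating a short coding sequence before and after an insertion. The sketch below uses the standard genetic code; the gene and the two inserted sequences are made up for illustration and are far shorter than any real element.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(cds):
    """Translate from the first base, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        amino_acid = CODON_TABLE[cds[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)

def insert_at(cds, position, element):
    """Return the coding sequence with `element` inserted at `position`."""
    return cds[:position] + element + cds[position:]

gene = "ATGGCTGAAACCTGGTAA"          # hypothetical 5-codon gene plus stop
frameshift_insert = "GGATCCAAAA"     # 10 bp: length is not a multiple of 3
stop_insert = "GGATCCTAAAAA"         # 12 bp: in frame, but carries a TAA stop

print(translate(gene))                                 # MAETW
print(translate(insert_at(gene, 6, frameshift_insert)))  # MAGSKRNLV
print(translate(insert_at(gene, 6, stop_insert)))        # MAGS
```

The 10 bp insertion shifts the reading frame, so every downstream codon changes and the original stop codon is no longer read in frame. The 12 bp insertion keeps the frame but introduces a stop codon, truncating the protein.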
Insertions of the active LINE-1 element, for example, can disrupt tumor suppressor genes like Retinoblastoma 1 (RB1), contributing to specific cancers. Similarly, Alu element insertions into genes such as BRCA1 and BRCA2, which are important for DNA repair, have been implicated in hereditary breast and ovarian cancers. Beyond coding regions, an insertion into an intron can alter messenger RNA splicing, leading to aberrant protein variants.
The high number of repetitive retrotransposon copies also creates sites for ectopic recombination. Because these copies are highly similar in sequence, they can cause chromosomes to misalign during meiosis. Recombination between these non-allelic copies results in large-scale chromosomal rearrangements, such as deletions, duplications, or inversions. These changes in genomic structure are often deleterious, contributing to intellectual disability and various congenital syndromes.
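A minimal sketch of how misaligned repeats produce a deletion and a reciprocal duplication is shown below. It assumes two identical chromatids represented as lists of labeled segments, each carrying two copies of the same repeat in the same orientation; the segment names and the function are hypothetical.

```python
def unequal_crossover(chromatid_a, chromatid_b, repeat="Alu"):
    """Cross over between the downstream repeat of A and the upstream repeat of B.

    Returns the two recombinant products: one carrying a duplication of the
    segment between the repeats, the other carrying a deletion of it.
    """
    sites_a = [i for i, seg in enumerate(chromatid_a) if seg == repeat]
    sites_b = [i for i, seg in enumerate(chromatid_b) if seg == repeat]
    if len(sites_a) < 2 or not sites_b:
        raise ValueError("need two repeat copies on A and one on B")
    downstream_a = sites_a[1]   # misaligned partner on chromatid A
    upstream_b = sites_b[0]     # misaligned partner on chromatid B
    duplication = chromatid_a[:downstream_a + 1] + chromatid_b[upstream_b + 1:]
    deletion = chromatid_b[:upstream_b + 1] + chromatid_a[downstream_a + 1:]
    return duplication, deletion

chromatid = ["gene1", "Alu", "gene2", "gene3", "Alu", "gene4"]
duplication, deletion = unequal_crossover(chromatid, chromatid)

print(duplication)
# ['gene1', 'Alu', 'gene2', 'gene3', 'Alu', 'gene2', 'gene3', 'Alu', 'gene4']
print(deletion)
# ['gene1', 'Alu', 'gene4']
```

One product gains an extra copy of gene2 and gene3, and the other loses both. Inversions arise when the two repeats are in opposite orientation, which this sketch does not model.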
Evolutionary Roles and Genome Shaping
Although retrotransposons can be immediately disruptive, their long-term activity has been instrumental in shaping the architecture and function of eukaryotic genomes. Their “copy-and-paste” method is a primary driver of genome size variation, rapidly increasing the overall amount of DNA in a species. The vast number of accumulated retrotransposon fragments serves as raw material for genetic innovation.
A primary role is their contribution to regulatory innovation by introducing new gene expression control elements. Retrotransposons carry their own regulatory sequences, such as promoters and enhancers, needed for their own transcription. When a retrotransposon inserts near a host gene, these internal sequences can be “co-opted” by the host cell. These acquired elements can act as alternative promoters, providing a new start site for transcription, or as enhancers, switching on a nearby gene in a tissue-specific manner.
This mechanism allows for the rapid evolution of species-specific gene expression patterns without altering the protein-coding sequence. For example, certain LTR retrotransposons have been repurposed as enhancers that regulate gene expression uniquely in embryonic stem cells. Retrotransposons also facilitate gene duplication events, sometimes capturing parts of host genes and leading to the formation of new genes, which increases genetic novelty.
Cellular Mechanisms for Silencing Retrotransposons
Host cells have evolved sophisticated defense mechanisms that keep mobile retrotransposons inactive, particularly in the germline. The primary control method is epigenetic silencing, which chemically modifies the retrotransposon DNA to prevent transcription. A central mark is DNA methylation, in which methyl groups are added to cytosine bases within the sequence.
Methylation marks the retrotransposon as inactive and is often paired with specific histone modifications, such as H3K9me3. These modifications cause the DNA to be tightly wound into heterochromatin, physically preventing transcriptional enzymes from accessing the retrotransposon DNA.
A second defense layer involves RNA interference (RNAi) pathways that target retrotransposon transcripts post-transcriptionally. In mammalian germ cells, this relies on Piwi-interacting RNAs (piRNAs). These small non-coding RNAs bind to Piwi proteins, forming a complex that recognizes and cleaves the messenger RNA produced by active retrotransposons. This mechanism destroys the RNA intermediate, cutting the retrotransposition cycle short and preventing new DNA copies.
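The recognition step rests on base-pair complementarity between the small RNA guide and the retrotransposon transcript, which can be sketched as a string search. This is a toy model only: the sequences are invented, the 12-nucleotide guide is shorter than real piRNAs, perfect complementarity is required, and the cut is simply placed in the middle of the matched site.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

def reverse_complement(rna):
    """Reverse complement of an RNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))

def slice_transcript(transcript, guide):
    """Cut the transcript at the first site fully complementary to the guide.

    Returns a list of fragments; an unmatched transcript is returned intact.
    """
    target_site = reverse_complement(guide)
    start = transcript.find(target_site)
    if start == -1:
        return [transcript]
    cut = start + len(target_site) // 2  # toy choice: cut mid-site
    return [transcript[:cut], transcript[cut:]]

guide = "AUGCCUGUAAUC"                          # hypothetical piRNA-like guide
retrotransposon_rna = "GGAUCCGAUUACAGGCAUAAAA"  # contains the complementary site
host_mrna = "GGAUCCGGGCCCAUAUAUAAAA"            # no complementary site

print(slice_transcript(retrotransposon_rna, guide))  # ['GGAUCCGAUUAC', 'AGGCAUAAAA']
print(slice_transcript(host_mrna, guide))            # ['GGAUCCGGGCCCAUAUAUAAAA']
```

Only the transcript carrying the complementary site is cut, which is how a sequence-guided defense can destroy retrotransposon RNA while leaving unrelated host messages alone.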