LTR Genetics: Impact on Gene Regulation and Disease
Discover how ancient viral sequences in our DNA, known as LTRs, act as key regulators of gene expression, influencing both human health and evolution.
Long Terminal Repeats (LTRs) are repetitive DNA sequences that make up a significant fraction of many complex genomes, including our own. These sequences are remnants of ancient retrotransposons, a type of mobile genetic element that copies and pastes itself throughout the genome. LTRs are far from inactive; they can influence the expression of nearby genes, and their activity is linked to both evolutionary innovation and human disease.
Long Terminal Repeats are named for their position at each end of genetic elements called LTR retrotransposons. A full-length element is flanked by two nearly identical LTRs, each organized into three regions: U3, R, and U5. This U3-R-U5 arrangement is fundamental to the lifecycle of the retrotransposon.
The U3 (unique 3′) region contains promoter and enhancer sequences that initiate the expression of the retrotransposon’s genes. The R (repeated) region marks the start and end of transcription: the transcript begins at R in the upstream LTR and ends at R in the downstream LTR. The U5 (unique 5′) region holds sequences needed for processing the new RNA transcript and for integrating the DNA copy made from it into the host’s DNA.
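The U3-R-U5 layout can be pictured as a small data structure. The sketch below is only an illustration: the class names `LTR` and `LTRRetrotransposon` are hypothetical, the sequences are made-up toy strings rather than real LTR sequences, and the two flanking LTRs are assumed to be exactly identical for simplicity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LTR:
    """A toy long terminal repeat split into its three regions."""
    u3: str  # promoter and enhancer sequences
    r: str   # repeated region, where transcription starts and ends
    u5: str  # sequences used in processing and integration

    @property
    def sequence(self) -> str:
        # The regions always appear in the order U3, R, U5.
        return self.u3 + self.r + self.u5


@dataclass(frozen=True)
class LTRRetrotransposon:
    """A full-length element: LTR + internal region + LTR."""
    ltr: LTR
    internal: str  # the element's own genes sit between the two LTRs

    @property
    def sequence(self) -> str:
        # Assumption: both LTRs are identical copies.
        return self.ltr.sequence + self.internal + self.ltr.sequence


# Made-up toy sequences, far shorter than real LTRs.
toy_ltr = LTR(u3="TGTAGC", r="GGTCT", u5="CTTCA")
element = LTRRetrotransposon(ltr=toy_ltr, internal="ATGAAACCCGGGTAA")
print(element.sequence)
```

Real LTRs are typically hundreds of base pairs long, and the two copies drift apart as mutations accumulate, so exact equality is a modelling shortcut.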
These structures originate from endogenous retroviruses (ERVs) and LTR retrotransposons. ERVs are descendants of ancient retroviruses that infected the germline cells of an organism’s ancestors, becoming a stable part of the host’s genetic landscape. LTR retrotransposons are related mobile elements that propagate through a “copy-and-paste” mechanism within a cell’s genome. In humans, these elements collectively make up approximately 8% of our DNA.
Retrotransposition begins when the DNA of an LTR retrotransposon is transcribed into an RNA molecule by the host cell’s machinery. This RNA transcript then serves as a template for the enzyme reverse transcriptase, which synthesizes a double-stranded DNA copy of the element. This step, called reverse transcription, runs opposite to the normal flow of genetic information from DNA to RNA.
Once the double-stranded DNA copy is created, it is prepared for insertion into the host’s chromosomes. An enzyme called integrase, typically encoded by the retrotransposon itself, cuts the host’s DNA at a location that is, to a first approximation, random. The integrase then pastes the new retrotransposon DNA into the gap, permanently embedding it in the host genome.
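The copy-and-paste cycle described above can be mimicked with plain strings. This is a deliberately simplified sketch, not a model of the real biochemistry: the function names are hypothetical, sequences are single-stranded toy strings, the transcript is treated as a full copy of the element, and the insertion position is simply drawn from a seeded random number generator.

```python
import random


def transcribe(dna: str) -> str:
    """DNA to RNA, written in the same sense with U in place of T."""
    return dna.replace("T", "U")


def reverse_transcribe(rna: str) -> str:
    """RNA back to DNA, the step carried out by reverse transcriptase."""
    return rna.replace("U", "T")


def integrate(host: str, element_dna: str, position: int) -> str:
    """Cut the host at `position` and paste the element into the gap."""
    return host[:position] + element_dna + host[position:]


def retrotranspose(host: str, element: str, rng: random.Random) -> str:
    """One copy-and-paste round: transcribe, reverse transcribe, integrate."""
    rna = transcribe(element)
    new_copy = reverse_transcribe(rna)
    # Assumption: every position in the host is equally likely.
    position = rng.randrange(len(host) + 1)
    return integrate(host, new_copy, position)


# Toy sequences, made up for illustration.
host = "CCCCGGGGCCCCGGGG"
element = "TGTAGCATGAAATGTAGC"
rng = random.Random(0)  # seeded so the run is repeatable
new_host = retrotranspose(host, element, rng)
print(len(host), len(new_host))
```

Because the new copy is added rather than moved, the host sequence grows by the length of the element with every round, which is how these elements came to occupy so much of the genome.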
Over time, integrated elements can change. The two nearly identical LTRs at either end of an element can be recognized by the cell’s DNA repair machinery. Through homologous recombination between them, the genetic material between the two LTRs can be deleted, along with one of the LTR copies. This event leaves behind a single “solo” LTR. Solo LTRs are far more common in the human genome than full-length elements, and they still retain regulatory signals capable of influencing nearby genes.
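The formation of a solo LTR can be sketched as a string operation: find two copies of the same LTR, then collapse everything from the first copy through the second into a single copy. The function name and sequences below are hypothetical, and the sketch assumes the two LTRs match exactly, which real recombination does not require.

```python
def recombine_to_solo_ltr(genome: str, ltr: str) -> str:
    """Collapse LTR + internal + LTR into one solo LTR.

    Returns the genome unchanged if fewer than two LTR copies are found.
    """
    first = genome.find(ltr)
    if first == -1:
        return genome
    second = genome.find(ltr, first + len(ltr))
    if second == -1:
        return genome
    # Keep the first LTR; drop the internal region and the second LTR.
    return genome[: first + len(ltr)] + genome[second + len(ltr):]


# Toy sequences, made up for illustration.
ltr = "TGTAGCGGTCTCTTCA"
internal = "ATGAAACCCGGGTAA"
genome = "AAAA" + ltr + internal + ltr + "CCCC"

solo = recombine_to_solo_ltr(genome, ltr)
print(genome.count(ltr), solo.count(ltr))
```

The flanking host sequence is untouched; only the element's internal region and one LTR copy are lost, which is why the remaining solo LTR can still act on neighboring genes.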
The promoter sequences within an LTR’s U3 region can function as alternative start sites for transcription. If an LTR inserts just upstream of a host gene, it can take control of that gene’s expression. This can lead to its activation in new tissues or at different developmental times.
LTRs also contain enhancers, which are DNA sequences that increase the transcription levels of genes, even those located far away. These enhancer elements can loop around to make physical contact with a distant gene’s promoter, boosting its activity. This ability to act over long distances allows a single LTR insertion to alter the expression of multiple genes.
Beyond initiating or boosting transcription, LTRs can also bring it to a halt. Some LTRs contain polyadenylation signals, which are sequences that tell the cell where to cut and finish an RNA transcript. If an LTR inserts within a gene, its polyadenylation signal can end the gene’s transcript prematurely, leading to the production of a truncated and likely non-functional protein.
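As a rough illustration, a sequence can be scanned for the canonical polyadenylation hexamer AATAAA. The function name and the toy gene below are hypothetical, and finding the hexamer does not by itself show that a site is used in a cell, since functional sites also depend on surrounding sequence context.

```python
from typing import List

POLYA_SIGNAL = "AATAAA"  # canonical hexamer, written as DNA


def find_polya_signals(dna: str, signal: str = POLYA_SIGNAL) -> List[int]:
    """Return the 0-based start of every occurrence, overlaps included."""
    positions = []
    start = dna.find(signal)
    while start != -1:
        positions.append(start)
        # Step forward by one base so overlapping matches are not missed.
        start = dna.find(signal, start + 1)
    return positions


# Made-up gene fragment with one candidate signal in the middle.
gene = "ATGGCC" + "GGAATAAAGG" + "TTTTGA"
print(find_polya_signals(gene))  # [8]
```

A hit in the middle of a gene, as in this toy fragment, marks a place where an inserted LTR could in principle cut the transcript short.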
The integration of LTRs has served as both a source of evolutionary novelty and a cause of disease. On an evolutionary timescale, the regulatory sequences within LTRs have been co-opted by host genomes to create new gene expression patterns. For example, LTRs are known to drive the expression of genes in mammalian oocytes and early embryos, contributing to the evolution of developmental programs.
The same mechanisms that drive evolution can also lead to disease. A proto-oncogene is a gene that can cause cancer if its expression is increased. When an LTR inserts near one, the LTR’s promoter can drive excess expression of that gene and, with it, uncontrolled cell growth. An LTR inserting into a tumor suppressor gene can also disrupt its function, removing a safeguard against cancer. This process is known as insertional mutagenesis.
The link between LTRs and disease is not limited to new insertions. The reactivation of ancient, dormant ERVs and LTRs has been implicated in a range of illnesses. In some autoimmune diseases, the expression of viral proteins from these elements may trigger an immune response against the body’s own tissues. There is also emerging evidence linking the reactivation of these elements to neurodegenerative conditions.