AAV Integration: Implications for Long-Term Genomic Stability
Explore how AAV integration affects genomic stability, the role of host repair mechanisms, and the factors influencing site selection across serotypes.
Adeno-associated viruses (AAVs) are widely used in gene therapy for their ability to deliver genetic material with low immunogenicity. While AAVs primarily remain as episomes in the nucleus, they can integrate into the host genome at low frequencies, raising concerns about potential genomic disruptions in long-term therapeutic applications. Understanding the mechanisms and factors influencing AAV integration is crucial for assessing its safety.
AAVs have a compact, single-stranded DNA genome of approximately 4.7 kilobases within a non-enveloped icosahedral capsid. The genome contains two primary open reading frames (ORFs): rep, encoding proteins for replication and genome processing, and cap, encoding structural proteins forming the viral shell. Flanking these ORFs are inverted terminal repeats (ITRs), essential for genome replication, packaging, and integration.
The capsid, composed of VP1, VP2, and VP3 proteins in a 60-subunit structure, determines the virus’s tropism and stability. VP3 is the most abundant, followed by VP2 and VP1. The capsid’s external surface influences receptor binding and cellular entry, affecting transduction efficiency across different cell types. Variations in capsid protein composition create distinct AAV serotypes with unique tissue tropisms, which are leveraged in gene therapy to enhance targeting specificity and minimize off-target effects.
Once inside the host cell, the AAV genome is released into the nucleus, where it predominantly remains as an episome. However, the secondary structure of the ITRs, which fold into T-shaped hairpins, can facilitate integration by engaging host DNA repair mechanisms. Recombinant AAV vectors used in gene therapy therefore omit the rep gene, with the aim of minimizing integration while preserving episomal persistence.
The rep gene encodes four proteins (Rep78, Rep68, Rep52, and Rep40), produced through alternative splicing and differential promoter usage. The large proteins, Rep78 and Rep68, play key roles in genome replication and site-specific integration, while the smaller Rep52 and Rep40 mainly support packaging of the genome into capsids. Rep78 and Rep68 possess helicase and endonuclease activities, which allow AAV to integrate its DNA into the host genome under specific conditions.
Rep78 and Rep68 bind to the ITRs, facilitating replication and resolution of viral DNA. They also mediate site-specific integration at the AAVS1 locus on human chromosome 19 by recognizing a conserved sequence, cleaving the DNA, and promoting strand transfer. This mechanism distinguishes wild-type AAV from many other integrating viruses, whose random insertion carries a higher risk of insertional mutagenesis. In the absence of Rep proteins, integration is infrequent and occurs randomly across the genome.
Beyond integration, Rep proteins interact with host DNA repair pathways to support viral genome persistence. They bind to cellular transcription factors, potentially influencing gene expression, and exhibit ATP-dependent helicase activity, which is essential for unwinding DNA during replication and integration. While these functions benefit the natural AAV life cycle, they pose challenges for gene therapy: expressing Rep proteins alongside a recombinant vector increases the likelihood of integration, raising concerns about unintended genomic modifications.
Once inside the nucleus, the AAV genome relies on host DNA repair machinery for processing, as it lacks autonomous replication enzymes. The ITRs at the genome’s ends make it susceptible to repair mechanisms, which can either maintain it as an episome or facilitate rare integration events.
Non-homologous end joining (NHEJ) is the primary pathway involved in AAV genome processing. This mechanism repairs double-strand breaks by directly ligating DNA ends, sometimes resulting in AAV sequence incorporation into chromosomal DNA. NHEJ is error-prone, often causing small insertions or deletions at junction sites. In the absence of Rep proteins, NHEJ-mediated integration occurs sporadically, increasing genomic disruption risks. An alternative end-joining pathway, microhomology-mediated end joining (MMEJ), also contributes to integration, particularly in cells with defective classical NHEJ components.
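The distinction between the two end-joining pathways can be illustrated with a minimal Python sketch that measures microhomology at a vector-host junction. The function names, the input convention, and the 2 bp cutoff are assumptions made for this illustration; they are not taken from any published pipeline, and a real classification would need more evidence than junction sequence alone.

```python
def microhomology_length(vector_past_break: str, host_at_junction: str) -> int:
    """Count leading bases shared by the vector reference continuing past its
    breakpoint and the host sequence retained at the junction."""
    shared = 0
    for v, h in zip(vector_past_break.upper(), host_at_junction.upper()):
        if v != h:
            break
        shared += 1
    return shared


def classify_junction(mh_len: int, mmej_min: int = 2) -> str:
    """Label a junction by microhomology length.
    mmej_min is an illustrative threshold, not an established standard."""
    return "MMEJ-like" if mh_len >= mmej_min else "NHEJ-like"


# Made-up sequences for illustration only.
mh = microhomology_length("ACGTTT", "ACGGAA")
label = classify_junction(mh)
```

With these toy inputs the first three bases match, so the junction is labeled MMEJ-like under the assumed cutoff. Junctions with little or no shared sequence would fall into the NHEJ-like class.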
Homology-directed repair (HDR), which requires a homologous template, plays a minor role in AAV genome processing. While HDR is favored in dividing cells, AAV rarely integrates through this mechanism unless the vector carries sequence homologous to the target site. This feature has been utilized in gene editing applications, where AAV vectors carrying homology arms serve as donor templates for precise genome modifications. In natural infections or standard gene therapy vectors, HDR-mediated integration is negligible.
AAV integration into the host genome is rare, but certain genomic regions are affected more often than others. The best-documented integration hotspot is the AAVS1 site on human chromosome 19. This locus, characterized by open chromatin and active transcription, is preferentially targeted in the presence of Rep proteins. The AAVS1 site lies within the PPP1R12C gene, which encodes a regulatory subunit of myosin phosphatase. Despite the gene's role in cytoskeletal dynamics, disruptions at this site appear to be well tolerated, making it a viable target for engineered gene insertion.
Beyond AAVS1, integration events occur in regions with active transcription, CpG islands, and repetitive elements. These sites are often associated with DNA double-strand breaks or stalled replication forks, suggesting that chromatin accessibility and repair processes influence AAV integration. Studies in animal models and human cells have identified recurrent insertions near genes involved in cell cycle regulation and tumor suppression, raising concerns about potential oncogenic effects. While random integrations are infrequent, the risk of insertional mutagenesis remains a consideration for long-term gene therapy.
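As a rough illustration of how recurrent insertions might be flagged, the Python sketch below bins mapped integration sites into fixed-size genomic windows and reports windows that hold several sites. The function name, the 100 kb window, the three-site threshold, and the coordinates are all hypothetical; published analyses typically use statistical models that account for background insertion rates.

```python
from collections import Counter


def find_hotspots(sites, window=100_000, min_sites=3):
    """Return {(chrom, window_start): count} for windows with >= min_sites insertions.
    window and min_sites are illustrative defaults, not recommended values."""
    counts = Counter((chrom, (pos // window) * window) for chrom, pos in sites)
    return {key: n for key, n in counts.items() if n >= min_sites}


# Invented chromosome names and coordinates, for illustration only.
example_sites = [
    ("chrA", 515_000), ("chrA", 517_500), ("chrA", 519_900),
    ("chrA", 1_200_000), ("chrB", 73_000),
]
hotspots = find_hotspots(example_sites)
```

Here the three clustered sites on the invented chromosome fall into the same 100 kb window and are reported together, while the isolated sites are ignored.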
Detecting AAV integration events requires highly sensitive molecular techniques because the events are infrequent and potentially distributed at random across the genome. Several methods help map insertion sites, quantify integration rates, and assess genomic disruptions.
Linear amplification-mediated PCR (LAM-PCR) selectively amplifies host-virus junctions, identifying integration sites with high sensitivity. This method has been instrumental in uncovering integration hotspots but can introduce amplification biases. More advanced approaches, such as high-throughput sequencing combined with inverse PCR, offer greater resolution and reduce artifacts.
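Once junction-containing reads have been sequenced, the vector portion has to be separated from the flanking host sequence before the site can be mapped. The Python sketch below shows that step in its simplest form, using exact string matching. It is a toy illustration under stated assumptions: the function name, the 12-base anchor requirement, and the sequences are invented, and real pipelines rely on dedicated aligners that tolerate mismatches.

```python
def split_chimeric_read(read: str, vector: str, min_anchor: int = 12):
    """Split a read that starts in vector sequence and runs into unknown flank.
    Returns (junction_offset, flank) or None if no convincing junction exists.
    min_anchor is an illustrative cutoff for the vector-matching prefix."""
    read, vector = read.upper(), vector.upper()
    anchor = 0
    # Try the longest prefix first, so the first hit is the longest match.
    for k in range(len(read), 0, -1):
        if read[:k] in vector:
            anchor = k
            break
    if anchor < min_anchor or anchor == len(read):
        return None  # too little vector sequence, or the read is all vector
    return anchor, read[anchor:]


# Made-up sequences; the "vector" is not a real ITR.
toy_vector = "ACGTACGTTAGCCTAGGATCCGATTACA"
toy_read = toy_vector[10:] + "TTTTGGGGCCCCAAAA"
junction = split_chimeric_read(toy_read, toy_vector)
```

The returned flank is the candidate host sequence, which would then be aligned to the reference genome to assign a chromosomal position.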
Whole-genome sequencing (WGS) provides an unbiased assessment of AAV insertions, capturing both site-specific and random events. While highly informative, WGS requires significant computational resources and deep sequencing coverage to detect rare integrations reliably. Complementary techniques like fluorescence in situ hybridization (FISH) and Southern blot analysis validate integration findings and provide spatial information on viral DNA within the nucleus. Combining these methods enhances the evaluation of AAV-based gene therapies.
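The need for deep coverage can be made concrete with a back-of-the-envelope calculation. The Python sketch below assumes that the number of reads spanning a given junction follows a Poisson distribution with mean equal to the sequencing depth multiplied by the fraction of genome copies carrying that insertion. This model and the parameter names are simplifying assumptions for illustration; the model ignores read length, mappability, and library artifacts.

```python
import math


def prob_detect(coverage: float, carrier_fraction: float, min_reads: int = 1) -> float:
    """Probability of seeing at least min_reads junction-spanning reads,
    under an assumed Poisson model with mean coverage * carrier_fraction."""
    lam = coverage * carrier_fraction
    p_fewer = sum(
        math.exp(-lam) * lam**i / math.factorial(i) for i in range(min_reads)
    )
    return 1.0 - p_fewer


# Illustrative inputs: 30x depth, insertion present in 10% vs 0.1% of genome copies.
p_common = prob_detect(30, 0.10)
p_rare = prob_detect(30, 0.001)
```

Under these assumptions, an insertion carried by 10% of genome copies is very likely to be sampled at 30x depth (about 95%), whereas one carried by 0.1% would be seen only about 3% of the time, which is why rare events demand much deeper sequencing or enrichment.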
Different AAV serotypes exhibit distinct integration behaviors due to variations in capsid structure, receptor binding, and intracellular trafficking. These differences influence transduction efficiency and integration likelihood, making serotype selection crucial for gene therapy.
AAV2, one of the most studied serotypes, preferentially integrates at the AAVS1 locus in the presence of Rep proteins. While this reduces random insertions, it concentrates them within the PPP1R12C gene, so the consequences of disrupting that locus still need to be considered. Other serotypes, such as AAV8 and AAV9, exhibit lower integration frequencies and primarily maintain their genomes as episomes. AAV8-based vectors, for example, result in minimal genomic insertions, making them favorable for long-term therapeutic use where genomic stability is a priority.
Integration frequency also varies by tissue type and cellular environment. AAV6 shows higher integration rates in hematopoietic cells, while AAV5 has a preference for airway epithelial cells. These differences likely stem from how each serotype interacts with host DNA repair pathways, influencing whether the viral genome remains episomal or integrates. Understanding serotype-specific behaviors is essential for optimizing gene therapy strategies, balancing efficient gene delivery with minimal genomic modifications.