Complementary DNA, or cDNA, is a synthetic form of DNA used in molecular biology, created from a messenger RNA (mRNA) template through reverse transcription. Unlike RNA, which is chemically unstable and easily degraded, cDNA is robust and, once a second strand has been synthesized, double-stranded, which makes it suitable for many molecular techniques. Converting the full set of mRNA molecules in a sample, known as the transcriptome, into cDNA allows researchers to study gene expression and function.
Prerequisite: First Strand Synthesis
Before the second strand of cDNA can be created, the first strand must be synthesized from an mRNA template. The process begins with isolating mRNA from a cell or tissue sample, which represents all the genes actively being expressed at the time of collection. Once isolated, the mRNA serves as a template for an enzyme called reverse transcriptase, which reads the RNA sequence and synthesizes a complementary strand of DNA.
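As a rough illustration of the base-pairing logic described above, the following Python sketch builds the first-strand sequence from an mRNA sequence. The function name and the example sequence are hypothetical, and the model ignores everything about the enzyme except complementarity.

```python
# Toy model of first-strand synthesis: each RNA base is paired with its
# complementary DNA base (A->T, U->A, G->C, C->G).
RNA_TO_DNA_COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}

def first_strand_cdna(mrna: str) -> str:
    """Return the first-strand cDNA, written 5'->3', for an mRNA written 5'->3'."""
    mrna = mrna.upper()
    # The two strands are antiparallel, so the mRNA is read in reverse
    # to write the new DNA strand in the 5'->3' direction.
    return "".join(RNA_TO_DNA_COMPLEMENT[base] for base in reversed(mrna))

example_mrna = "AUGGCCUAA"  # made-up sequence, for illustration only
print(first_strand_cdna(example_mrna))  # TTAGGCCAT
```

The result is the reverse complement of the mRNA, which is why the first strand is sometimes called the antisense strand.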
To start this synthesis, a short piece of DNA called a primer is required. Oligo-dT primers bind to the poly-A tail found on most eukaryotic mRNAs, so that synthesis is largely restricted to polyadenylated transcripts. Alternatively, random hexamers are short primers that bind at various points along the RNA, allowing reverse transcription of RNAs that may lack a poly-A tail. For studies focused on a single gene, a gene-specific primer can be designed to target and reverse transcribe only the mRNA of interest. The result of this first step is a hybrid molecule composed of the original mRNA strand and the newly synthesized single strand of DNA.
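The difference between oligo-dT and gene-specific priming can be sketched in Python as a question of where a primer finds a complementary site on the mRNA. This is a simplified model: the function names, the minimum tail length of 10, and the example sequences are assumptions made for illustration, and real annealing depends on temperature and mismatches that are not modelled here.

```python
import re

# DNA primer base -> the RNA base it pairs with
DNA_TO_RNA_COMPLEMENT = {"A": "U", "T": "A", "G": "C", "C": "G"}

def poly_a_tail_start(mrna: str, min_len: int = 10):
    """Return the 0-based index where a 3' poly-A tail begins, or None.

    min_len is an arbitrary threshold chosen for this sketch.
    """
    match = re.search(r"A{%d,}$" % min_len, mrna.upper())
    return match.start() if match else None

def gene_specific_site(mrna: str, primer: str):
    """Return the index of the mRNA site a DNA primer (5'->3') can anneal to, or None."""
    # The primer anneals antiparallel, so the target is its reverse complement.
    target = "".join(DNA_TO_RNA_COMPLEMENT[b] for b in reversed(primer.upper()))
    index = mrna.upper().find(target)
    return index if index >= 0 else None

mrna = "AUGGCCUAA" + "A" * 12                # made-up transcript with a poly-A tail
print(poly_a_tail_start(mrna))               # where an oligo-dT primer would bind
print(gene_specific_site(mrna, "TTAGGC"))    # where this hypothetical primer would bind
```

An RNA without a terminal run of A residues returns None from the first function, which mirrors why random hexamers are used for such transcripts.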
The Gubler-Hoffman Procedure
The most widely used method for synthesizing the second strand of cDNA is the Gubler-Hoffman procedure. The process begins with the DNA-RNA hybrid molecule produced during first-strand synthesis. The first enzyme in this procedure is RNase H, which specifically cleaves the RNA strand of the hybrid molecule. RNase H does not degrade the RNA strand completely; instead, it nicks it into short fragments that remain annealed to the DNA and serve as primers for the next step.
With the RNA primers in place, DNA Polymerase I is introduced into the reaction. This enzyme binds to the RNA fragments and begins synthesizing a new DNA strand, using the first cDNA strand as a template. As DNA Polymerase I moves along the template, it simultaneously removes the RNA primers ahead of it using its 5′ to 3′ exonuclease activity. This action is often described as nick translation.
This process leaves “nicks” in the newly synthesized strand: breaks in the sugar-phosphate backbone between adjacent stretches of DNA, at the points where synthesis from one primer meets the DNA made from the next. To resolve this, a final enzyme, DNA Ligase, is added to the mixture. DNA Ligase forms the missing phosphodiester bonds, sealing the nicks and creating a continuous, double-stranded cDNA molecule.
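The three enzymatic steps can be followed in a toy Python model. It is a sketch under strong assumptions: the function names are invented for illustration, the nick positions are supplied by hand rather than predicted, and every RNA fragment is treated as fully replaced by DNA, which glosses over what happens to the fragment at the very 5′ end.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_complement(dna: str) -> str:
    return "".join(COMPLEMENT[b] for b in reversed(dna))

def rnase_h_nick(mrna: str, nick_positions: list[int]) -> list[str]:
    """Cut the RNA strand of the hybrid into fragments that stay annealed."""
    cuts = [0] + sorted(nick_positions) + [len(mrna)]
    return [mrna[start:end] for start, end in zip(cuts, cuts[1:])]

def pol_i_nick_translation(rna_fragments: list[str]) -> list[str]:
    """Replace each RNA fragment with DNA of the same sequence (U -> T).

    Simplification: all fragments are replaced; the output is a list of
    DNA segments that are still separated by nicks.
    """
    return [fragment.replace("U", "T") for fragment in rna_fragments]

def ligase_seal(dna_segments: list[str]) -> str:
    """Join adjacent DNA segments into one continuous strand."""
    return "".join(dna_segments)

mrna = "AUGGCCUAAGG"                                  # made-up sequence
first_strand = reverse_complement(mrna.replace("U", "T"))
fragments = rnase_h_nick(mrna, [4, 8])                # hand-picked nick sites
second_strand = ligase_seal(pol_i_nick_translation(fragments))

# The finished second strand must pair with the first strand along its length.
assert reverse_complement(second_strand) == first_strand
print(fragments, second_strand)
```

The final check reflects the point of the procedure: the second strand carries the same sequence as the original mRNA, with T in place of U, and is fully complementary to the first strand.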
Alternative Second Strand Methods
While the Gubler-Hoffman procedure is common, other methods exist for synthesizing the second strand of cDNA. One technique is self-priming, where the 3′ end of the newly synthesized single-stranded DNA folds back on itself, forming a hairpin loop. This loop acts as a primer, allowing DNA polymerase to initiate synthesis of the second strand without external primers. After the second strand is complete, the hairpin loop must be removed using S1 nuclease, an enzyme that digests single-stranded nucleic acids. Because the loop forms at the 3′ end of the first strand, which corresponds to the 5′ end of the original mRNA, this cleavage step can result in the loss of sequence information from that end of the transcript.
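A small Python sketch can show why hairpin removal costs sequence. It assumes, purely for illustration, that the last few bases at the 3′ end of the first strand form the unpaired loop and are all digested by S1 nuclease; the function name and the loop length are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_complement(dna: str) -> str:
    return "".join(COMPLEMENT[b] for b in reversed(dna))

def self_primed_ds_cdna(first_strand: str, loop_len: int = 3):
    """Return (kept first strand, second strand, bases lost to S1 nuclease).

    Assumption: the 3'-terminal loop_len bases of the first strand make up
    the single-stranded loop and are removed entirely.
    """
    if not 0 < loop_len < len(first_strand):
        raise ValueError("loop_len must be between 1 and len(first_strand) - 1")
    kept_first = first_strand[:-loop_len]
    lost = first_strand[-loop_len:]
    second = reverse_complement(kept_first)
    return kept_first, second, lost

# First strand made from the made-up mRNA AUGGCCUAA
kept, second, lost = self_primed_ds_cdna("TTAGGCCAT", loop_len=3)
print(kept, second, lost)  # TTAGGC GCCTAA CAT
```

In this example the lost bases CAT are the complement of AUG, the first three bases of the mRNA, so the surviving second strand GCCTAA no longer contains the 5′ end of the transcript.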
Another alternative is the homopolymer tailing method. This process uses an enzyme called terminal transferase to add a string of identical nucleotides, such as a poly-G tail, to the 3′ end of the first cDNA strand. A complementary primer, in this case, an oligo-dC primer, is then introduced to anneal to this tail. This provides a starting point for DNA polymerase to synthesize the second strand.
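The tailing approach can be sketched in Python in the same simplified way. The function names, the tail length, and the example sequence are assumptions for illustration; in practice terminal transferase adds tails of variable length.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_complement(dna: str) -> str:
    return "".join(COMPLEMENT[b] for b in reversed(dna))

def add_homopolymer_tail(first_strand: str, base: str = "G", length: int = 5) -> str:
    """Append a run of one nucleotide to the 3' end (the terminal transferase step)."""
    return first_strand + base * length

def second_strand_from_primer(tailed_first: str, primer: str) -> str:
    """Synthesize the second strand from a primer annealed to the 3' tail."""
    # The primer must be complementary to the end of the tailed strand,
    # for example oligo-dC against a poly-G tail.
    if reverse_complement(primer) != tailed_first[-len(primer):]:
        raise ValueError("primer does not anneal to the 3' tail")
    # Extension from the primer copies the whole template.
    return reverse_complement(tailed_first)

first_strand = "TTAGGCCAT"   # made from the made-up mRNA AUGGCCUAA
tailed = add_homopolymer_tail(first_strand, base="G", length=5)
second = second_strand_from_primer(tailed, primer="CCCCC")
print(tailed, second)  # TTAGGCCATGGGGG CCCCCATGGCCTAA
```

Unlike the hairpin approach, the second strand here still contains ATG, the region corresponding to the 5′ end of the mRNA, preceded by the primer-derived run of C residues.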
Applications of Double-Stranded cDNA
The creation of double-stranded cDNA is a preparatory step for various molecular applications. One primary use is the construction of cDNA libraries. These are collections of all the expressed genes from a particular cell type or tissue, stored in a stable, clonable format. By inserting the ds-cDNA molecules into vectors like plasmids, researchers can create a resource that can be screened to identify and isolate specific genes.
For more focused research, a specific ds-cDNA molecule representing a single gene can be isolated and cloned. This allows for the production of large quantities of a particular gene for detailed study. For instance, the cloned gene can be inserted into expression vectors to produce its corresponding protein in bacterial or eukaryotic cells. This is a common approach for studying protein function, structure, and interactions.
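To make the link between a cloned cDNA and its protein concrete, the following Python sketch translates the coding strand of a cDNA using the standard genetic code. The function name and example sequence are hypothetical, and the sketch assumes the sequence starts in frame at the first base; it does not search for a start codon or model the expression vector.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(coding_strand: str) -> str:
    """Translate a cDNA coding strand (5'->3') until the first stop codon."""
    protein = []
    seq = coding_strand.upper()
    for i in range(0, len(seq) - 2, 3):
        amino_acid = CODON_TABLE[seq[i:i + 3]]
        if amino_acid == "*":   # stop codon ends translation
            break
        protein.append(amino_acid)
    return "".join(protein)

print(translate("ATGGCCTAA"))  # MA
```

The one-letter output MA stands for methionine followed by alanine; the TAA codon terminates translation.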
Double-stranded cDNA is also the starting material for modern sequencing technologies, such as RNA-Seq. This high-throughput method sequences the entire transcriptome of a sample, providing a comprehensive snapshot of gene expression levels. Researchers can quantify the relative abundance of each transcript, discover new transcript variants, and analyze the complex landscape of gene regulation.
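One common way to express transcript abundance from RNA-Seq read counts is transcripts per million (TPM), which corrects for transcript length and sequencing depth. TPM is not named in the text above; it is used here as an assumed example of quantification, and the gene names, counts, and lengths are invented.

```python
def tpm(read_counts: dict[str, float], lengths_bp: dict[str, float]) -> dict[str, float]:
    """Compute transcripts per million from raw read counts and transcript lengths."""
    # Step 1: reads per base, so long transcripts are not favoured.
    rates = {name: read_counts[name] / lengths_bp[name] for name in read_counts}
    total = sum(rates.values())
    if total == 0:
        return {name: 0.0 for name in rates}
    # Step 2: scale so that all values sum to one million.
    return {name: rate / total * 1_000_000 for name, rate in rates.items()}

counts = {"geneA": 100, "geneB": 300}      # hypothetical read counts
lengths = {"geneA": 1000, "geneB": 3000}   # hypothetical transcript lengths in bp
print(tpm(counts, lengths))
```

In this example geneB has three times the reads of geneA but is also three times as long, so both receive the same TPM value, which illustrates why raw counts alone are not comparable between transcripts.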