What Is Cleavage and Polyadenylation in Gene Expression?

In the journey from a gene encoded in DNA to a functional protein, the messenger RNA (mRNA) undergoes several modifications. Within the cell’s nucleus, cleavage and polyadenylation acts as a finishing step for most of these mRNA molecules. This two-part procedure first cuts the new RNA transcript at a specific point and then adds a long, repetitive tail of adenine bases. This modification converts a preliminary pre-mRNA molecule into a mature, stable transcript, prepared for export from the nucleus and translation into a protein in the cytoplasm.

The Machinery and Steps of mRNA Processing

The process of 3′-end cleavage and polyadenylation is guided by specific sequences within the pre-mRNA transcript. The cellular machinery identifies a key polyadenylation signal, a six-nucleotide sequence of AAUAAA, which typically lies 10 to 30 nucleotides upstream of where the RNA will be cut. Further downstream from the cleavage site, a U- or GU-rich element also plays a part in the stable assembly of the processing machinery.
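The signal layout described above can be illustrated with a short Python sketch. It scans a sequence for AAUAAA hexamers, marks the window 10 to 30 nucleotides downstream as the candidate cleavage region, and checks whether the stretch beyond that window is rich in G and U. The function name, the 30-nucleotide downstream window, and the 0.6 threshold are illustrative assumptions, not values from the text, and real poly(A) site prediction is considerably more involved.

```python
import re

PAS = "AAUAAA"  # canonical polyadenylation signal named in the text


def find_candidate_sites(rna, min_gap=10, max_gap=30,
                         dse_window=30, dse_threshold=0.6):
    """Return one dict per AAUAAA hexamer with its candidate cleavage window.

    Assumptions (hypothetical, for illustration only):
      - the 10-30 nt distance is measured from the end of the hexamer;
      - a downstream element counts as present when at least
        `dse_threshold` of the `dse_window` bases past the window are G or U.
    """
    rna = rna.upper().replace("T", "U")  # accept DNA-style input as well
    sites = []
    # A lookahead pattern lets overlapping hexamers be reported too.
    for match in re.finditer(r"(?=AAUAAA)", rna):
        pas_start = match.start()
        pas_end = pas_start + len(PAS)
        win_start = pas_end + min_gap
        if win_start >= len(rna):
            continue  # no room left downstream for a cut
        win_end = min(pas_end + max_gap, len(rna))
        downstream = rna[win_end:win_end + dse_window]
        gu_count = sum(base in "GU" for base in downstream)
        gu_fraction = gu_count / len(downstream) if downstream else 0.0
        sites.append({
            "pas_start": pas_start,
            "cleavage_window": (win_start, win_end),
            "gu_u_fraction": gu_fraction,
            "has_dse": gu_fraction >= dse_threshold,
        })
    return sites


if __name__ == "__main__":
    # Toy transcript: filler, signal, spacer, then a GU-rich stretch.
    toy = "C" * 20 + "AAUAAA" + "C" * 30 + "GU" * 15
    for site in find_candidate_sites(toy):
        print(site)
```

Changing a single base of the hexamer in the toy sequence (for example AAUAAA to AACAAA) makes the function return no candidates, which mirrors how sensitive recognition is to the exact signal.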

Once these signals are transcribed, a large collection of proteins assembles on the transcript. The Cleavage and Polyadenylation Specificity Factor (CPSF) complex is responsible for directly recognizing and binding to the AAUAAA signal sequence. Shortly thereafter, the Cleavage Stimulation Factor (CstF) complex binds to the downstream U/GU-rich element.

The cooperative binding of CPSF and CstF forms a stable platform on the pre-mRNA. This assembly also includes other cleavage factors, like CFI and CFII, and a scaffolding protein called symplekin, which helps organize the components. This arrangement of proteins bends the RNA, bringing the key signal sequences into proximity and preparing the transcript for the cleavage reaction.

With the machinery firmly in place, the cleavage event is executed. An endonuclease enzyme within the CPSF complex, known as CPSF-73, cuts the pre-mRNA chain between the upstream AAUAAA signal and the downstream U/GU-rich element. This action liberates the main body of the mRNA from the rest of the transcript that is still being synthesized, creating a new 3′ end.

Following the cut, an enzyme called Poly(A) Polymerase (PAP) is recruited to the complex. PAP then synthesizes a long chain of adenine nucleotides, adding them to the newly created 3′ end. This occurs in two phases: an initial slow addition of about 12 adenine residues, followed by a rapid phase that can extend the tail to 100-250 nucleotides. The rapid phase is stimulated by the Nuclear Poly(A) Binding Protein (PABPN1), which coats the growing tail.

Functions of the Poly(A) Tail

The newly synthesized poly(A) tail performs several functions related to the lifespan and utility of the mRNA molecule. One of its primary roles is to protect the mRNA from degradation. The cytoplasm is filled with exonucleases, enzymes that degrade RNA molecules from their ends, and the poly(A) tail acts as a protective buffer, shielding the protein-coding sequence from this enzymatic destruction.

The length of the poly(A) tail is directly correlated with the stability of the mRNA. Over time, in a process known as deadenylation, the tail is gradually shortened by cytoplasmic enzymes. Once the tail is reduced to a certain minimal length, the mRNA becomes a target for rapid degradation, effectively ending its functional life.

Beyond protection, the poly(A) tail is instrumental in getting the mRNA out of the nucleus. For a mature mRNA to be translated, it must first be transported through the nuclear pore complex into the cytoplasm. The poly(A) tail, along with its associated Poly(A)-Binding Proteins (PABPs), is recognized by nuclear export machinery, serving as a quality-control checkpoint that ensures only properly processed mRNAs are permitted to exit.

Once in the cytoplasm, the poly(A) tail plays an active role in initiating protein synthesis. The PABP bound to the tail can interact with proteins at the 5′ cap of the mRNA, including the initiation factor eIF4G. This interaction creates a circular structure, bringing the beginning and end of the mRNA molecule close together. This closed-loop formation promotes the efficient recruitment of ribosomal subunits, allowing for repeated rounds of translation.

Alternative Polyadenylation as a Regulatory Layer

The process of cleavage and polyadenylation is not always fixed for a given gene. Many genes contain more than one potential polyadenylation site, giving rise to a regulatory mechanism known as alternative polyadenylation (APA). This means the cellular machinery can choose to cleave and polyadenylate the transcript at different locations, producing multiple distinct mRNA isoforms from a single gene.

The most common consequence of APA is the generation of mRNAs with different 3′ untranslated region (3′ UTR) lengths. The 3′ UTR is the portion of the mRNA that follows the stop codon of the protein-coding sequence. When an upstream polyadenylation site is used, it results in an mRNA with a shorter 3′ UTR, while using a downstream site produces a transcript with a longer 3′ UTR.
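The relationship between site choice and 3′ UTR length is simple arithmetic, sketched below in Python. The coordinates are made up for illustration, and the convention used (0-based positions, with the UTR starting at the first nucleotide after the stop codon) is an assumption of this sketch rather than something the text specifies.

```python
def utr_lengths(stop_codon_end, polya_sites):
    """Map each cleavage position to the 3' UTR length it would produce.

    stop_codon_end: index of the first nucleotide after the stop codon.
    polya_sites:    cleavage positions, each marking where the transcript ends.
    Both use the same hypothetical 0-based transcript coordinates.
    """
    lengths = {}
    for site in sorted(polya_sites):
        if site <= stop_codon_end:
            # A site at or before the end of the stop codon would remove
            # coding sequence; that case is not what this helper models.
            raise ValueError(f"site {site} is not downstream of the stop codon")
        lengths[site] = site - stop_codon_end
    return lengths


if __name__ == "__main__":
    # Hypothetical gene: stop codon ends at 1200, with a proximal
    # site at 1500 and a distal site at 2800.
    for site, length in utr_lengths(1200, [2800, 1500]).items():
        print(f"cleavage at {site}: 3' UTR of {length} nt")
```

In this toy case the proximal site yields a 300-nucleotide 3′ UTR and the distal site a 1,600-nucleotide one, while the protein-coding sequence is identical in both isoforms.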

These differences in 3′ UTR length are significant for gene regulation. The 3′ UTR is a hub for regulatory elements, containing binding sites for microRNAs and various RNA-binding proteins (RBPs) that can influence the mRNA’s stability and translation efficiency. By selecting a proximal polyadenylation site, a transcript can eliminate downstream regulatory sites, potentially escaping repression by certain microRNAs or RBPs and leading to higher protein production.
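To make the loss of regulatory sites concrete, the following Python sketch counts how many copies of a binding motif survive in a short versus a long 3′ UTR isoform. The motif and the toy sequence are hypothetical placeholders, and exact string matching is a deliberate simplification of how microRNAs and RBPs actually recognize their targets.

```python
def motif_positions(utr, motif):
    """Return the start index of every (possibly overlapping) motif match."""
    positions = []
    start = utr.find(motif)
    while start != -1:
        positions.append(start)
        start = utr.find(motif, start + 1)
    return positions


def sites_retained(full_utr, motif, utr_length):
    """Motif positions that remain when the 3' UTR is cut to `utr_length`.

    Only sites lying entirely within the shortened UTR are counted.
    """
    isoform = full_utr[:utr_length]
    return motif_positions(isoform, motif)


if __name__ == "__main__":
    motif = "CUACCUC"  # hypothetical target-site sequence, for illustration
    # Toy 3' UTR with one site near the start and one far downstream.
    full_utr = "A" * 50 + motif + "A" * 100 + motif + "A" * 40

    print("distal isoform:  ", sites_retained(full_utr, motif, len(full_utr)))
    print("proximal isoform:", sites_retained(full_utr, motif, 120))
```

Here the distal isoform keeps both sites, while the proximal isoform, cut at 120 nucleotides, keeps only the first, so any repression acting through the downstream site no longer applies to it.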

While changing the 3′ UTR is the primary outcome, a less common form of APA can directly alter the protein that is produced. This occurs when an alternative polyadenylation site is located within an intron, a region normally spliced out. If this intronic site is used, it causes premature termination of the transcript, leading to a truncated protein that may have a different function or be non-functional.

Consequences of Dysregulation in Human Health

Given the importance of precise cleavage and polyadenylation, errors in this process can lead to human disease. The system’s integrity relies on both the signal sequences within the pre-mRNA and the protein factors that make up the processing machinery. Mutations that disrupt either of these components can result in faulty mRNA processing, leading to the production of abnormal or insufficient amounts of protein.

One example is found in certain forms of thalassemia, a group of inherited blood disorders characterized by reduced hemoglobin production. In some cases, the disease is caused by a single nucleotide mutation directly within the AAUAAA polyadenylation signal of the beta-globin gene. This mutation prevents efficient recognition and cleavage at the correct site. Transcription instead continues past the normal site, and the resulting elongated, unstable mRNA yields too little beta-globin protein.

The dysregulation of alternative polyadenylation has also been implicated in cancer. Many cancer cells exhibit a global trend towards 3′ UTR shortening by preferentially using proximal polyadenylation sites. The resulting shorter transcripts may lack binding sites for tumor-suppressing microRNAs, allowing cancer cells to increase the expression of oncogenes that drive proliferation. Altered expression of core polyadenylation factors, such as CSTF2, has been linked to these APA shifts across various cancers.
