A paralog is a type of homologous gene. Homologous genes are genes that share a common evolutionary ancestor. Paralogs are the product of a gene duplication event that creates a second copy of a gene within a single genome. This process is somewhat like a publisher releasing two slightly different editions of the same book: both versions originated from the same manuscript but can be altered independently. Over time, these initially redundant genes can diverge, leading to different, though often related, functions.
The Origin of Paralogs
Paralogs arise from gene duplication, in which a segment of DNA containing a gene is copied. One of the most common mechanisms is unequal crossing-over, an error during meiosis, the cell division that produces reproductive cells. It occurs when homologous chromosomes misalign and exchange segments of unequal length. As a result, one chromosome gains a duplicated copy of a gene while the other loses that gene.
While unequal crossing-over is a primary source, other mechanisms can also generate paralogs. One such process is retrotransposition, where an mRNA molecule is reverse-transcribed back into DNA and then inserted into a new location in the genome. Both mechanisms create an extra gene copy that can then evolve independently.
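The outcome of unequal crossing-over described above can be illustrated with a toy model. The sketch below is only an illustration: it assumes each chromosome can be represented as a list of gene labels, and the function name unequal_crossover and its breakpoint parameters are hypothetical, not part of any real library.

```python
def unequal_crossover(chrom_a, chrom_b, break_a, break_b):
    """Swap the tails of two chromosomes cut at (possibly different) positions.

    Toy model: each chromosome is a list of gene labels. chrom_a is cut after
    its first break_a genes and chrom_b after its first break_b genes.
    """
    recombinant_1 = chrom_a[:break_a] + chrom_b[break_b:]
    recombinant_2 = chrom_b[:break_b] + chrom_a[break_a:]
    return recombinant_1, recombinant_2


# Two homologous chromosomes carrying the same three genes.
chromosome = ["A", "B", "C"]

# Misalignment: the cut falls after gene B on one chromosome
# but after gene A on the other.
gained, lost = unequal_crossover(chromosome, chromosome, break_a=2, break_b=1)
print(gained)  # ['A', 'B', 'B', 'C']  (gene B is now duplicated)
print(lost)    # ['A', 'C']            (gene B has been lost)
```

When both breakpoints are equal, the chromosomes are aligned and neither product gains or loses a gene. Only the mismatch between breakpoints produces the duplication.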
Evolutionary Fate of Paralogous Genes
Once a gene is duplicated, only one copy is needed for the original function. This frees the second copy from previous selective pressures, allowing it to accumulate mutations that can lead to one of three main outcomes. One fate is pseudogenization, where the duplicated gene acquires mutations that render it non-functional, becoming a “fossil” within the genome.
A second outcome is neofunctionalization, where one gene copy mutates to develop a new function while the other performs the original role. The third possibility is subfunctionalization. Here, both genes accumulate mutations that cause them to specialize, dividing the ancestral functions between them so that together they perform the role of the single ancestral gene.
A classic example is the evolution of the globin gene family. An ancient duplication event in a vertebrate ancestor gave rise to the genes for myoglobin and hemoglobin. Myoglobin stores oxygen in muscle tissue, while hemoglobin carries oxygen in red blood cells. Hemoglobin itself is composed of different globin proteins, such as alpha-globin and beta-globin. The genes for these proteins are also paralogs, produced by later duplication events, and their products now work together.
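The three fates described in this section can be told apart with a deliberately simplified model. The sketch below assumes that a gene's role can be written as a set of function labels. That is an assumption made for illustration only, and the function classify_fate and the labels "x", "y", and "z" are hypothetical.

```python
def classify_fate(ancestral, copy_1, copy_2):
    """Label the outcome of a duplication in a toy model.

    Each argument is a set of function labels: the functions of the
    ancestral gene and of the two copies after they have diverged.
    """
    # A copy with no remaining function is treated as a pseudogene.
    if not copy_1 or not copy_2:
        return "pseudogenization"
    # A copy that performs something the ancestor did not has a new function.
    if (copy_1 - ancestral) or (copy_2 - ancestral):
        return "neofunctionalization"
    # Each copy keeps only part of the ancestral role, and together
    # they still cover all of it.
    if copy_1 < ancestral and copy_2 < ancestral and (copy_1 | copy_2) == ancestral:
        return "subfunctionalization"
    # Anything else, such as two copies that are still fully redundant.
    return "unresolved"


ancestral = {"x", "y"}
print(classify_fate(ancestral, {"x", "y"}, set()))       # pseudogenization
print(classify_fate(ancestral, {"x", "y"}, {"z"}))       # neofunctionalization
print(classify_fate(ancestral, {"x"}, {"y"}))            # subfunctionalization
print(classify_fate(ancestral, {"x", "y"}, {"x", "y"}))  # unresolved
```

Real genes do not divide this cleanly, so the sketch is best read as a restatement of the definitions rather than as a method for analyzing sequence data.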
Distinguishing Paralogs from Orthologs
A common point of confusion is the difference between paralogs and orthologs, as both are types of homologous genes. The distinction lies in the evolutionary event that created them. Paralogs arise from a gene duplication event within a single species. In contrast, orthologs are genes found in different species that diverged because of a speciation event, when one species splits into two. Orthologous genes typically maintain the same function in the different species.
For example, the human and chimpanzee genes for a given hemoglobin chain, such as beta-globin, are orthologs. They diverged when the human and chimpanzee lineages split from a common ancestor. By comparison, the human hemoglobin genes and the human myoglobin gene are paralogs. They exist together in the human genome because of a gene duplication event that occurred in an ancient ancestor, long before humans and chimpanzees diverged.
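The distinction can be stated as a rule on a gene tree: find the most recent common ancestor of two genes and check which kind of event that node represents. The sketch below assumes a small hand-built tree for the hemoglobin and myoglobin example. The Node class, the classify function, and all labels are hypothetical names chosen for illustration, and the tree is greatly simplified.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    label: str
    event: Optional[str] = None  # "duplication" or "speciation"; None for leaves
    children: List["Node"] = field(default_factory=list)


def path_to(node, target, path=None):
    """Return the list of nodes from `node` down to the node labeled `target`."""
    path = (path or []) + [node]
    if node.label == target:
        return path
    for child in node.children:
        found = path_to(child, target, path)
        if found:
            return found
    return None


def classify(root, gene_a, gene_b):
    """Classify two genes by the event at their most recent common ancestor."""
    if gene_a == gene_b:
        raise ValueError("need two different genes")
    path_a, path_b = path_to(root, gene_a), path_to(root, gene_b)
    if path_a is None or path_b is None:
        raise ValueError("gene not found in tree")
    # Walk both root-to-leaf paths together; the last shared node is the ancestor.
    ancestor = None
    for a, b in zip(path_a, path_b):
        if a is not b:
            break
        ancestor = a
    return "paralogs" if ancestor.event == "duplication" else "orthologs"


# Simplified tree: an ancient duplication, then a speciation in each lineage.
tree = Node("globin_ancestor", "duplication", [
    Node("hemoglobin_ancestor", "speciation",
         [Node("human_hemoglobin"), Node("chimp_hemoglobin")]),
    Node("myoglobin_ancestor", "speciation",
         [Node("human_myoglobin"), Node("chimp_myoglobin")]),
])

print(classify(tree, "human_hemoglobin", "chimp_hemoglobin"))  # orthologs
print(classify(tree, "human_hemoglobin", "human_myoglobin"))   # paralogs
```

Under this rule, a pair such as human hemoglobin and chimpanzee myoglobin is also classed as paralogous, because the two genes trace back to the duplication node rather than to a speciation node.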