What Is Incomplete Lineage Sorting in Genetics?

In genetics, incomplete lineage sorting (ILS) describes a scenario where the evolutionary history of a single gene does not match the evolutionary history of the species that carry it. Imagine a family tree where a specific trait, like eye color, doesn’t follow the expected inheritance pattern, appearing in distant relatives instead of direct descendants. This phenomenon, also known as deep coalescence or retention of ancestral polymorphism, means that genetic variation within an ancestral species was passed down to descendant species in a way that produces a “gene tree” differing from the accepted “species tree.”

The Genetic Inheritance Puzzle

Incomplete lineage sorting begins with ancestral polymorphism, meaning that a population of an ancient species contained multiple versions, or alleles, of a single gene. When a speciation event occurs, where one ancestral species diverges into two or more new species, these pre-existing ancestral alleles are then passed into the newly formed lineages.

Because genetic drift is random, different descendant species may inherit different combinations of these ancestral alleles. The concept of “coalescence” helps explain this process: it traces gene lineages backward in time until they converge on a single common ancestral allele. In cases of incomplete lineage sorting, gene lineages from different species fail to coalesce in the ancestral population they most recently shared and instead coalesce further back, before an earlier speciation event. When that happens, the order in which the gene copies merge can differ from the order in which the species split, producing a gene tree that appears to contradict the species tree.
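To make the idea of coalescence predating speciation concrete, here is a minimal simulation sketch. It assumes a hypothetical three-species tree ((A,B),C), one gene copy sampled per species, no gene flow, and time measured in coalescent units (generations divided by twice the effective population size). The function name and parameters are illustrative and not taken from any particular software package.

```python
import random


def simulate_discordance(internal_branch, n_genes=100_000, seed=0):
    """Estimate the fraction of genes whose tree disagrees with ((A,B),C).

    internal_branch: time between the two speciation events, in coalescent
    units (an assumption of this sketch).
    """
    rng = random.Random(seed)
    discordant = 0
    for _ in range(n_genes):
        # Waiting time for the A and B lineages to coalesce in the
        # ancestral population that only A and B share.
        wait = rng.expovariate(1.0)
        if wait < internal_branch:
            # Lineages sorted in time: gene tree matches the species tree.
            continue
        # Otherwise all three lineages reach the deeper ancestor, where any
        # of the three possible pairs is equally likely to coalesce first.
        first_pair = rng.choice(["AB", "AC", "BC"])
        if first_pair != "AB":
            discordant += 1
    return discordant / n_genes


if __name__ == "__main__":
    for branch in (0.1, 0.5, 1.0, 3.0):
        print(branch, simulate_discordance(branch))
```

In this sketch, shorter internal branches leave less time for the A and B lineages to coalesce, so a larger share of simulated genes ends up with a tree that conflicts with the species tree.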

Conditions That Promote Sorting Issues

One significant factor is rapid speciation, which happens when multiple species diverge from a common ancestor in quick succession. When these branching events occur closely together in time, there is insufficient time for ancestral gene variants to become “fixed” within each new lineage. Ancestral polymorphisms thus persist across the short intervals between speciation events, leading to a higher chance of a gene’s history not matching the species’ overall branching pattern.

A large ancestral population size also promotes incomplete lineage sorting. Larger populations tend to maintain a greater diversity of alleles for a longer duration, as genetic drift has a less pronounced effect on allele frequencies. This increased genetic diversity means there is a higher probability that multiple ancestral variants will persist through a speciation event and be inherited by different descendant species. The combination of rapid speciation and a large ancestral gene pool creates conditions where the random sorting of alleles can easily result in discrepancies between gene trees and the species tree.
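The joint effect of these two conditions can be sketched with a standard result from coalescent theory for three species: the chance that a gene tree disagrees with the species tree is (2/3)·exp(-t / 2N), where t is the number of generations between the two speciation events and N is the effective size of the ancestral population. The snippet below is an illustration under those simplifying assumptions (diploid population, constant size, no gene flow); the function name and the example numbers are hypothetical.

```python
import math


def discordance_probability(generations_between_splits, effective_pop_size):
    """Probability that a three-species gene tree conflicts with the species tree.

    Assumes a diploid, constant-size ancestral population and no gene flow.
    """
    # Length of the internal branch in coalescent units.
    t = generations_between_splits / (2 * effective_pop_size)
    return (2.0 / 3.0) * math.exp(-t)


if __name__ == "__main__":
    # Hypothetical values chosen only to show the trend.
    for gens, n_e in [(200_000, 10_000), (20_000, 10_000), (20_000, 100_000)]:
        p = discordance_probability(gens, n_e)
        print(f"t={gens} generations, N={n_e}: P(discordant) = {p:.3f}")
```

Shrinking the interval between splits or enlarging the ancestral population both reduce t / 2N, which pushes the probability toward its upper limit of 2/3.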

Resolving Evolutionary Disagreements

Scientists observe gene tree-species tree discordance when analyzing genomic data, particularly in groups that experienced rapid diversification. A well-known example comes from the relationship among humans, chimpanzees, and gorillas.

While the overall species tree indicates that humans and chimpanzees are more closely related to each other than either is to gorillas, genetic studies reveal a surprising pattern. A substantial portion of the human genome, estimated at around 15%, groups humans with gorillas rather than with chimpanzees. Similarly, about 15% of the genome groups chimpanzees with gorillas rather than with humans. This does not alter the established species relationship but rather reflects the inheritance of ancestral polymorphisms from a large, genetically diverse common ancestor of all three lineages. By analyzing these conflicting gene trees across the entire genome, scientists can identify regions affected by incomplete lineage sorting and account for this phenomenon when reconstructing accurate evolutionary histories.

Differentiating from Gene Flow

Understanding incomplete lineage sorting also involves distinguishing it from another process that can cause similar patterns of genetic discordance: gene flow, often observed as hybridization and introgression. Gene flow involves the transfer of genetic material between species after they have already diverged and become distinct entities. This can occur when individuals from two different species interbreed and produce fertile offspring, leading to the exchange of genes.

In contrast, incomplete lineage sorting is rooted in the random inheritance of genetic variation that was already present in a common ancestral population before the species split. While both processes can result in a gene from one species appearing more closely related to a gene in another species than expected, they operate at different times in evolutionary history. Scientists use statistical methods and analyze genomic signatures to differentiate between these two phenomena. For instance, introgression often leaves distinct patterns of gene sharing that can be geographically localized or show specific asymmetries in gene transfer, whereas incomplete lineage sorting reflects the random sorting of ancient variation across the entire genome, so the two possible conflicting groupings are expected to appear at roughly equal frequencies.
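One widely used way to test for such asymmetry is the ABBA-BABA test, also called Patterson's D statistic. The sketch below is a simplified illustration, not a full implementation: it assumes four aligned sequences of equal length (two sister species P1 and P2, a third species P3, and an outgroup that defines the ancestral allele), and the function names and toy sequences are hypothetical. Real analyses also need a significance test, typically a block jackknife, which is omitted here.

```python
def count_abba_baba(p1, p2, p3, outgroup):
    """Count ABBA and BABA site patterns across four aligned sequences."""
    abba = baba = 0
    for a, b, c, o in zip(p1, p2, p3, outgroup):
        # ABBA: P2 and P3 share a derived allele, P1 keeps the ancestral one.
        if a == o and b == c and b != o:
            abba += 1
        # BABA: P1 and P3 share a derived allele, P2 keeps the ancestral one.
        elif b == o and a == c and a != o:
            baba += 1
    return abba, baba


def d_statistic(abba, baba):
    """Patterson's D: values near 0 fit ILS alone, values far from 0 suggest gene flow."""
    total = abba + baba
    if total == 0:
        raise ValueError("no ABBA or BABA sites to compare")
    return (abba - baba) / total


if __name__ == "__main__":
    # Toy alignment, invented for illustration only.
    abba, baba = count_abba_baba("ACAGT", "GTACT", "GCACG", "ATAGT")
    print(abba, baba, d_statistic(abba, baba))
```

Under incomplete lineage sorting alone, ABBA and BABA sites should be about equally common, so D stays close to zero; a consistent excess of one pattern points toward gene flow between P3 and one of the sister species.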
