The theory of evolution describes how populations of organisms change over time, leading to the diversity of life through descent from a common ancestor. This concept, originally built on observations of anatomy and the fossil record, finds its most powerful validation in the molecular structure of deoxyribonucleic acid (DNA). DNA serves as the biological blueprint, storing the instructions necessary for the development, function, and reproduction of all known life forms. Molecular biology transforms the theory of evolution into a measurable, predictable science by providing a direct look at life’s underlying code.
The Universal Genetic Code
The most profound molecular evidence for a single origin of life is the near-perfect uniformity of the genetic code across all organisms. Virtually every living thing—from bacteria to plants and animals—uses the same four nucleotide bases: Adenine (A), Thymine (T), Cytosine (C), and Guanine (G). These bases are read in three-letter sequences, known as codons, to specify one of the twenty common amino acids used to build proteins.
For instance, the codon “TGG” codes for the amino acid tryptophan in a human, a pine tree, and a single-celled alga. This shared instruction manual points to a common ancestry for all life on Earth, because the same code is highly unlikely to have evolved independently many times. The few known exceptions, such as the slightly altered code used in mitochondria, are minor variants of the standard code rather than separate inventions. The universality of the code indicates that the mechanism for translating DNA into protein was established in the Last Universal Common Ancestor (LUCA) before life diversified. The fact that a bacterium can read a human gene and produce the corresponding human protein, as in the bacterial production of human insulin, demonstrates that the fundamental language of life is shared.
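To make the idea of a shared lookup table concrete, here is a minimal Python sketch of codon-by-codon translation using the standard genetic code. The names `CODON_TABLE` and `translate` are illustrative choices, not part of any particular library, and the example sequence is a toy string rather than a real gene.

```python
from itertools import product

# Standard genetic code, written in the conventional T, C, A, G ordering.
# Each letter is a one-letter amino acid symbol; "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Pair every three-base combination with its amino acid (64 codons in total).
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA coding sequence codon by codon, stopping at a stop codon."""
    dna = dna.upper()
    protein = []
    # Ignore any trailing bases that do not form a complete codon.
    for i in range(0, len(dna) - len(dna) % 3, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


print(CODON_TABLE["TGG"])      # W, the one-letter symbol for tryptophan
print(translate("ATGTGGTAA"))  # MW (methionine, tryptophan, then stop)
```

Because the table is the same in nearly all organisms, the same function would apply to a coding sequence from a human, a pine tree, or an alga.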
Shared Genomic Features Among Species
Comparing DNA sequences between different species provides a molecular family tree that mirrors relationships predicted by fossil evidence and comparative anatomy. The principle is simple: the more closely related two species are, the fewer differences exist in their genomes. This comparative genomics approach turns evolutionary distance into something that can be counted, base by base.
For example, aligned human and chimpanzee DNA sequences differ at about 1.2 percent of single-nucleotide positions, which corresponds to roughly 98.8 percent identity at the sites that can be compared directly; counting inserted and deleted stretches of DNA as well raises the total difference. This genetic overlap is consistent with the estimated divergence time of six to seven million years ago. Comparisons also involve specific genes that maintain similar functions across vast evolutionary distances, such as the Pax6 gene, which controls eye development in organisms from humans to fruit flies. The similarity in the DNA sequence of this gene across diverse species shows its deep ancestral origin.
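The kind of comparison described above can be sketched in a few lines of Python. This is a simplified illustration that assumes the two sequences have already been aligned to the same length, with "-" marking gaps; the function name `percent_identity` and the toy sequences are hypothetical and not taken from any real genome.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of aligned, gap-free positions at which two sequences match."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for base_a, base_b in zip(seq_a.upper(), seq_b.upper()):
        # Skip columns where either sequence has an alignment gap.
        if base_a == "-" or base_b == "-":
            continue
        compared += 1
        if base_a == base_b:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free positions to compare")
    return 100.0 * matches / compared


# Toy example: ten aligned positions, one mismatch at the last position.
print(percent_identity("ACGTACGTAC", "ACGTACGTAT"))  # 90.0
```

Note that skipping gap columns is itself a choice: a figure that counts only substitutions will be higher than one that also counts insertions and deletions.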
Non-coding, non-functional DNA elements, often called “molecular fossils,” offer powerful evidence of shared ancestry. Pseudogenes are former genes inactivated by mutation that remain in the genome and are passed down. If two species inherited the same non-functional pseudogene from a common ancestor, it appears in the exact same location in both genomes.
Endogenous retroviruses (ERVs) are another molecular fossil, representing ancient viral infections that integrated their genetic material into the germline cells of an ancestor. Humans and chimpanzees share the vast majority of their ERVs, with these remnants appearing at identical chromosomal locations. The probability that two separate lineages independently acquired thousands of viral insertions at exactly the same chromosomal positions is negligible.
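A rough back-of-the-envelope calculation shows why independent acquisition is so implausible. The sketch below assumes, purely for illustration, that every site in the genome is equally likely to receive an insertion and that insertions are independent; real retroviruses have some site preferences, so this is an idealization and not a measured probability. The function name and the example numbers (about three billion possible sites, ten insertions) are assumptions.

```python
import math


def log10_chance_of_matching_sites(num_sites, num_insertions):
    """log10 of the probability that each of `num_insertions` independent
    insertions lands on one pre-specified site out of `num_sites` equally
    likely sites, i.e. log10((1 / num_sites) ** num_insertions)."""
    if num_sites < 1 or num_insertions < 0:
        raise ValueError("num_sites must be >= 1 and num_insertions >= 0")
    # Work in log space so the result does not underflow to zero.
    return -num_insertions * math.log10(num_sites)


# Assumed values: roughly 3 billion sites, and only 10 matching insertions.
exponent = log10_chance_of_matching_sites(3e9, 10)
print(f"probability is about 10^{exponent:.0f}")  # about 10^-95
```

Even with only ten insertions under these assumptions the chance is vanishingly small, and the shared insertions between humans and chimpanzees number in the thousands. Inheritance from a common ancestor, by contrast, explains the matching positions without any coincidence.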
Mutation and Genetic Variation
DNA is constantly subject to change, providing the raw material for evolutionary processes. Random mutations, which arise from errors during DNA replication and from unrepaired DNA damage, are the ultimate source of all new genetic variation. They range from point mutations (single nucleotide substitutions) to larger changes such as insertions and deletions.
While most mutations are neutral or harmful, a small fraction are beneficial, giving the organism an advantage in survival or reproduction in its environment. An organism carrying a beneficial mutation is more likely to survive and reproduce, so the altered DNA sequence is passed to more offspring and becomes more common in the population over generations. This process is natural selection. The accumulation of these advantageous changes drives adaptation and the formation of new traits.
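How a small advantage compounds over generations can be illustrated with a deliberately simple model. The sketch below assumes a haploid population large enough that chance fluctuations can be ignored, in which carriers of the beneficial variant leave (1 + s) offspring for every 1 left by non-carriers. The model, the function name, and the parameter values are assumptions chosen for illustration; real populations are also affected by genetic drift, which this sketch leaves out.

```python
def allele_frequency_trajectory(p0, s, generations):
    """Frequency of a beneficial variant in each generation under a simple
    deterministic haploid selection model.

    p0          -- starting frequency of the variant (between 0 and 1)
    s           -- selective advantage of carriers (0.05 means 5 percent)
    generations -- number of generations to simulate
    """
    frequencies = [p0]
    p = p0
    for _ in range(generations):
        # Carriers contribute p * (1 + s), non-carriers contribute (1 - p);
        # dividing by the total rescales the result to a frequency.
        p = p * (1 + s) / (1 + p * s)
        frequencies.append(p)
    return frequencies


# Assumed example: a variant starting at 1 percent with a 5 percent advantage.
trajectory = allele_frequency_trajectory(p0=0.01, s=0.05, generations=200)
for generation in (0, 50, 100, 150, 200):
    print(generation, round(trajectory[generation], 3))
```

Under these assumptions the variant goes from rare to nearly universal within a couple of hundred generations, even though its advantage in any single generation is small.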
Gene duplication is an important type of mutation that fosters the evolution of complexity. An error during replication or recombination can copy an entire gene, leaving the organism with two identical copies. The original copy continues to perform its existing function. The redundant second copy is largely released from selective pressure, allowing it to accumulate mutations without immediately harming the organism. In a process called neofunctionalization, the duplicate copy can acquire a novel sequence and potentially a completely new function, leading to the creation of new gene families and increasing biological complexity.
Measuring Evolutionary Time
The accumulation of mutations in the genome is used to estimate the time elapsed since two species shared a common ancestor, a technique known as the molecular clock. This concept relies on the observation that neutral mutations, those that do not affect survival or reproduction, accumulate in DNA at a relatively consistent rate over long periods. As species diverge from a shared ancestor, their DNA sequences drift apart due to these steady changes.
Scientists compare the number of differences in a specific, conserved sequence (such as a ribosomal RNA gene or a region of mitochondrial DNA) between two species. Knowing the average rate at which that region accumulates changes, researchers calculate the time of divergence. Because both lineages accumulate changes independently after they split, a greater number of sequence differences indicates a longer period of independent evolution. This molecular dating often aligns closely with timelines derived from the fossil record and geological methods, providing strong corroboration for evolutionary history. The molecular clock also allows divergence times to be estimated for groups where the fossil record is sparse.
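The arithmetic behind a molecular clock estimate can be sketched as follows. If d is the number of substitutions per site separating two species and r is the substitution rate per site per year along a single lineage, then the time since divergence is t = d / (2r); the factor of two reflects that both lineages have been changing since the split. The sketch also applies the Jukes-Cantor correction, one common and simple way to account for sites that have changed more than once. The function names and the rate of 1e-9 substitutions per site per year are assumptions used only for illustration; real studies calibrate the rate for the specific region and group being studied.

```python
import math


def jukes_cantor_distance(p):
    """Convert the observed proportion of differing sites `p` into an
    estimated number of substitutions per site (Jukes-Cantor model)."""
    if not 0 <= p < 0.75:
        raise ValueError("p must be at least 0 and less than 0.75")
    return -0.75 * math.log(1 - 4 * p / 3)


def divergence_time_years(distance, rate_per_site_per_year):
    """Time since two lineages split: t = d / (2 * r)."""
    if rate_per_site_per_year <= 0:
        raise ValueError("rate must be positive")
    return distance / (2 * rate_per_site_per_year)


# Illustration: the roughly 1.2 percent human-chimpanzee difference quoted
# earlier, combined with an ASSUMED rate of 1e-9 substitutions/site/year.
d = jukes_cantor_distance(0.012)
t = divergence_time_years(d, 1e-9)
print(f"about {t / 1e6:.1f} million years")
```

With these assumed inputs the estimate comes out at roughly six million years, in the same range as the divergence time mentioned earlier. The result scales directly with the assumed rate, which is why calibration matters.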