Which Bases Are Found in a Strand of DNA?

A single strand of DNA contains four nitrogenous bases: adenine (A), thymine (T), cytosine (C), and guanine (G). These four molecules, arranged in varying sequences along the sugar-phosphate backbone, encode the genetic information of every known living organism. One copy of the human genome alone strings together roughly 3 billion of these bases per strand.

The Four Bases and Their Two Families

The four DNA bases fall into two chemical families based on their ring structure. Adenine and guanine are purines, built from a double-ring structure (a six-membered ring fused to a five-membered ring). Cytosine and thymine are pyrimidines, which have only a single six-membered ring. This size difference matters: when the two strands of DNA pair up, a larger purine always sits across from a smaller pyrimidine, keeping the double helix a uniform width.
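As a small illustration, the two families can be written down as sets. This is only a sketch in Python; the names PURINES, PYRIMIDINES, and base_family are made up for this example and do not come from any library.

```python
# Families as described in the paragraph above (illustrative names).
PURINES = {"A", "G"}       # double-ring bases
PYRIMIDINES = {"C", "T"}   # single-ring bases


def base_family(base: str) -> str:
    """Return 'purine' or 'pyrimidine' for a single DNA base letter."""
    base = base.upper()
    if base in PURINES:
        return "purine"
    if base in PYRIMIDINES:
        return "pyrimidine"
    raise ValueError(f"not a standard DNA base: {base!r}")


for letter in "ATCG":
    print(letter, base_family(letter))
```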

How the Bases Pair Up

The bases on one strand don’t connect randomly to the bases on the opposite strand. Adenine always pairs with thymine, and cytosine always pairs with guanine. This is called complementary base pairing. So if one strand reads 5′-AATTGGCC-3′, the opposite strand must read 3′-TTAACCGG-5′.
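The pairing rule amounts to a four-entry lookup table. The Python sketch below assumes the input contains only the letters A, T, C, and G; the names COMPLEMENT, complement_strand, and reverse_complement are illustrative, not part of any library.

```python
# Pairing rule from the text: A pairs with T, C pairs with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement_strand(strand: str) -> str:
    """Return the partner bases in the same left-to-right order.

    If the input is written 5'->3', the result reads 3'->5'.
    """
    return "".join(COMPLEMENT[base] for base in strand.upper())


def reverse_complement(strand: str) -> str:
    """Return the partner strand rewritten in the usual 5'->3' direction."""
    return complement_strand(strand)[::-1]


print(complement_strand("AATTGGCC"))   # TTAACCGG
print(reverse_complement("AATTGGCC"))  # GGCCAATT
```

The first call reproduces the example from the text. The second prints the same partner strand read from its own 5′ end, which is how sequences are conventionally written.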

These pairs are held together by hydrogen bonds, a type of weak chemical attraction. An A-T pair is held by two hydrogen bonds, while a G-C pair is held by three. That extra bond makes G-C pairs more stable, so DNA regions rich in G-C pairs require more energy (higher temperatures) to pull apart than regions loaded with A-T pairs.
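The bond counts above make it easy to tally hydrogen bonds for a short duplex. The following Python sketch uses illustrative names (H_BONDS, hydrogen_bonds, gc_fraction) and assumes the strand contains only A, T, C, and G. It is a crude tally, not a melting-temperature calculation, since real stability depends on more than hydrogen bond counts.

```python
# Hydrogen bonds per base pair, keyed by the base on one strand:
# A or T means an A-T pair (2 bonds); G or C means a G-C pair (3 bonds).
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}


def hydrogen_bonds(strand: str) -> int:
    """Total hydrogen bonds between a strand and its complement."""
    return sum(H_BONDS[base] for base in strand.upper())


def gc_fraction(strand: str) -> float:
    """Fraction of positions that are G or C."""
    s = strand.upper()
    return (s.count("G") + s.count("C")) / len(s)


print(hydrogen_bonds("AAAA"))      # 8
print(hydrogen_bonds("GGGG"))      # 12
print(hydrogen_bonds("AATTGGCC"))  # 20
print(gc_fraction("AATTGGCC"))     # 0.5
```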

In 1950, biochemist Erwin Chargaff published data showing that in any organism’s double-stranded DNA, the amount of adenine roughly equals the amount of thymine, and the amount of guanine roughly equals the amount of cytosine. These ratios, now called Chargaff’s rules, were a crucial clue that helped Watson and Crick work out the double-helix structure three years later.
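Chargaff’s ratios follow directly from complementary pairing, and a short Python sketch can show why. The function name duplex_base_counts is hypothetical, and the input is assumed to be a single strand made only of A, T, C, and G; the code builds the partner strand and counts bases across both.

```python
from collections import Counter

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def duplex_base_counts(strand: str) -> Counter:
    """Count each base across a strand and its complementary partner."""
    strand = strand.upper()
    partner = "".join(COMPLEMENT[base] for base in strand)
    return Counter(strand + partner)


counts = duplex_base_counts("AAAAGC")
print(counts["A"], counts["T"])  # 4 4
print(counts["G"], counts["C"])  # 2 2
```

Every A on one strand forces a T on the other, and likewise for G and C, so the totals match no matter how lopsided the single strand is.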

How DNA Bases Differ From RNA Bases

RNA uses three of the same bases as DNA: adenine, cytosine, and guanine. The difference is the fourth base. Where DNA has thymine, RNA uses uracil (U). Structurally, the swap is minor: uracil is identical to thymine except that it lacks a small chemical group (a methyl group) on its ring. As a rule of thumb, if you see thymine in a sequence, you’re looking at DNA; if you see uracil, it’s RNA.

The sugar in the backbone differs too. RNA uses ribose, while DNA uses deoxyribose, which lacks the oxygen-containing hydroxyl group found on the 2′ carbon of ribose. These two differences, thymine versus uracil and deoxyribose versus ribose, are the defining chemical distinctions between the two nucleic acids.
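When sequences are written as letters, the base difference is a one-letter substitution. The Python sketch below uses hypothetical helper names (dna_to_rna_letters, guess_nucleic_acid) and only rewrites or inspects letters. It does not model how a cell actually makes RNA, and the guess is a heuristic that cannot decide a sequence containing neither T nor U.

```python
def dna_to_rna_letters(dna: str) -> str:
    """Rewrite a DNA sequence with RNA's alphabet by swapping T for U."""
    return dna.upper().replace("T", "U")


def guess_nucleic_acid(seq: str) -> str:
    """Guess 'DNA' or 'RNA' from the letters present, else 'ambiguous'."""
    letters = set(seq.upper())
    has_t = "T" in letters
    has_u = "U" in letters
    if has_t and not has_u:
        return "DNA"
    if has_u and not has_t:
        return "RNA"
    return "ambiguous"


print(dna_to_rna_letters("AATTGGCC"))  # AAUUGGCC
print(guess_nucleic_acid("AAUUGGCC"))  # RNA
print(guess_nucleic_acid("GGCC"))      # ambiguous
```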

Beyond the Standard Four

While textbooks list four bases, cells can chemically modify them after DNA is built. The most common modification is methylation, where a small chemical tag (a methyl group) gets added to cytosine, creating what’s sometimes called the “fifth base” of DNA (5-methylcytosine). More than 4% of all cytosines in the human genome carry this tag, and over 80% of cytosines that are immediately followed by a guanine on the same strand (so-called CpG sites) are methylated.

These modifications don’t change the genetic code itself. Instead, they act as a dimmer switch for gene activity. A stretch of heavily methylated DNA typically has its genes turned down or off, while unmethylated regions tend to be actively read. Adenine can also be modified in a similar way, though this is less common in humans. These chemical tweaks to the standard four bases are a central part of epigenetics, the study of how gene activity changes without altering the underlying DNA sequence.
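A cytosine immediately followed by a guanine on the same strand is the context where most of this methylation occurs, and such sites are easy to locate in a written sequence. The Python sketch below uses illustrative function names and a made-up example sequence. It finds candidate sites only: a sequence of letters carries no information about which cytosines are actually methylated.

```python
def cpg_positions(seq: str) -> list[int]:
    """Return 0-based positions of each C that is directly followed by G."""
    s = seq.upper()
    return [i for i in range(len(s) - 1) if s[i] == "C" and s[i + 1] == "G"]


def fraction_of_c_in_cpg(seq: str) -> float:
    """Fraction of all cytosines in the sequence that sit in a CG context."""
    s = seq.upper()
    total_c = s.count("C")
    if total_c == 0:
        return 0.0
    return len(cpg_positions(s)) / total_c


example = "ACGTCGCC"  # made-up sequence for illustration
print(cpg_positions(example))         # [1, 4]
print(fraction_of_c_in_cpg(example))  # 0.5
```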

What the Base Sequence Actually Encodes

The order of A, T, C, and G along a DNA strand is what carries meaning. Cells read these bases in groups of three (called codons), with each triplet specifying a particular amino acid during protein construction. Since there are four possible bases at each position and three positions per codon, the system can produce 4 × 4 × 4 = 64 different combinations, more than enough to code for the 20 amino acids that make up proteins. Most amino acids are therefore specified by more than one codon, and three codons act as stop signals instead.
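The count of 64 can be checked by enumerating every triplet. The Python sketch below also splits a sequence into consecutive codons starting at its first base; the helper name split_into_codons and the example sequence are made up for illustration, and leftover bases at the end that do not fill a triplet are dropped.

```python
from itertools import product

BASES = "ATCG"

# Every ordered triplet of the four bases: 4 * 4 * 4 of them.
ALL_CODONS = ["".join(triplet) for triplet in product(BASES, repeat=3)]
print(len(ALL_CODONS))  # 64


def split_into_codons(seq: str) -> list[str]:
    """Split a sequence into consecutive triplets, ignoring any trailing remainder."""
    s = seq.upper()
    usable = len(s) - len(s) % 3
    return [s[i:i + 3] for i in range(0, usable, 3)]


print(split_into_codons("ATGGCCTAAGG"))  # ['ATG', 'GCC', 'TAA']
```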

The complete human genome contains about 3 billion base pairs per copy, packed into 23 chromosomes. The finished human reference genome, completed in 2022 by the Telomere-to-Telomere Consortium, finally mapped all of this sequence without gaps, filling in roughly 200 million bases that earlier efforts had left unresolved. A full diploid genome (the two copies you carry, one from each parent) spans about 6 billion base pairs total, all built from the same four molecular letters.