What Defines an Amino Acid as Canonical?

Amino acids are organic compounds that serve as the primary building blocks of proteins. Each one consists of a central (alpha) carbon atom bonded to an amino group, a carboxyl group, a hydrogen atom, and a distinctive side chain, or R-group. While more than 500 amino acids occur in nature, a set of 22 is designated as “canonical” because these are directly encoded by the genetic code and incorporated into proteins during translation. This select group forms the alphabet that cells use to construct the vast range of protein structures responsible for countless biological functions.

Genetic Encoding of Amino Acids

The canonical status of an amino acid is determined by an organism’s genetic code, the set of rules by which cells convert nucleotide sequences into amino acid sequences. Genetic information stored in DNA is first transcribed into a messenger RNA (mRNA) molecule. During a process called translation, cellular machinery then reads the sequence of nucleotide bases on the mRNA to build a protein.

The mRNA sequence is read in groups of three bases, called codons. With four different nucleotide bases (adenine, guanine, cytosine, and uracil in RNA), there are 4 × 4 × 4 = 64 possible codons. In the standard genetic code, 61 of these specify one of the 20 standard amino acids, while the remaining three (UAA, UAG, and UGA) are reserved as “stop” signals that mark the end of a protein chain.
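The codon arithmetic above can be checked with a short Python sketch. It builds the standard genetic code from a compact 64-letter string in U, C, A, G order, where `*` marks a stop codon. The variable names are illustrative choices, not part of any library.

```python
from itertools import product

BASES = "UCAG"  # the four RNA bases, in the conventional table order

# Standard genetic code, one letter per codon in UCAG order; '*' marks a stop.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)

# Enumerate every three-base combination and pair it with its amino acid.
codons = ["".join(triplet) for triplet in product(BASES, repeat=3)]
CODON_TABLE = dict(zip(codons, AMINO_ACIDS))

stop_codons = sorted(c for c, aa in CODON_TABLE.items() if aa == "*")
sense_codons = [c for c, aa in CODON_TABLE.items() if aa != "*"]
standard_amino_acids = set(CODON_TABLE.values()) - {"*"}

print(len(codons))                # 64
print(stop_codons)                # ['UAA', 'UAG', 'UGA']
print(len(sense_codons))          # 61
print(len(standard_amino_acids))  # 20
```

Because 61 codons map onto only 20 amino acids, most amino acids are specified by more than one codon.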

Translation occurs within ribosomes. Transfer RNA (tRNA) acts as the link between the mRNA codon and the amino acid it specifies. Each tRNA molecule has a three-base anticodon that recognizes a specific mRNA codon and carries the corresponding amino acid to the ribosome to be assembled.
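As a rough illustration of codon-by-codon reading, the following Python sketch translates an mRNA string into one-letter amino acid codes. It is a deliberate simplification: it assumes translation starts at the first AUG and ends at the first in-frame stop codon, and it ignores tRNA charging, wobble pairing, and everything else the ribosome actually does. The function name `translate` and the example sequence are made up for this sketch.

```python
from itertools import product

# Standard genetic code in UCAG order; '*' marks a stop codon.
CODON_TABLE = dict(zip(
    ("".join(t) for t in product("UCAG", repeat=3)),
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG",
))


def translate(mrna: str) -> str:
    """Translate from the first AUG to the first in-frame stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, so nothing is translated
    protein = []
    # Step through the message three bases at a time, like a reading frame.
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break  # stop codon: release the chain
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical message: AUG GCU UGG AAA UAA, with two leading bases.
print(translate("GGAUGGCUUGGAAAUAAGC"))  # MAWK
```

The dictionary lookup plays the part of the tRNA pool here: each codon is matched to exactly one amino acid.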

The Standard Set of Canonical Amino Acids

The 20 standard canonical amino acids are grouped based on the chemical properties of their side chains. These properties, which determine how a protein folds into its shape, are classified as nonpolar, polar, acidic, or basic. This chemical diversity allows for the variety of protein structures and functions found in nature.

Nonpolar, or hydrophobic, amino acids have side chains repelled by water. In water-soluble proteins, these amino acids are found buried in the protein’s core, which helps stabilize its structure. They include:

  • Glycine
  • Alanine
  • Valine
  • Leucine
  • Isoleucine
  • Proline
  • Phenylalanine
  • Methionine
  • Tryptophan

Polar, or hydrophilic, amino acids have side chains that interact with water. Their ability to form hydrogen bonds makes them common on the exterior surfaces of proteins. This group includes:

  • Serine
  • Threonine
  • Asparagine
  • Glutamine
  • Cysteine
  • Tyrosine

The remaining amino acids have side chains that are electrically charged at physiological pH. Aspartic acid and glutamic acid have acidic side chains that are negatively charged, while lysine, arginine, and histidine have basic side chains that can carry a positive charge. Histidine is only partly protonated at physiological pH because its side chain has a pKa close to 6. These charged amino acids are important for forming salt bridges within proteins and are often found in the active sites of enzymes.
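The grouping described in this section can be written down as a small lookup table, sketched here in Python. Tyrosine is placed in the polar group, which is one common convention; some textbooks group it with the nonpolar aromatic amino acids instead. The names `SIDE_CHAIN_CLASSES` and `classify` are illustrative.

```python
# Side-chain classes as described in the text.
# Assumption: tyrosine is counted as polar (conventions differ).
SIDE_CHAIN_CLASSES = {
    "nonpolar": [
        "Glycine", "Alanine", "Valine", "Leucine", "Isoleucine",
        "Proline", "Phenylalanine", "Methionine", "Tryptophan",
    ],
    "polar": [
        "Serine", "Threonine", "Asparagine", "Glutamine",
        "Cysteine", "Tyrosine",
    ],
    "acidic": ["Aspartic acid", "Glutamic acid"],
    "basic": ["Lysine", "Arginine", "Histidine"],
}


def classify(amino_acid: str) -> str:
    """Return the side-chain class of a standard amino acid."""
    for side_chain_class, members in SIDE_CHAIN_CLASSES.items():
        if amino_acid in members:
            return side_chain_class
    raise ValueError(f"not a standard amino acid: {amino_acid}")


# The four classes together should account for all 20 standard amino acids.
total = sum(len(members) for members in SIDE_CHAIN_CLASSES.values())
print(total)                  # 20
print(classify("Histidine"))  # basic
```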

Nutritional Requirements for Amino Acids

From a human health perspective, the 20 canonical amino acids are categorized based on whether the body can produce them. This classification divides them into essential, non-essential, and conditionally essential amino acids, highlighting the dietary need for those our cells cannot manufacture.

Nine amino acids are essential, meaning they must be obtained from food. A diet lacking any of these will prevent the body from properly synthesizing proteins for growth and repair. The essential amino acids are:

  • Histidine
  • Isoleucine
  • Leucine
  • Lysine
  • Methionine
  • Phenylalanine
  • Threonine
  • Tryptophan
  • Valine

Non-essential amino acids can be synthesized by the human body. However, some are considered conditionally essential: during infancy, illness, or stress, the body may not produce enough of them, so they must be supplied by the diet. These include:

  • Arginine
  • Cysteine
  • Glutamine
  • Tyrosine

Exceptions to the Standard Set

While the standard set includes 20 amino acids, two more, selenocysteine and pyrrolysine, bring the total to 22, although neither is used by every organism. They are considered canonical because they are incorporated into proteins during translation. Their inclusion relies on specialized cellular machinery that reinterprets existing genetic signals.

Selenocysteine, the 21st amino acid, is found in organisms across all domains of life, including humans. Its incorporation requires a specific mRNA structure, the selenocysteine insertion sequence (SECIS) element, which directs the ribosome to insert it at a UGA codon that normally functions as a stop signal. Pyrrolysine, the 22nd amino acid, is found in some archaea and bacteria and is similarly encoded by UAG, which is normally a stop codon.

This direct encoding distinguishes them from non-canonical amino acids. Other molecules, such as gamma-aminobutyric acid (GABA) or ornithine, have biological functions but are not built into proteins by the ribosome.
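The idea of a stop codon being reinterpreted can be sketched in Python by giving a translation function an optional recoding map. This is a heavy simplification: the sketch recodes every occurrence of the chosen codon, whereas real cells decide case by case using signals in the mRNA. The function name `translate` and the `recode` parameter are hypothetical. The one-letter symbols U (selenocysteine) and O (pyrrolysine) are the standard ones.

```python
from itertools import product

# Standard genetic code in UCAG order; '*' marks a stop codon.
CODON_TABLE = dict(zip(
    ("".join(t) for t in product("UCAG", repeat=3)),
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG",
))


def translate(mrna, recode=None):
    """Translate an in-frame mRNA; `recode` overrides selected codons."""
    recode = recode or {}
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        # A recoded codon takes priority over its standard meaning.
        amino_acid = recode.get(codon, CODON_TABLE[codon])
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


mrna = "AUGUGAGCUUAA"  # hypothetical message: AUG UGA GCU UAA

print(translate(mrna))                       # M   (UGA read as stop)
print(translate(mrna, recode={"UGA": "U"}))  # MUA (UGA read as selenocysteine)
```

Passing `recode={"UAG": "O"}` would model pyrrolysine insertion in the same simplified way.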
