What Are OTUs? A Key Tool for Classifying Microbes

An Operational Taxonomic Unit, or OTU, is a group of microorganisms clustered together by genetic similarity, typically the similarity of a single marker gene. Grouping sequences this way lets researchers count and compare the different types of microbes in a sample without having to name each one. It serves as a working classification when traditional identification is not possible.

The Need for a New Classification Method

For many years, microbiologists studied bacteria by growing them in a laboratory, a method known as culturing. A significant problem is that most microorganisms found in natural environments cannot be grown under lab conditions.

This issue is known as “the great plate count anomaly.” An often-cited estimate is that, for environmental samples such as soil or seawater, only about 1% of the bacteria observed microscopically can be cultured. The result is a large discrepancy between the number of microbial cells seen under a microscope and the number of colonies that grow on a plate.

The remaining 99% are often referred to as “unculturable.” They fail to grow on plates not because they are dead; many are metabolically active in their natural habitats.

Some microbes have extremely specific growth requirements that are difficult to replicate in a lab, including unique nutrients, pH levels, or oxygen concentrations. Others may exist in a state of dormancy in nature or depend on chemical signals from other microbes within their community to grow. This large number of unculturable organisms necessitated the development of culture-independent techniques to study microbial diversity.

How OTUs Are Created

The creation of OTUs bypasses culturing by analyzing microbial DNA directly from an environmental sample. Scientists collect a sample, extract the total DNA, and focus on a specific marker gene that acts as a genetic barcode. For bacteria and archaea, this barcode is most commonly the 16S rRNA gene.

This gene is used because it contains both highly conserved regions, which are similar across most bacteria, and variable regions, which differ between species. The conserved regions give PCR primers a common place to bind in almost any bacterium, while the variable regions between them carry the differences that tell organisms apart. Scientists use a technique called Polymerase Chain Reaction (PCR) to make millions of copies of this gene, or more often of one or a few of its variable regions, from the sample DNA. These copies are then sequenced to read their nucleotide sequence, resulting in a large dataset of different 16S rRNA sequences representing the various microbes present.
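
To make the contrast between conserved and variable regions concrete, here is a minimal Python sketch that scores each column of a small alignment by how often its most common base appears. The four toy sequences and the function name column_conservation are invented for illustration; they are not real 16S rRNA data, and real analyses work on alignments of full reads.

```python
from collections import Counter


def column_conservation(aligned):
    """Return, for each alignment column, the fraction of sequences
    that share the most common base in that column."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to equal length")
    scores = []
    for column in zip(*aligned):
        most_common_count = Counter(column).most_common(1)[0][1]
        scores.append(most_common_count / len(aligned))
    return scores


# Hypothetical toy alignment: identical everywhere except column 4.
toy_alignment = [
    "ACGTACGGTA",
    "ACGTTCGGTA",
    "ACGTGCGGTA",
    "ACGTCCGGTA",
]

scores = column_conservation(toy_alignment)
variable_columns = [i for i, s in enumerate(scores) if s < 1.0]
print(variable_columns)
```

Columns with a score of 1.0 behave like a conserved region, while the one low-scoring column plays the role of a variable region that distinguishes the four sequences.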

The final step is clustering these sequences into OTUs. The sequences are computationally grouped based on their similarity to one another. The most common standard used in microbiology is a 97% similarity threshold. This means that sequences at least 97% identical, usually measured against a representative sequence chosen for each cluster, are binned together into a single OTU.
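
The clustering step can be sketched in a few lines of Python. This is a simplified greedy approach under stated assumptions, not the algorithm of any particular tool: the reads are assumed to be equal in length and already aligned, identity is just the fraction of matching positions, and the names identity, mutate, and greedy_cluster are hypothetical. Real clustering programs align sequences properly and usually process them in order of abundance.

```python
def identity(a, b):
    """Fraction of matching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("this sketch only compares equal-length sequences")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def mutate(seq, positions):
    """Return a copy of seq with a different base at each given position."""
    swap = {"A": "C", "C": "G", "G": "T", "T": "A"}
    bases = list(seq)
    for p in positions:
        bases[p] = swap[bases[p]]
    return "".join(bases)


def greedy_cluster(sequences, threshold=0.97):
    """Put each sequence in the first OTU whose centroid it matches at or
    above the threshold; otherwise start a new OTU with it as centroid."""
    centroids, clusters = [], []
    for seq in sequences:
        for i, centroid in enumerate(centroids):
            if identity(seq, centroid) >= threshold:
                clusters[i].append(seq)
                break
        else:  # no existing centroid was close enough
            centroids.append(seq)
            clusters.append([seq])
    return clusters


base = "ACGT" * 25                      # 100-nucleotide toy "read"
reads = [
    base,
    mutate(base, [10, 40]),             # 98% identical to base
    mutate(base, [5, 50, 90]),          # 97% identical to base
    mutate(base, [1, 20, 30, 60, 80]),  # 95% identical to base
]

otus = greedy_cluster(reads)
otu_sizes = [len(cluster) for cluster in otus]
print(otu_sizes)  # [3, 1]
```

The first three reads fall into one OTU because each is at least 97% identical to the first read, which acts as the centroid. The fourth read, at 95%, starts a second OTU.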

OTUs vs. Traditional Species

An OTU is a pragmatic proxy for a species, not a formal biological classification. The traditional system of naming species, known as Linnaean taxonomy, relies on a wide range of evidence, including an organism’s physical characteristics, metabolic capabilities, and, for sexually reproducing organisms, the ability to interbreed. An OTU, by contrast, is defined purely by a computational cutoff based on the sequence of a single gene.

The 97% similarity threshold is a widely used guideline intended to approximate the species level, but it is not a perfect match. A single OTU may sometimes encompass several very closely related but distinct species. For example, Escherichia coli and Shigella species have nearly identical 16S rRNA gene sequences and are often grouped into the same OTU, despite being classified in different genera with different clinical implications.

Conversely, a single recognized bacterial species might have enough variation in its 16S rRNA gene that its members could be split into multiple OTUs. The relationship between an OTU and a species is therefore an approximation. It provides a standardized unit for measuring microbial diversity based on a consistent genetic benchmark, even if that benchmark doesn’t perfectly align with traditional taxonomic ranks.

The Evolution to ASVs

In recent years, the field of microbiology has begun to shift from OTUs to a more precise method of classification: Amplicon Sequence Variants (ASVs). ASVs represent unique DNA sequences that have been corrected for errors introduced during the sequencing process. Unlike OTUs, ASVs are not grouped by a fixed similarity threshold; every unique sequence that survives error correction, down to a one-nucleotide difference, is treated as its own distinct variant.

This approach eliminates the 97% similarity cutoff. By resolving sequences at the single-nucleotide level, ASVs provide a much finer level of detail. This increased resolution allows researchers to distinguish between very closely related strains of bacteria that would be lumped together into one OTU, provided the strains differ somewhere within the sequenced region. This can be particularly important for tracking the spread of a specific pathogenic strain or understanding subtle variations within a microbial community.
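
The difference in resolution can be illustrated with a small Python sketch. It assumes the reads are already error-free, so it only counts exact sequences; it does not implement the error-correction (denoising) step that real ASV methods depend on. The two toy "strains" and all variable names are hypothetical.

```python
from collections import Counter


def identity(a, b):
    """Fraction of matching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("this sketch only compares equal-length sequences")
    return sum(x == y for x, y in zip(a, b)) / len(a)


# Two hypothetical strains that differ at a single position (index 50).
strain_a = "ACGT" * 25
strain_b = strain_a[:50] + "T" + strain_a[51:]   # 'G' replaced by 'T'

# Toy read set: 6 reads from strain A, 4 from strain B, assumed error-free.
reads = [strain_a] * 6 + [strain_b] * 4

# ASV-style view: every exact sequence is its own variant.
variant_counts = Counter(reads)

# OTU-style view: at a 97% cutoff the two strains are indistinguishable.
pair_identity = identity(strain_a, strain_b)
same_otu = pair_identity >= 0.97

print(len(variant_counts), pair_identity, same_otu)
```

The two strains are 99% identical, so a 97% cutoff would merge them into one OTU with ten reads, while exact-sequence counting keeps them apart as two variants with six and four reads.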

A major advantage of ASVs is their reproducibility and comparability across different studies. Since an ASV is an exact sequence, an ASV table from one study can be directly compared to another, as long as both studies sequenced the same region of the gene. OTUs that are clustered from scratch, on the other hand, depend on the clustering algorithm and the specific dataset used, making direct comparisons between studies challenging. If OTUs are like sorting cars into general categories like “blue trucks,” ASVs are like recording the unique Vehicle Identification Number (VIN) of every vehicle.
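
A small Python sketch can show why clustered OTUs depend on the data they were built from while exact sequences do not. It uses a simplified greedy clustering on equal-length toy sequences, which is an assumption made for illustration rather than the behavior of any specific tool, and the sequences, counts, and names such as count_otus are hypothetical.

```python
def identity(a, b):
    """Fraction of matching positions between two equal-length sequences."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def count_otus(sequences, threshold=0.97):
    """Greedy clustering: a sequence becomes a new centroid only if it
    matches no existing centroid at or above the threshold."""
    centroids = []
    for seq in sequences:
        if not any(identity(seq, c) >= threshold for c in centroids):
            centroids.append(seq)
    return len(centroids)


middle = "ACGT" * 25                       # 100-nucleotide toy sequence
left = "CG" + middle[2:]                   # differs from middle at positions 0-1
right = middle[:2] + "TA" + middle[4:]     # differs from middle at positions 2-3
# left and right are each 98% identical to middle, but only 96% to each other.

# The same three sequences give different OTU counts in a different order.
otus_order_one = count_otus([left, middle, right])
otus_order_two = count_otus([middle, left, right])

# Exact sequences can be matched across studies with a plain set intersection.
study_one = {left: 10, middle: 5}          # sequence -> read count
study_two = {middle: 7, right: 2}
shared_variants = set(study_one) & set(study_two)

print(otus_order_one, otus_order_two, len(shared_variants))
```

With these inputs the first ordering yields two OTUs and the second yields one, even though the sequences are identical, whereas the shared variant between the two toy studies is found by exact matching regardless of what else each study contains.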