A Comparison of Next-Generation Sequencing Platforms

Next-generation sequencing (NGS) technologies have reshaped biological research by enabling the rapid and cost-effective reading of DNA and RNA sequences. The ability to sequence millions of DNA fragments in parallel has accelerated discoveries in medicine, agriculture, and evolution. Because it provides a high-resolution view of the genetic code, NGS has become a widely used tool for investigating questions that range from rare genetic disorders to the human microbiome.

The Core Principle of Next-Generation Sequencing

The process for most NGS platforms begins with library preparation, where long strands of DNA or RNA are fragmented into shorter pieces. Special DNA sequences called adapters are attached to the ends of these fragments. These adapters act like handles, allowing the fragments to bind to the sequencing instrument’s surface for the next steps.
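The two library preparation steps can be sketched in a few lines of Python. This is a toy illustration, not a laboratory protocol: the adapter sequences and the helper names `fragment` and `attach_adapters` are made up for the example, and the input is cut at fixed intervals, whereas real fragmentation produces pieces of varying length.

```python
def fragment(sequence, fragment_length):
    """Cut a sequence into consecutive pieces of at most fragment_length bases."""
    return [sequence[i:i + fragment_length]
            for i in range(0, len(sequence), fragment_length)]


def attach_adapters(fragments, adapter_5, adapter_3):
    """Add a (hypothetical) adapter sequence to each end of every fragment."""
    return [adapter_5 + frag + adapter_3 for frag in fragments]


# Placeholder adapter sequences, chosen for illustration only.
ADAPTER_5 = "AATG"
ADAPTER_3 = "CTTA"

genome = "ACGTACGTTAGCCGATAGGCTA"
library = attach_adapters(fragment(genome, 8), ADAPTER_5, ADAPTER_3)

for molecule in library:
    print(molecule)
```

Every molecule in the resulting library starts and ends with the same known sequences, which is what lets the instrument capture any fragment without knowing what lies between the adapters.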

Library preparation is followed by amplification. The DNA fragments are loaded onto a surface, often a flow cell, and copied many times. This creates millions of distinct clusters, with each cluster containing identical copies of an original fragment. Amplification ensures that each fragment produces a signal strong enough to be detected by the instrument’s imaging systems.

The next step is sequencing, where the DNA bases adenine (A), cytosine (C), guanine (G), and thymine (T) are read. This is done through massively parallel sequencing, where millions of fragments are processed simultaneously. This parallel approach is a significant departure from older methods such as Sanger sequencing, which read one DNA fragment per reaction, and it is what gives NGS its high throughput.

Finally, the data undergoes analysis. Computers use software to piece the millions of short sequence reads back together, similar to reassembling a shredded book. The result is a comprehensive view of the original DNA or RNA sequence, which can be analyzed to identify genetic variations or measure gene activity.
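The shredded-book analogy can be made concrete with a small greedy overlap merger. The sketch below is a teaching example under simplifying assumptions (error-free reads, a single strand, no repeats); production assemblers rely on overlap or de Bruijn graphs, and many analyses align reads to a reference instead of assembling them. The function names are hypothetical.

```python
from itertools import permutations


def overlap(a, b, min_length=3):
    """Return the length of the longest suffix of a that is a prefix of b."""
    start = 0
    while True:
        start = a.find(b[:min_length], start)
        if start == -1:
            return 0
        if b.startswith(a[start:]):
            return len(a) - start
        start += 1


def greedy_assemble(reads, min_length=3):
    """Repeatedly merge the pair of reads with the longest overlap."""
    reads = list(reads)
    while len(reads) > 1:
        best_len, best_a, best_b = 0, None, None
        for a, b in permutations(reads, 2):
            olen = overlap(a, b, min_length)
            if olen > best_len:
                best_len, best_a, best_b = olen, a, b
        if best_len == 0:
            break  # no remaining overlaps: leave the pieces unmerged
        reads.remove(best_a)
        reads.remove(best_b)
        reads.append(best_a + best_b[best_len:])
    return reads


# Three overlapping reads taken from the sequence ATGCGTACCTGAAGT.
reads = ["CTGAAGT", "ATGCGTAC", "GTACCTGA"]
print(greedy_assemble(reads))  # ['ATGCGTACCTGAAGT']
```

The greedy strategy works here because each overlap is unambiguous. Repeated sequence breaks that assumption, which is one reason read length matters.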

Short-Read Sequencing Platforms

The dominant short-read technology is Sequencing by Synthesis (SBS), used by Illumina platforms. This approach uses fluorescently tagged nucleotides, the building blocks of DNA. In the classic four-color chemistry, each of the four bases (A, C, G, T) is labeled with a unique color, while some newer instruments encode the four bases with fewer dyes. Each nucleotide also carries a reversible terminator that temporarily halts DNA synthesis, allowing the instrument to read one base at a time.

The process occurs on a flow cell where DNA fragments have been amplified into dense clusters. In each sequencing cycle, fluorescently labeled nucleotides are washed over the flow cell, and a DNA polymerase adds the corresponding nucleotide to the growing DNA strand in each cluster. After a nucleotide is added, the process is paused, and a camera images the flow cell.

The color of the fluorescence in each cluster indicates which base was added. Once the image is captured, the fluorescent tag and terminator are removed, allowing the next cycle to begin. This method produces highly accurate data, but the individual DNA reads are relatively short, ranging from 50 to 300 bases.
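The cycle-by-cycle readout described above can be sketched as a simple base caller: for each cycle, pick the channel with the strongest signal. The intensity values and the channel order below are invented for illustration, and the sketch assumes the four-color scheme. Real base callers also correct for effects such as signal cross-talk between channels and assign a quality score to every call.

```python
# Hypothetical mapping of the four imaging channels to bases.
CHANNEL_TO_BASE = ("A", "C", "G", "T")


def call_bases(cycle_intensities):
    """For each cycle, call the base whose channel is brightest."""
    read = []
    for intensities in cycle_intensities:
        brightest = max(range(4), key=lambda channel: intensities[channel])
        read.append(CHANNEL_TO_BASE[brightest])
    return "".join(read)


# Made-up intensities for one cluster: one tuple of four channels per cycle.
cluster_signal = [
    (0.91, 0.05, 0.02, 0.02),  # cycle 1: A channel brightest
    (0.03, 0.04, 0.88, 0.05),  # cycle 2: G channel brightest
    (0.02, 0.02, 0.06, 0.90),  # cycle 3: T channel brightest
    (0.04, 0.85, 0.07, 0.04),  # cycle 4: C channel brightest
]

print(call_bases(cluster_signal))  # AGTC
```

Each additional cycle extends every read by one base, so the number of cycles the chemistry can sustain sets the read length.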

Long-Read Sequencing Platforms

Long-read sequencing platforms read much longer continuous stretches of DNA. Two prominent technologies are Pacific Biosciences’ (PacBio) Single-Molecule, Real-Time (SMRT) sequencing and Oxford Nanopore Technologies’ (ONT) nanopore sequencing. These methods routinely generate reads that are thousands to tens of thousands of bases long, and the longest nanopore reads have exceeded a million bases.

PacBio’s SMRT sequencing observes DNA synthesis in real time. Its SMRT Cell chip contains a dense array of microscopic wells, called zero-mode waveguides, numbering from hundreds of thousands to millions depending on the instrument generation. In each well, a single DNA polymerase enzyme works on a single DNA molecule. As the polymerase incorporates fluorescently labeled nucleotides, a detector records the light pulses emitted to identify each base, all without the amplification step used in short-read technologies.

Oxford Nanopore Technologies uses a different approach. A single strand of DNA is passed through an engineered protein pore, or nanopore, embedded in a membrane. As the bases move through the pore, they cause characteristic disruptions in an electrical current, and software decodes the measured signal to determine the DNA sequence.

The main advantage of these platforms is their ability to produce very long reads. This is valuable for assembling a genome from scratch or identifying large-scale structural changes in DNA. Although these platforms have historically had higher error rates and costs than short-read sequencing, continuous improvements are making them more accurate and cost-effective.

Choosing the Right Platform for the Research Question

The choice between short-read and long-read sequencing depends on the scientific question. Each technology suits different research applications, so the decision comes down to which platform yields the most useful data for a specific goal.

Short-read sequencing is effective for applications that involve counting molecules or identifying small genetic variations against an established reference genome. For example, researchers use short reads to quantify gene expression levels or to identify single nucleotide polymorphisms (SNPs). These small, single-base changes in DNA can be associated with diseases.
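As a rough illustration of SNP identification against a reference, the sketch below stacks already-aligned reads over the reference and reports positions where most reads agree on a different base. It assumes the alignment positions are given, that reads contain no insertions or deletions, and that every read fits within the reference. The thresholds `min_depth` and `min_fraction` are arbitrary illustrative values, and dedicated variant callers use far more careful statistical models.

```python
from collections import Counter


def call_snps(reference, aligned_reads, min_depth=3, min_fraction=0.8):
    """Report (position, reference_base, observed_base) for likely SNPs.

    aligned_reads is a list of (start_position, read_sequence) pairs,
    with 0-based start positions and no insertions or deletions.
    """
    pileup = [Counter() for _ in reference]
    for start, read in aligned_reads:
        for offset, base in enumerate(read):
            pileup[start + offset][base] += 1

    snps = []
    for position, counts in enumerate(pileup):
        depth = sum(counts.values())
        if depth < min_depth:
            continue  # too few reads to make a call
        base, count = counts.most_common(1)[0]
        if base != reference[position] and count / depth >= min_fraction:
            snps.append((position, reference[position], base))
    return snps


reference = "ACGTTGCA"
aligned_reads = [
    (0, "ACGATG"),
    (1, "CGATGC"),
    (2, "GATGCA"),
]

print(call_snps(reference, aligned_reads))  # [(3, 'T', 'A')]
```

All three reads carry an A where the reference has a T at position 3, so that position is reported. Positions covered by fewer than three reads are skipped.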

Long-read sequencing is the preferred choice when understanding the genome’s overall structure is the goal. These technologies are powerful for de novo genome assembly—sequencing a genome for the first time without a reference map. The long reads make it easier to piece together complex and repetitive regions of a genome.
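A small experiment shows why repetitive regions are hard for short reads. In the toy genome below, which is invented for the example, the same repeat appears twice. A short read that falls entirely inside the repeat matches many positions, while a longer read that spans the repeat and its unique flanks matches exactly one. The helper `placements` is a hypothetical name and uses exact string matching rather than a real aligner.

```python
def placements(genome, read):
    """Return every start position where read matches genome exactly."""
    positions = []
    start = genome.find(read)
    while start != -1:
        positions.append(start)
        start = genome.find(read, start + 1)
    return positions


REPEAT = "CACACACA"
genome = "GGTT" + REPEAT + "AGCT" + REPEAT + "TTAA"

short_read = "CACA"                  # lies entirely inside the repeat
long_read = "TT" + REPEAT + "AG"     # spans the repeat plus unique flanks

print(placements(genome, short_read))  # [4, 6, 8, 16, 18, 20]
print(placements(genome, long_read))   # [2]
```

The short read could have come from any of six locations, so an assembler cannot tell which copy of the repeat it belongs to. The long read has only one possible origin.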

Long-read sequencing is also well-suited for detecting large structural variations like inversions or deletions of DNA segments, which can be missed by short-read methods. Because such rearrangements are often implicated in genetic diseases and cancer, a researcher might choose long-read sequencing to get a complete picture of the genome’s architecture.