Oxford Nanopore Protein Sequencing: A New Frontier

The world of biological research is advancing with technology that reads the amino acid sequence of a single protein molecule directly. This field, centered on Oxford Nanopore’s protein sequencing platform, extends their established DNA and RNA sequencing capabilities. This innovation addresses the goals of proteomics, the large-scale study of proteins, which are the primary functional molecules in every living cell. For decades, proteomic methods have been indirect, but this new approach observes individual protein molecules, offering a direct window into the proteome that could reshape our understanding of biology.

## How Nanopore Sensing Works

The principle behind Oxford Nanopore’s technology relies on a nanoscale pore embedded within a synthetic, electrically resistant membrane. This membrane separates two compartments filled with an ionic solution. When a voltage is applied, a steady stream of ions flows through the pore, creating a measurable electrical current. This setup acts as a highly sensitive, single-molecule detector.

When a long polymer, such as an unfolded protein, is guided into this pore, it physically obstructs the opening. This blockage causes a characteristic disruption in the ion flow, which is recorded as a fluctuation in the electrical current. Each amino acid has a unique size and chemical property, causing a distinct change in the current. This stream of electrical fluctuations, called a “squiggle,” provides the raw data for decoding the molecule’s sequence.

The process can be visualized as pulling a single, complex thread through a very narrow eyelet. As different sections of the thread pass through—some thicker, some thinner, some with different textures—they alter how easily air can flow through the opening. By measuring these changes in airflow, one could theoretically reconstruct the properties of the thread along its length. In a similar way, the nanopore sensor “feels” the molecule as it moves through, translating physical structure into a digital signal.

## The Challenge of Sequencing Proteins

Applying nanopore sensing to proteins is more complex than sequencing DNA. The first challenge is the alphabet itself; proteins are built from 20 common amino acids, whereas DNA has only four bases. This increased complexity demands a sensor with much higher resolution to generate distinct electrical signals for each amino acid, especially since many have very similar sizes and charges.

A second challenge is that proteins do not exist as simple linear chains inside cells, instead folding into intricate three-dimensional structures too large to pass through a nanopore. Before sequencing can begin, these proteins must be completely unfolded, or denatured, into a straight polypeptide strand. This process adds a preparatory step and requires methods to keep the protein from refolding as it approaches the pore.

The movement of this unfolded chain through the pore must also be precisely controlled. Unlike DNA, which has a uniformly negative charge, proteins have a heterogeneous charge, making an electrical field an unreliable method for translocation. To solve this, researchers use a “motor protein” that grabs onto the polypeptide and ratchets it through the pore at a slow, measurable pace, allowing the sensor enough time to register the signal from each passing amino acid.

Finally, resolving the resulting signal is a complex task. The electrical squiggle generated by a protein is influenced by a group of about 20 amino acids occupying the pore at any given moment, making it difficult to pinpoint the signal of a single amino acid. Advanced machine learning algorithms are being developed to deconstruct this complex signal, creating a unique “fingerprint” for a protein sequence by analyzing the collective pattern of current changes.

## A Step-by-Step Workflow

For a researcher, the journey from a biological sample to a protein sequence follows a structured workflow:

Sample Preparation: Proteins of interest are isolated and purified from their complex native environment, such as blood or tissue. This step is designed to remove contaminants and concentrate the target proteins to ensure a clean signal during the sequencing run.
Library Preparation: In this multi-step process, the purified proteins are denatured to unfold them into linear polypeptide chains. Following denaturation, adapter molecules and a motor protein are chemically attached to the ends of the protein strands to guide them to the nanopore.
Sequencing: The prepared library is loaded onto a consumable flow cell containing the nanopore array and inserted into a sequencing device. As each protein molecule is threaded through a nanopore, the device records the resulting disruptions in the electrical current in real time.
Data Analysis: The raw electrical signal, or “squiggle” data, is processed by basecalling algorithms. These computational tools translate the complex current fluctuations into a corresponding sequence of amino acids by comparing the experimental signal patterns to known models.

## Differentiating from Traditional Methods

Nanopore protein sequencing stands in contrast to mass spectrometry (MS), the long-standing method in proteomics. Mass spectrometry works indirectly by first breaking down proteins into smaller fragments called peptides. These peptides are then identified based on their mass-to-charge ratio, and the full protein sequence is computationally reassembled from these pieces.

Nanopore sequencing, on the other hand, is a direct method that reads a single, full-length protein molecule as it passes through the sensor. This “long-read” capability avoids the computational challenge of reconstructing a sequence from fragmented data. This provides a clearer picture of the protein as it exists in the cell.

Another difference lies in analyzing post-translational modifications (PTMs), which are chemical changes to amino acids that alter protein function. While MS can detect PTMs, the fragmentation process can make it difficult to determine their exact location on the full-length protein. Nanopore technology can read these modifications in their native context on an intact protein. The portability of nanopore devices also contrasts with the large, stationary equipment required for mass spectrometry.

## Potential Applications in Science and Medicine

The ability to directly sequence single protein molecules opens up numerous possibilities across scientific research and medicine. One application is in biomarker discovery. Researchers can analyze proteins in patient samples like blood or tissue to identify specific protein variants or modifications associated with diseases such as cancer or neurodegenerative disorders.

In drug development, nanopore sequencing can be used to analyze how a therapeutic agent interacts with its target protein at a single-molecule level. It also aids quality control in the manufacturing of protein-based drugs, such as monoclonal antibodies, by allowing for precise verification of their sequence and purity.

Beyond clinical applications, the technology could transform fundamental biology. It enables a deeper exploration of “epiproteomics”—the vast landscape of post-translational modifications that regulate cellular function. Because direct sequencing can map these modifications, it provides the tools needed to understand this complex layer of biological regulation. The technology may even be used to create protein-based data storage systems.