CITE-seq, or Cellular Indexing of Transcriptomes and Epitopes by Sequencing, is a scientific method that allows researchers to analyze two different types of information from single cells simultaneously. It combines the comprehensive view of gene activity with the specific details of protein presence on a cell’s surface. This dual perspective provides a more complete understanding of cellular identity and function than traditional methods alone.
What CITE-seq Measures
CITE-seq provides two distinct layers of information from each individual cell. First, it captures the cell’s transcriptome, which is the set of messenger RNA (mRNA) molecules present at a given time. Measuring the transcriptome indicates which genes are actively being transcribed into RNA, offering insights into a cell’s potential functions. This gene activity can reveal a cell’s overall state and its response to various biological signals.
Second, CITE-seq quantifies the epitopes, which are specific parts of proteins, primarily those found on the cell’s surface. These surface proteins act like identity badges, indicating a cell’s current type, its maturity, or how it is interacting with its environment. Unlike RNA, proteins are the actual functional molecules that carry out most cellular tasks, and their presence on the surface is directly involved in cell-to-cell communication and recognition.
The CITE-seq Experimental Process
The CITE-seq workflow involves several precise steps to collect both gene and protein information from individual cells. The process begins by tagging specific proteins on the cell surface. This is achieved using antibodies, which are proteins that bind to specific target proteins. Each type of antibody is attached to a short DNA sequence that serves as a unique barcode, so that the protein it recognizes can be identified later by sequencing.
The barcoded antibodies are then added to a sample of cells, where they bind to their corresponding surface proteins. Unbound antibodies are washed away so that only antibodies attached to cells contribute to the measurement.
Next, individual cells are isolated, often using microfluidic devices. These devices encapsulate each cell, along with reagents and a microscopic bead containing a unique DNA barcode, into its own tiny droplet. This partitioning ensures that all molecules originating from a single cell are kept together and assigned to that cell.
Inside these droplets, the cells are gently broken open, releasing their contents. Both the cell’s messenger RNA (mRNA) and the DNA barcodes from the bound antibodies are captured and simultaneously tagged with the cell-specific barcode from the bead. This links the RNA and protein data back to their original cell.
Finally, all the barcoded mRNA and antibody DNA sequences are collected and read by a high-throughput sequencing machine. This machine generates massive amounts of data, which are then processed to quantify both the gene expression levels and the surface protein abundance for each individual cell in the sample. The resulting data sets are then ready for computational analysis.
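The quantification step can be illustrated with a toy example. The sketch below assumes each sequenced read has already been parsed into a cell barcode, a molecule tag (a unique molecular identifier, or UMI, which is an assumption not described above), a feature name, and a label saying whether the read came from mRNA ("RNA") or from an antibody barcode ("ADT"). All barcodes, feature names, and the function `count_molecules` are hypothetical, and real pipelines also correct sequencing errors and handle far larger inputs.

```python
from collections import defaultdict

# Hypothetical toy reads: (cell_barcode, umi, feature, modality).
reads = [
    ("AAAC", "u1", "CD3E", "RNA"),
    ("AAAC", "u1", "CD3E", "RNA"),   # duplicate read of the same molecule
    ("AAAC", "u2", "CD3E", "RNA"),
    ("AAAC", "u3", "CD3", "ADT"),
    ("AAAC", "u4", "CD3", "ADT"),
    ("TTTG", "u1", "MS4A1", "RNA"),
    ("TTTG", "u5", "CD19", "ADT"),
]


def count_molecules(reads):
    """Count unique molecules per modality, cell, and feature."""
    seen = set()
    counts = {
        "RNA": defaultdict(lambda: defaultdict(int)),
        "ADT": defaultdict(lambda: defaultdict(int)),
    }
    for cell, umi, feature, modality in reads:
        key = (modality, cell, feature, umi)
        if key in seen:
            continue  # skip reads that repeat an already counted molecule
        seen.add(key)
        counts[modality][cell][feature] += 1
    return counts


counts = count_molecules(reads)
for modality in ("RNA", "ADT"):
    for cell, features in sorted(counts[modality].items()):
        print(modality, cell, dict(features))
```

Because both read types carry the same cell barcode, the result is two count tables that share one set of cells, which is what makes the later joint analysis possible.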
Integrating and Interpreting CITE-seq Data
Once the sequencing is complete, scientists have two separate but related datasets for each individual cell: one detailing its gene expression from the RNA and another quantifying its protein levels from the antibody barcodes. Because both datasets carry the same cell barcodes, every measurement can be traced back to the cell it came from. The strength of CITE-seq lies in the computational methods used to combine and interpret these distinct data types. Specialized computer programs perform a process known as multimodal integration, which analyzes the gene and protein information for each cell together rather than separately.
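As a minimal sketch of how the two datasets line up, the example below pairs RNA and protein counts by their shared cell barcode and keeps only cells present in both. The barcodes, counts, and the helper `pair_modalities` are hypothetical; dedicated analysis tools do this on large matrices rather than small dictionaries.

```python
# Hypothetical per-cell counts, keyed by cell barcode.
rna = {
    "AAAC": {"CD3E": 2, "CD19": 0},
    "TTTG": {"CD3E": 0, "CD19": 1},
    "GGGA": {"CD3E": 1, "CD19": 0},  # no protein data recovered for this cell
}
adt = {
    "AAAC": {"CD3": 140, "CD19": 2},
    "TTTG": {"CD3": 4, "CD19": 95},
}


def pair_modalities(rna, adt):
    """Keep cells measured in both modalities and store their data side by side."""
    shared = sorted(rna.keys() & adt.keys())
    return {cell: {"rna": rna[cell], "adt": adt[cell]} for cell in shared}


paired = pair_modalities(rna, adt)
for cell, data in paired.items():
    print(cell, data["rna"], data["adt"])
```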
This integration is particularly valuable because gene expression (RNA) does not always correlate well with protein levels, as proteins can be regulated at various stages after RNA production. For instance, cells might show similar gene activity but express different surface proteins, indicating subtle yet significant functional distinctions. The combined data help researchers define cell types more accurately, identify rare cell populations, and uncover functional states that would be invisible if only one data type were analyzed. Software tools such as Seurat and CiteFuse are used to normalize, integrate, and visualize these multimodal data.
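To give a sense of what normalization means for the protein data, the sketch below applies a centered log-ratio (CLR) transform, a commonly used choice for antibody counts, to the counts of a single cell. This is a simplified illustration and not the exact implementation used by Seurat or CiteFuse; the pseudocount of 1 and the example counts are assumptions.

```python
import math


def clr(counts, pseudocount=1.0):
    """Centered log-ratio: each log count minus the mean log count of the cell."""
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


# Hypothetical antibody counts for one cell, in the order CD3, CD4, CD8, CD19.
adt_counts = [120, 85, 3, 1]
normalized = clr(adt_counts)

for name, value in zip(["CD3", "CD4", "CD8", "CD19"], normalized):
    print(f"{name}: {value:+.2f}")
```

Values above zero mark proteins that are more abundant than the average for that cell, and values below zero mark proteins that are less abundant, which makes cells with different total antibody counts easier to compare.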
Applications in Scientific Research
CITE-seq has had a substantial impact across various fields of scientific research, enabling discoveries that were previously difficult to achieve. In immunology, for example, it helps create detailed maps of the immune system by identifying diverse immune cell types and understanding their specific roles in health and disease. This is useful in studying conditions like autoimmune disorders or how immune cells respond to infections, deepening the understanding of immune cell subsets and their functions. Researchers have used CITE-seq to identify novel T cell subsets involved in controlling viral replication and preventing tissue damage.
In cancer research, CITE-seq is applied to map the various cell types within a tumor, including both cancer cells and the surrounding immune cells. This helps researchers understand the complex heterogeneity of tumors and why some might resist treatment, paving the way for more targeted therapies. For instance, it has been used to categorize breast tumors based on their cellular composition and treatment response.
CITE-seq also contributes to developmental biology by tracing how stem cells differentiate into various specialized cell types during an organism’s development. By analyzing changes in both gene expression and surface proteins as cells mature, scientists can identify the specific genes and proteins that guide cell fate decisions. This understanding has implications for regenerative medicine and the development of cell-based therapies.