RCTD Deconvolution for Mapping Cell Types in Tissues

Understanding the arrangement of different cell types within a tissue is fundamental to biology. This cellular architecture governs how tissues function, develop, and respond to disease. The organization of cells dictates the layered complexity of the brain and the intricate battlefield of a tumor and its environment. Technologies that map this organization provide powerful insights into these biological processes.

Spatial transcriptomics measures gene activity across a slice of tissue, creating a map showing which genes are active at different locations. A limitation of this method’s resolution is that a single measurement, or “spot,” often captures genetic material from a mixture of several cells. This mixing obscures the contributions of the individual cells.

This creates a puzzle where the signal from one location is a composite of unknown cellular contributors. To solve this, scientists computationally “unmix” these blended signals. Specialized methods have been developed to dissect these mixed measurements, allowing for a reconstruction of the tissue’s cellular landscape. These tools are important for translating raw gene expression maps into meaningful biological knowledge.

The Foundational Data Inputs

To reconstruct a tissue’s cellular map, two types of data are required. The first is the spatial transcriptomics (ST) data, which acts as a geographical map of the tissue containing a grid of coordinates. At each coordinate, or “spot,” the technology measures the expression levels of thousands of genes. As noted, these measurements are often composites of multiple cells. The ST data provides the “where” by showing which gene signals are present at each location.

The mixed-signal problem lives in the ST dataset. For example, a single 55-micrometer spot can encompass several smaller cells, like immune cells. The resulting data from that spot is a single list of gene expression values representing the pooled signal of all captured cells, masking the contributions of each cell type.

The second dataset is from single-cell RNA sequencing (scRNA-seq), which serves as a “reference atlas” or “cell type dictionary.” Unlike ST, scRNA-seq analyzes cells one by one. By isolating individual cells from a similar tissue, it generates a high-resolution gene expression profile, or “signature,” for each cell type. This process yields a catalog detailing the genetic fingerprint of a neuron, skin cell, or immune cell.

This reference atlas provides the “what,” containing the pure, unmixed expression profiles for every cell type. For instance, the scRNA-seq data would show that one cell type has high expression of Gene A and low of Gene B, while another has the opposite pattern. This dictionary of signatures equips scientists with the information to tease apart the mixed signals in the spatial data.
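
As a rough illustration of how such a dictionary could be assembled, the sketch below averages depth-normalized counts over the cells of each annotated type. The gene names, cell labels, and counts are invented toy values, and `build_signatures` is a hypothetical helper rather than part of any RCTD software.

```python
from collections import defaultdict

def build_signatures(cells):
    """Average depth-normalized expression per cell type.

    `cells` is a list of (cell_type, {gene: raw_count}) pairs.
    Returns {cell_type: {gene: mean fraction of a cell's counts}}.
    """
    sums = defaultdict(lambda: defaultdict(float))
    n_cells = defaultdict(int)
    for cell_type, counts in cells:
        total = sum(counts.values())
        if total == 0:
            continue  # skip empty cells rather than divide by zero
        n_cells[cell_type] += 1
        for gene, count in counts.items():
            # Normalize by the cell's total so deeply sequenced cells
            # do not dominate the average.
            sums[cell_type][gene] += count / total
    return {
        cell_type: {gene: s / n_cells[cell_type] for gene, s in genes.items()}
        for cell_type, genes in sums.items()
    }

# Toy reference: invented counts for two genes in four annotated cells.
toy_cells = [
    ("neuron", {"GeneA": 90, "GeneB": 10}),
    ("neuron", {"GeneA": 80, "GeneB": 20}),
    ("glia", {"GeneA": 10, "GeneB": 90}),
    ("glia", {"GeneA": 30, "GeneB": 70}),
]
signatures = build_signatures(toy_cells)
for cell_type, profile in signatures.items():
    print(cell_type, {gene: round(value, 2) for gene, value in profile.items()})
```

In this toy data the neuron signature is dominated by Gene A and the glia signature by Gene B, which is the kind of opposite pattern described above.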

The RCTD Deconvolution Process

With the spatial “map” and single-cell “dictionary” available, the deconvolution can begin using Robust Cell Type Decomposition (RCTD). The goal of RCTD is to use the pure cell-type signatures from the scRNA-seq data to estimate the proportion of each cell type contributing to every mixed-signal spot on the spatial map.

The process is analogous to figuring out a smoothie’s recipe after it has been blended. If you know the taste profile of a pure strawberry, banana, and blueberry (the scRNA-seq reference), you can taste the final smoothie (the ST spot) and estimate its composition. RCTD performs a similar feat but uses gene expression signatures. It calculates the most likely combination of cell types that would produce the observed genetic mixture at each spot.
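
The analogy corresponds to a simple forward model: the expected profile of a spot is a proportion-weighted sum of the pure signatures. The sketch below shows only that mixing step, with invented signature values and a hypothetical `mix_signatures` helper. Deconvolution runs this relationship in reverse.

```python
def mix_signatures(signatures, proportions):
    """Expected spot profile as a proportion-weighted sum of signatures."""
    if abs(sum(proportions.values()) - 1.0) > 1e-9:
        raise ValueError("proportions must sum to 1")
    genes = sorted({gene for profile in signatures.values() for gene in profile})
    return {
        gene: sum(
            weight * signatures[cell_type].get(gene, 0.0)
            for cell_type, weight in proportions.items()
        )
        for gene in genes
    }

# Invented signatures: the fraction of transcripts contributed by each gene.
signatures = {
    "T_cell": {"GeneA": 0.8, "GeneB": 0.2},
    "fibroblast": {"GeneA": 0.2, "GeneB": 0.8},
}

# A hypothetical spot made up of 75% T cells and 25% fibroblasts.
expected = mix_signatures(signatures, {"T_cell": 0.75, "fibroblast": 0.25})
print({gene: round(value, 2) for gene, value in expected.items()})
```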

A key feature of this method is its robustness. The scRNA-seq and spatial transcriptomics data are generated using different techniques, which can introduce technical variations, or “batch effects.” These variations can distort the gene expression signals, making it difficult to compare the two datasets. RCTD is designed to account for these discrepancies, which makes comparisons between the datasets more reliable and improves cell type predictions.

The algorithm uses a statistical model that evaluates different combinations and proportions of the reference cell types for each spatial spot. It identifies the combination that best explains the spot’s measured gene expression profile, while accounting for potential noise and technical differences. The model can be configured to match the resolution of the data, restricting each spot to one or two cell types on high-resolution platforms or allowing more on coarser ones.
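
The idea of scoring candidate mixtures can be illustrated with a deliberately simplified stand-in: a grid search over the mixing proportion of two cell types, with each candidate scored by a Poisson log-likelihood. This is not RCTD's actual fitting procedure, and it leaves out the noise and platform terms mentioned above. The signatures, counts, and function names are all hypothetical.

```python
import math

def poisson_loglik(counts, expected):
    """Log-likelihood of observed counts under independent Poisson means."""
    return sum(
        k * math.log(mu) - mu - math.lgamma(k + 1)
        for k, mu in zip(counts, expected)
    )

def best_proportion(spot_counts, sig_a, sig_b, steps=100):
    """Grid-search the fraction of cell type A that best explains one spot.

    Assumes both signatures are strictly positive and each sums to 1.
    """
    total = sum(spot_counts)
    best_p, best_ll = 0.0, -math.inf
    for i in range(steps + 1):
        p = i / steps
        # Expected counts: total transcripts times the mixed signature.
        expected = [total * (p * a + (1 - p) * b) for a, b in zip(sig_a, sig_b)]
        ll = poisson_loglik(spot_counts, expected)
        if ll > best_ll:
            best_p, best_ll = p, ll
    return best_p

# Invented three-gene signatures for two cell types.
sig_t_cell = [0.7, 0.2, 0.1]
sig_fibroblast = [0.1, 0.3, 0.6]

# Invented spot counts, constructed as a 60/40 mixture of the two types.
spot_counts = [46, 24, 30]
estimate = best_proportion(spot_counts, sig_t_cell, sig_fibroblast)
print(f"Estimated T-cell fraction: {estimate:.2f}")
```

A grid search is only practical for two or three cell types. With a realistic reference of many types, a numerical optimizer would be needed instead.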

Interpreting and Visualizing Results

The output of the RCTD process is a data table that quantifies the cellular makeup of the tissue. For every spot on the spatial map, the table provides an estimated percentage of each cell type it contains. For example, a row might read: Spot_735: 45% Cancer Cell, 35% T-cell, 15% Macrophage, 5% Fibroblast. This provides a quantitative breakdown of the tissue’s cellular composition.
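
A minimal sketch of how per-spot weights could be normalized and formatted into a row like the example above. The spot ID and raw weights are invented, and `to_percent_row` is a hypothetical helper.

```python
def to_percent_row(spot_id, weights):
    """Normalize non-negative weights and format them as percentages."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    # List the most abundant cell types first.
    ranked = sorted(weights.items(), key=lambda item: item[1], reverse=True)
    parts = [f"{100 * weight / total:.0f}% {name}" for name, weight in ranked]
    return f"{spot_id}: " + ", ".join(parts)

# Invented unnormalized weights for one spot.
raw_weights = {"Macrophage": 3, "Cancer Cell": 9, "Fibroblast": 1, "T-cell": 7}
row = to_percent_row("Spot_735", raw_weights)
print(row)
```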

This table is the foundation for visualization. Scientists use this proportional data to generate new, color-coded images of the original tissue slice. Instead of a map showing gene expression levels, they can create a map that visualizes the cellular architecture. Each spot on the tissue image can be colored by its dominant cell type, revealing the spatial organization of different cell populations.
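
Coloring by the leading cell type amounts to taking the largest proportion in each spot and looking up a color for it. A small sketch, with invented proportions and an arbitrary palette:

```python
def dominant_cell_type(proportions):
    """Return the cell type with the largest estimated proportion."""
    return max(proportions, key=proportions.get)

# Invented proportions for three spots.
spots = {
    "Spot_1": {"Cancer Cell": 0.45, "T-cell": 0.35, "Macrophage": 0.20},
    "Spot_2": {"Cancer Cell": 0.10, "T-cell": 0.60, "Macrophage": 0.30},
    "Spot_3": {"Cancer Cell": 0.05, "T-cell": 0.25, "Macrophage": 0.70},
}

# Arbitrary color choice per cell type.
palette = {"Cancer Cell": "red", "T-cell": "blue", "Macrophage": "green"}

spot_colors = {
    spot_id: palette[dominant_cell_type(proportions)]
    for spot_id, proportions in spots.items()
}
print(spot_colors)
```

The resulting colors can then be drawn at each spot's coordinates with any plotting tool.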

For a more nuanced view, spots can be colored by blending hues based on the mixture of cell types they contain. A spot with both neurons and supportive glial cells could be colored with a blend of yellow and blue, for instance. This transforms the numerical data into an intuitive visual map. It allows researchers to see how different cell types are organized, where they form clusters, and how they define anatomical structures.
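
One simple way to blend hues is a proportion-weighted average of RGB values. The sketch below assumes that approach. The palette and proportions are invented, and `blend_color` is a hypothetical helper.

```python
def blend_color(proportions, palette):
    """Proportion-weighted average of one RGB color per cell type."""
    total = sum(proportions.values())
    if total <= 0:
        raise ValueError("proportions must have a positive sum")
    channels = [0.0, 0.0, 0.0]
    for cell_type, weight in proportions.items():
        for i, value in enumerate(palette[cell_type]):
            channels[i] += value * weight / total
    return tuple(round(channel) for channel in channels)

# Yellow for neurons, blue for glial cells, as 0-255 RGB triples.
palette = {"neuron": (255, 255, 0), "glia": (0, 0, 255)}

# A hypothetical spot that is three-quarters neurons.
color = blend_color({"neuron": 0.75, "glia": 0.25}, palette)
print(color)
```

Averaging in RGB is a design shortcut: an even mix of yellow and blue comes out gray rather than green, so other color spaces or per-spot pie charts may read better when mixtures are balanced.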

Applications in Biological Research

Mapping cell types within tissues has applications in many areas of biology and medicine. In oncology, RCTD is used to study the tumor microenvironment. Researchers can map the spatial arrangement of immune cells, blood vessels, and connective tissue cells in and around a tumor. This can reveal how cancer cells interact with and evade the body’s immune defenses, providing insights for more effective immunotherapies.

In neuroscience, RCTD is used to dissect the organized structures of the brain. The cerebral cortex, for instance, is composed of distinct layers with a specific mix of neuronal and glial cell types. By applying deconvolution methods to spatial transcriptomics data of the brain, researchers can delineate these layers. This allows them to study how cellular organization differs between healthy brains and those affected by diseases like Alzheimer’s.

The method also applies to developmental biology, tracking how cell types organize to form organs and tissues as an embryo develops. By analyzing tissue at different developmental stages, scientists can create an atlas of cell migration and differentiation. Understanding this process is important for learning how birth defects occur and how tissues might be regenerated for therapeutic purposes. This mapping of cellular geography provides a deeper understanding of complex biological systems.