What Is a Methylation Array and How Does It Work?

A methylation array is a tool used in biology and medicine to study epigenetics: modifications to DNA that do not change the genetic sequence but can alter how genes are expressed. The technology measures a specific epigenetic mark, DNA methylation, at hundreds of thousands of distinct points across the genome simultaneously. This high-throughput capability makes it practical to analyze how these modifications relate to health and disease across large numbers of samples.

The Science of DNA Methylation

DNA methylation is the addition of a methyl group to a DNA base, most commonly cytosine. In mammals, this modification occurs predominantly where a cytosine nucleotide is followed by a guanine nucleotide, a location known as a CpG site. These sites are found throughout the genome and often cluster in regions called CpG islands, which are frequently located near the start of genes.
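To make the idea of a CpG site concrete, the following minimal sketch counts CpG dinucleotides in a short sequence and computes the observed-to-expected CpG ratio, a statistic often used when scanning for CpG islands. The function name and the example sequence are invented for illustration; real island detection works on windows of at least a few hundred bases and applies additional thresholds.

```python
def cpg_stats(seq):
    """Return CpG count, GC fraction, and observed/expected CpG ratio."""
    seq = seq.upper()
    n = len(seq)
    c_count = seq.count("C")
    g_count = seq.count("G")
    # "CG" cannot overlap itself, so str.count finds every CpG dinucleotide.
    cpg_count = seq.count("CG")
    gc_fraction = (c_count + g_count) / n if n else 0.0
    # Expected number of CpGs if C and G were placed independently.
    expected = (c_count * g_count) / n if n else 0.0
    obs_exp = cpg_count / expected if expected else 0.0
    return {"cpg_count": cpg_count, "gc_fraction": gc_fraction, "obs_exp": obs_exp}


# Hypothetical 16-base example sequence (far too short to be a real island).
stats = cpg_stats("ACGTCGCGGATCCGTA")
print(stats)
```

A region that is rich in G and C and has a high observed-to-expected ratio is a candidate CpG island; the exact cutoffs vary between definitions.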

This addition acts like a dimmer switch for genes. When a CpG island in a gene’s promoter region becomes heavily methylated, it can “turn off” the gene’s activity. An absence of methylation in these areas is associated with active gene expression. This regulatory mechanism guides processes from embryonic development to the differentiation of specialized cells.

The placement and removal of these methyl tags are managed by enzymes. DNA methyltransferases (DNMTs) are responsible for establishing and maintaining methylation patterns, while TET enzymes can initiate their removal. This dynamic system allows cells to respond to their environment and maintain stable gene expression profiles.

How a Methylation Array Works

A methylation array combines a chemical treatment with high-density microarray technology. The process begins with extracting DNA from a sample such as blood or tissue. This DNA is treated with sodium bisulfite, which chemically converts unmethylated cytosine bases into uracil while leaving methylated cytosines untouched. This bisulfite conversion translates the epigenetic information into a detectable change in the DNA sequence.
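The effect of bisulfite conversion can be illustrated with a small simulation. This is only a sketch of the logic, not a model of the chemistry: the function names, the example sequence, and the choice of which position is methylated are all assumptions made for illustration. It also shows why the change becomes readable as sequence, since uracil is copied as thymine when the DNA is later amplified.

```python
def bisulfite_convert(seq, methylated_positions):
    """Convert unmethylated C to U; leave methylated C (and other bases) alone.

    methylated_positions: set of 0-based indices of methylated cytosines.
    """
    converted = []
    for i, base in enumerate(seq.upper()):
        if base == "C" and i not in methylated_positions:
            converted.append("U")  # unmethylated cytosine becomes uracil
        else:
            converted.append(base)  # methylated cytosine is protected
    return "".join(converted)


def amplify(converted_seq):
    """During amplification, uracil is copied as thymine."""
    return converted_seq.replace("U", "T")


# Hypothetical example: only the cytosine at index 1 is methylated.
original = "ACGTCGAC"
converted = bisulfite_convert(original, {1})
amplified = amplify(converted)
print(original, converted, amplified)
```

After conversion and amplification, a C that remains in the sequence marks a position that was methylated, while a T in place of an original C marks one that was not.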

After the chemical conversion, the altered DNA is amplified to create many copies, a step in which each uracil is copied as thymine. This product is fragmented and then washed over the surface of the methylation array, a small glass chip. The chip’s surface is covered with microscopic beads representing hundreds of thousands of probe types, with newer versions targeting over 900,000 CpG sites. Each bead holds many copies of a specific probe, a short DNA sequence designed to bind to a particular CpG site.

Once the sample DNA binds to the probes, each probe is extended by a single fluorescently labeled nucleotide, and the array is scanned with lasers that cause the dyes to light up. The system uses the resulting signals to distinguish methylated from unmethylated DNA at each site. In one common probe design, for example, a green signal indicates that a methylated cytosine was present, while a red signal indicates an unmethylated one. The scanner captures the intensity of these signals for each probe, generating a raw data file of intensities from which the methylation status across the genome is calculated.

Interpreting Methylation Array Data

After the array is scanned, researchers have a large data file that must be processed with specialized software to normalize signals and filter out low-quality probes and samples. The primary output for each CpG site is a “beta-value,” a score from 0 to 1 calculated from the methylated and unmethylated signal intensities. This value estimates the proportion of methylation at that location. A beta-value of 0 indicates no methylation, while 1 signifies complete methylation; a value of 0.75 means the site was methylated in roughly 75% of the DNA molecules in the sample.
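The beta-value calculation itself is simple arithmetic on the two signal intensities. The sketch below uses the commonly cited form, methylated signal divided by the sum of both signals plus a small constant offset (100 is the value usually recommended) that stabilizes the ratio when both signals are weak. The function name and the example intensities are invented for illustration, and real pipelines apply background correction and normalization first.

```python
def beta_value(meth, unmeth, offset=100.0):
    """Estimate the methylated fraction at one CpG site.

    meth, unmeth: methylated and unmethylated signal intensities.
    offset: small constant that keeps the ratio stable for weak signals.
    """
    # Negative intensities can appear after background subtraction; clip at 0.
    meth = max(meth, 0.0)
    unmeth = max(unmeth, 0.0)
    return meth / (meth + unmeth + offset)


# Hypothetical intensities for three probes.
mostly_methylated = beta_value(7500, 2400)
mostly_unmethylated = beta_value(300, 9600)
no_signal = beta_value(0, 0)
print(mostly_methylated, mostly_unmethylated, no_signal)
```

Because of the offset, a beta-value never quite reaches 1 even at a fully methylated site, and a probe with no signal at all yields 0 rather than a division error.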

With beta-values calculated for every site, scientists begin the biological interpretation. A primary goal is to compare methylation patterns between different groups, such as healthy versus tumor tissue. Using statistical analyses, they look for “differentially methylated regions” (DMRs). A DMR is a stretch of the genome with a consistent and significant difference in methylation levels between the compared groups.
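As a rough illustration of the idea behind DMR detection, the sketch below compares mean beta-values between two groups at each site and merges runs of neighboring sites that all differ strongly in the same direction. It is a deliberately simplified toy: the function names, thresholds, and data are assumptions made for this example, and it ignores statistical significance testing, multiple-testing correction, and the genomic distance between sites, all of which dedicated analysis packages handle.

```python
from statistics import mean


def _summarize(region):
    """Turn a list of (position, delta) pairs into (start, end, mean_delta)."""
    return (region[0][0], region[-1][0], mean(d for _, d in region))


def call_dmrs(positions, betas_a, betas_b, min_delta=0.2, min_sites=3):
    """Group consecutive CpG sites with a large, same-direction difference.

    positions: genomic coordinates of the CpG sites, sorted ascending.
    betas_a, betas_b: for each site, a list of beta-values per group.
    Returns a list of (start, end, mean_delta) tuples.
    """
    regions = []
    current = []  # (position, delta) pairs of the candidate region
    for pos, a, b in zip(positions, betas_a, betas_b):
        delta = mean(b) - mean(a)
        large = abs(delta) >= min_delta
        same_direction = not current or (delta > 0) == (current[-1][1] > 0)
        if large and same_direction:
            current.append((pos, delta))
            continue
        # The run is broken: keep it only if it spans enough sites.
        if len(current) >= min_sites:
            regions.append(_summarize(current))
        current = [(pos, delta)] if large else []
    if len(current) >= min_sites:
        regions.append(_summarize(current))
    return regions


# Hypothetical data: six CpG sites, two samples per group.
positions = [100, 150, 210, 260, 900, 950]
healthy = [[0.10, 0.12], [0.15, 0.13], [0.20, 0.18],
           [0.50, 0.52], [0.80, 0.82], [0.30, 0.28]]
tumor = [[0.60, 0.62], [0.55, 0.57], [0.65, 0.63],
         [0.51, 0.49], [0.82, 0.80], [0.31, 0.29]]
dmrs = call_dmrs(positions, healthy, tumor)
print(dmrs)
```

In this toy data set the first three sites are consistently more methylated in the tumor group, so they are reported as a single region, while the remaining sites show no meaningful difference.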

Identifying these DMRs helps link epigenetic changes to specific traits or diseases. For instance, finding that the promoter of a tumor suppressor gene is hypermethylated in cancer samples suggests that silencing this gene could be involved in the cancer’s development. This process turns the hundreds of thousands of measurements taken from each sample into biological insights.

Applications in Research and Medicine

The ability to profile DNA methylation has led to applications in biomedical research and clinical practice, particularly in oncology. Many types of cancer exhibit distinct methylation signatures that differ from healthy tissue. These patterns can be used as biomarkers for early disease detection via liquid biopsy, to predict a patient’s prognosis, or to determine their likely response to a therapy.

Another application is the development of “epigenetic clocks.” Researchers found that methylation levels at certain CpG sites change predictably with age. By analyzing the methylation status of these sites, an algorithm can estimate an individual’s biological age. This predicted age may differ from a person’s chronological age, and a higher biological age is linked to increased health risks and mortality.
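Many epigenetic clocks are, at their core, weighted sums of beta-values at a selected set of CpG sites. The sketch below shows that structure only. The site names, weights, and intercept are invented for illustration and do not come from any published clock; real clocks are fitted on large training cohorts, typically use dozens to hundreds of sites, and may apply an additional transformation to the result.

```python
def predict_age(betas, weights, intercept):
    """Predict age as intercept + sum(weight * beta) over the clock's sites.

    betas: mapping of site name to beta-value for one sample.
    weights: mapping of site name to model coefficient.
    """
    missing = [site for site in weights if site not in betas]
    if missing:
        raise KeyError(f"sample is missing clock sites: {missing}")
    return intercept + sum(w * betas[site] for site, w in weights.items())


# Hypothetical toy model: three made-up sites with made-up coefficients.
TOY_WEIGHTS = {"site_A": 40.0, "site_B": -25.0, "site_C": 15.0}
TOY_INTERCEPT = 30.0

sample_betas = {"site_A": 0.60, "site_B": 0.40, "site_C": 0.20}
predicted_age = predict_age(sample_betas, TOY_WEIGHTS, TOY_INTERCEPT)

# Difference between predicted and chronological age (assumed here to be 45).
age_gap = predicted_age - 45
print(predicted_age, age_gap)
```

A positive gap means the sample looks older than the person's chronological age according to the model, which is the quantity such studies relate to health outcomes.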

Beyond cancer and aging, methylation arrays help explain how environment and lifestyle impact our genome. Studies use this technology to investigate how factors like diet, pollution, or stress can leave long-lasting epigenetic marks. These marks can influence gene expression and affect health outcomes, providing a molecular link between our experiences and biology.

Comparison with Other Technologies

Methylation arrays are one of several technologies used to study the methylome. The primary alternative is Whole-Genome Bisulfite Sequencing (WGBS), often considered the gold standard. Like arrays, WGBS uses bisulfite treatment to distinguish between methylated and unmethylated cytosines. However, WGBS sequences the entire genome, providing a near-complete, base-by-base map of methylation.

The choice between an array and WGBS involves a trade-off between focused analysis and comprehensive discovery. An array measures methylation only at the specific CpG sites targeted by its probes, which are concentrated in well-known genes and promoters. This makes arrays more cost-effective and less computationally intensive, an advantage for large-scale studies with thousands of samples.

In contrast, WGBS offers a far more complete view of the methylome, allowing for the discovery of changes in novel regions not covered by an array. This makes it better for discovery-oriented research. The downside is that WGBS is more expensive and produces large data files requiring significant computational power to analyze. Arrays, for their part, remain well-suited for high-throughput, hypothesis-driven research and large-scale screening.