Whole Genome Sequencing (WGS) is a powerful laboratory process that reads and analyzes the entire genetic instruction set of an organism. This process captures the sequence of all the DNA present in a cell, providing a complete view of the genetic blueprint. Unlike more limited methods that focus only on specific parts of the genome, WGS captures both the nuclear DNA, which in humans is organized into 23 pairs of chromosomes, and the smaller mitochondrial DNA. The goal of this comprehensive approach is to catalog, as completely as the technology allows, the differences between an individual’s genome and a standard reference genome. By exploring this vast amount of data, researchers and clinicians can identify the full range of genetic anomalies that may influence health and disease.
Pinpointing Single Nucleotide Variants and Small Indels
The smallest and most numerous types of variation that Whole Genome Sequencing can detect are Single Nucleotide Variants (SNVs). An SNV occurs when a single base pair in the DNA sequence, such as an Adenine (A), is swapped for a different base, like a Guanine (G). These tiny point mutations are the most common form of genetic difference found between any two individuals.
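WGS is also highly effective at identifying small insertions and deletions, often referred to as Indels. An Indel is the addition or removal of a small segment of DNA, typically ranging from one base pair up to about 50 base pairs in length. When an Indel within a gene’s coding region has a length that is not a multiple of three, it shifts the reading frame for everything downstream of the change. The result is usually a truncated or non-functional protein, which can cause a Mendelian disorder. An Indel whose length is a multiple of three keeps the reading frame intact and instead adds or removes whole amino acids.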
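The distinction between SNVs, Indels, and frameshifts can be sketched in a few lines of Python. This is a minimal illustration rather than how any particular variant caller works: the function name `classify_small_variant` is hypothetical, the alleles are written as plain strings in the style of a reference/alternate pair, and the frameshift label assumes the variant lies inside a coding region.

```python
def classify_small_variant(ref: str, alt: str) -> str:
    """Classify a variant from its reference and alternate alleles.

    Assumption: the variant lies in a coding region, so the
    frameshift / in-frame label is meaningful.
    """
    if len(ref) == 1 and len(alt) == 1:
        # One base swapped for another is a single nucleotide variant.
        return "SNV" if ref != alt else "no change"

    length_change = len(alt) - len(ref)
    if length_change == 0:
        return "multi-base substitution"

    kind = "insertion" if length_change > 0 else "deletion"
    # Codons are three bases long, so only a length change that is
    # not a multiple of three shifts the reading frame.
    frame = "frameshift" if abs(length_change) % 3 != 0 else "in-frame"
    return f"{frame} {kind}"


if __name__ == "__main__":
    for ref, alt in [("A", "G"), ("A", "ATG"), ("ACGT", "A"), ("AC", "A")]:
        print(ref, alt, classify_small_variant(ref, alt))
```

The check on `abs(length_change) % 3` is the whole point of the sketch: a three-base deletion removes one amino acid, while a one-base or two-base change alters every codon that follows.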
The high-resolution reading provided by WGS ensures that these minute changes do not go unnoticed. Because these small variants are the underlying cause of many inherited conditions, their accurate identification is foundational to genetic diagnosis. Without the ability to precisely map these substitutions and small length changes, many single-gene disorders would remain genetically unexplained.
Mapping Large Structural and Copy Number Changes
Moving beyond single-base changes, WGS is well suited to map large-scale alterations in the genome known as Structural Variants (SVs). Structural variants are DNA rearrangements that span roughly 50 base pairs or more, often involving thousands or even millions of base pairs. These include large deletions, duplications, inversions, and translocations, where segments of DNA are lost, copied, flipped, or moved to a new location.
The full sequencing of the entire genome provides the necessary context and resolution to accurately identify the breakpoints of these large rearrangements. Traditional methods often miss these complex changes, particularly balanced translocations and inversions where no net gain or loss of DNA occurs. WGS, however, can detect the precise junctions where the DNA sequence is incorrectly joined, regardless of whether the overall DNA quantity has changed.
A specific and clinically significant type of structural variant is the Copy Number Variation (CNV), which involves the duplication or deletion of significant stretches of the genome. WGS detects CNVs by analyzing the depth of coverage, that is, the number of times a specific region of the genome is sequenced. Regions with unusually high or low depth point to an abnormal number of copies, which can underlie conditions such as developmental disorders. Although SNVs are more numerous, SVs and CNVs collectively affect a greater total number of bases in the human genome, making their comprehensive detection by WGS important for understanding genetic diversity and disease etiology.
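The depth-of-coverage idea can be sketched as follows. This is a simplified illustration under stated assumptions, not a production CNV caller: the function name `estimate_copy_number` is hypothetical, the depths are made-up averages for equal-sized genomic bins, the median bin is assumed to represent the normal two-copy state, and real tools also correct for biases such as GC content and mappability.

```python
from statistics import median


def estimate_copy_number(bin_depths, ploidy=2):
    """Estimate an integer copy number for each genomic bin.

    Assumption: most bins are copy-neutral, so the median depth
    corresponds to `ploidy` copies (two for a human autosome).
    """
    baseline = median(bin_depths)
    if baseline == 0:
        raise ValueError("median depth is zero; cannot normalize")
    # Depth scales with copy number, so depth relative to the
    # baseline, times the ploidy, approximates the copy count.
    return [round(ploidy * depth / baseline) for depth in bin_depths]


if __name__ == "__main__":
    # Toy depths: a dip (possible deletion) and a rise (possible duplication).
    depths = [30, 31, 29, 15, 16, 30, 45, 44, 30]
    print(estimate_copy_number(depths))
```

In the toy data, bins sequenced at about half the typical depth come out as one copy and bins at about one and a half times the typical depth come out as three copies, which is the pattern expected for a heterozygous deletion and a duplication respectively.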
Uncovering Variations in Non-Coding Regulatory Regions
One of the most significant advantages of WGS over more targeted sequencing methods, such as Whole Exome Sequencing (WES), is its ability to analyze the non-coding regions of the genome. The non-coding DNA comprises approximately 98% of the human genome and does not contain instructions for making proteins. Instead, this vast expanse is rich with regulatory elements that act as genetic switches, controlling when and where genes are turned on or off.
WGS allows for the detection of variants within these regulatory zones, which include elements like promoters and enhancers. Promoters are located near a gene and act as a starting point for gene activity. Enhancers can lie far from the gene they regulate, but the DNA loops to bring them into physical contact with the promoter, boosting the gene’s expression. A mutation in an enhancer, for example, might not change the protein sequence of a gene but could drastically reduce or increase the amount of protein produced, leading to disease.
The detection of these non-coding variants is important because a large proportion of disease-associated genetic changes identified in genome-wide studies fall outside of protein-coding regions. By reading the entire genome, WGS provides the opportunity to link a disease to a mutation that influences gene expression, rather than just a mutation that changes a protein’s structure. This capability is expanding our understanding of the underlying causes of complex conditions that do not have clear-cut coding region mutations.
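Once variants have been called, deciding whether one falls in a coding or a regulatory region is essentially an interval lookup. The sketch below shows the idea with invented data: the function name `annotate_position`, the region labels, and all coordinates are hypothetical toy values on a single imaginary chromosome, and real annotation relies on curated gene and regulatory element databases.

```python
def annotate_position(position, regions):
    """Return the label of the first region containing `position`.

    `regions` is a list of (label, start, end) tuples with
    inclusive coordinates. All values here are toy examples.
    """
    for label, start, end in regions:
        if start <= position <= end:
            return label
    return "unannotated non-coding"


# Hypothetical layout: a distant enhancer, then a promoter
# sitting directly upstream of the gene's first exon.
TOY_REGIONS = [
    ("enhancer", 1000, 1500),
    ("promoter", 9000, 9999),
    ("exon", 10000, 10400),
]

if __name__ == "__main__":
    for pos in (1200, 9500, 10100, 5000):
        print(pos, annotate_position(pos, TOY_REGIONS))
```

An exome-only assay would report just the variant at position 10100 in this toy layout; the enhancer and promoter variants are visible only because the whole genome was read.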
Differentiating Inherited and Acquired Conditions
Whole Genome Sequencing provides the molecular detail necessary to distinguish between two fundamentally different categories of genetic change: inherited and acquired mutations. Inherited, or germline, mutations are passed down from a parent and are present in every cell of an individual’s body from conception. These variants are typically associated with hereditary diseases or predispositions to certain conditions.
In contrast, acquired, or somatic, mutations arise during an individual’s lifetime and are often confined to specific tissues or cells, such as those found in a tumor. Somatic mutations are not inherited and cannot be passed on to offspring. WGS is used extensively in cancer analysis to identify these acquired mutations, which drive tumor growth and progression.
To accurately pinpoint the acquired somatic changes, WGS is performed on both the tumor tissue and a sample of the patient’s healthy tissue, known as a matched normal. By comparing the two sequences, any variants present in the tumor but absent in the healthy sample are identified as acquired somatic mutations. This comparison is a standard procedure that allows clinicians to identify specific mutations that can be targeted with precision therapies.
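At its core, the tumor versus matched-normal comparison is a set subtraction, which can be sketched as below. This is a deliberately simplified illustration: the function name `somatic_candidates` and every variant shown are hypothetical, each variant is reduced to a (chromosome, position, reference allele, alternate allele) tuple, and production somatic callers use statistical models that account for sequencing error and tumor purity rather than a plain set difference.

```python
def somatic_candidates(tumor_variants, normal_variants):
    """Return variants seen in the tumor but not in the matched normal.

    Each variant is a (chrom, pos, ref, alt) tuple. Variants shared
    by both samples are treated as germline and dropped.
    """
    return sorted(set(tumor_variants) - set(normal_variants))


if __name__ == "__main__":
    # Toy variant calls, not real coordinates.
    tumor = {
        ("chr1", 12345, "C", "T"),   # also in the normal sample: germline
        ("chr2", 67890, "G", "A"),   # tumor only: somatic candidate
        ("chr3", 13579, "AT", "A"),  # tumor only: somatic candidate
    }
    normal = {("chr1", 12345, "C", "T")}
    for variant in somatic_candidates(tumor, normal):
        print(variant)
```

The shared variant on chr1 is filtered out because it appears in healthy tissue too, leaving only the two tumor-specific changes as candidates for follow-up.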