Google DeepMind’s AlphaFold program is a significant advance in predicting the three-dimensional structures of proteins from their amino acid sequences. This capability is now being adapted to interpret missense variants, which are genetic mutations that cause a single amino acid substitution in a protein. While modern technology can easily identify these variants, determining whether they are harmless or disease-causing remains a complex hurdle for medical genetics.
The Challenge of Interpreting Missense Variants
A missense mutation swaps one of a protein’s building blocks, an amino acid, for a different one. The functional consequence of that swap depends on many factors, including the chemical properties of the two amino acids and the location of the change within the protein’s structure. This variability makes the effect of any single variant difficult to interpret.
In clinical genetics, variants are broadly sorted into three groups: pathogenic (disease-causing), benign (harmless), and variants of uncertain significance (VUS). Formal classification guidelines also include intermediate "likely pathogenic" and "likely benign" tiers. The VUS group is the most problematic, because only about 2% of the four million identified human missense variants have been definitively classified as pathogenic or benign. This leaves a vast number of variants without a clear clinical interpretation, creating a bottleneck in diagnosing genetic disorders.
The volume of VUS presents a formidable challenge. A VUS result offers no clear answers for families undergoing genetic testing, leaving them in diagnostic limbo. For researchers, the number of VUS associated with a condition makes it difficult to pinpoint the genetic cause, hindering the development of targeted therapies. Reclassifying these variants is a slow process that requires extensive experimental validation for each one.
How AlphaFold Predicts Missense Effects
AlphaFold’s primary function is to predict a protein’s static 3D shape from its amino acid sequence. This capability was extended to assess missense mutations through a specialized tool called AlphaMissense. Based on the AlphaFold2 model, this separate system evaluates how a specific amino acid substitution might disrupt a protein’s stability or function.
The AlphaMissense model generates a score predicting a variant’s likelihood of being pathogenic. This score is derived from an analysis that combines two main data types. First, it examines the evolutionary conservation of the amino acid by comparing sequences across different species. A residue that remains unchanged over millions of years is presumed important, meaning a change there is more likely to be disruptive.
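The idea of evolutionary conservation can be illustrated with a toy calculation. The sketch below is not how AlphaMissense measures conservation; it is a simple entropy-based score, a common textbook approach, applied to a made-up alignment. The sequences and the function name `column_conservation` are hypothetical.

```python
import math
from collections import Counter

def column_conservation(column: str) -> float:
    """Return 1 - normalized Shannon entropy for one alignment column.

    1.0 means every species has the same residue; lower values mean
    the position varies more across species.
    """
    residues = [aa for aa in column if aa != "-"]  # ignore alignment gaps
    if not residues:
        return 0.0
    counts = Counter(residues)
    total = len(residues)
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    max_entropy = math.log2(20)  # 20 standard amino acids
    return 1.0 - entropy / max_entropy

# Toy alignment, one row per species (hypothetical sequences).
alignment = [
    "MKTAW",
    "MKSAW",
    "MRTAW",
    "MKTGW",
]

# Transpose rows into per-position columns, then score each position.
columns = ["".join(row[i] for row in alignment) for i in range(len(alignment[0]))]
scores = [column_conservation(col) for col in columns]

for position, (col, score) in enumerate(zip(columns, scores), start=1):
    print(f"position {position}: {col} conservation={score:.2f}")
```

Positions where all four toy sequences agree score 1.0, while positions with a substitution in one species score lower. In this simplified picture, a missense change at a fully conserved position would be the more suspicious one.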
The model also weighs the structural context of the amino acid. It analyzes the residue’s predicted location within the protein’s folded structure, such as on the surface, in the core, or at an active site. By integrating structural information with evolutionary data and patterns from human and primate variant frequencies, AlphaMissense calculates a final pathogenicity score. A high score suggests the substitution is likely damaging, while a low score indicates it is probably tolerable and benign.
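Turning a continuous score into a category comes down to applying cutoffs. The minimal sketch below assumes scores run from 0 to 1 and uses the cutoffs 0.34 and 0.564, which are believed to match the thresholds published with the AlphaMissense predictions but should be treated as assumptions and checked against the release you use. The function name `classify` is hypothetical.

```python
# Assumed cutoffs; verify against the AlphaMissense release you are using.
LIKELY_BENIGN_MAX = 0.34
LIKELY_PATHOGENIC_MIN = 0.564

def classify(score: float) -> str:
    """Map a pathogenicity score in [0, 1] to a coarse category."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be between 0 and 1, got {score}")
    if score < LIKELY_BENIGN_MAX:
        return "likely_benign"
    if score > LIKELY_PATHOGENIC_MIN:
        return "likely_pathogenic"
    return "ambiguous"  # scores between the two cutoffs stay unresolved

for example_score in (0.10, 0.45, 0.90):
    print(example_score, classify(example_score))
```

Keeping an explicit "ambiguous" band, rather than forcing every score into one of two bins, reflects the fact that mid-range scores carry little evidence either way.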
This process allows for a systematic evaluation of missense variants across the human proteome. When applied to all possible single amino acid substitutions, AlphaMissense classified 32% as likely pathogenic and 57% as likely benign, leaving roughly 11% in an ambiguous range. This catalog provides a resource for prioritizing variants for further study and sifting through millions of unclassified mutations.
Applications in Disease Research and Diagnostics
A direct application of AlphaMissense is the clinical reclassification of VUS. A VUS result from a genetic test is often inconclusive for patients with rare diseases. AlphaMissense offers a predictive score that gives clinicians one line of evidence that a VUS is likely pathogenic. Combined with other clinical and experimental evidence, this can contribute to a definitive diagnosis and help end a long diagnostic odyssey.
In scientific research, the tool helps scientists prioritize their efforts. Researchers studying a disease may identify thousands of missense variants in associated genes, but experimentally testing each one is impractical. AlphaMissense allows them to rank variants by predicted pathogenicity, focusing lab work on the mutations most likely to be causal. This accelerates understanding of disease mechanisms and identification of therapeutic targets.
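Ranking by predicted pathogenicity is, at its simplest, a sort. The sketch below assumes each variant has already been paired with a score; the gene names, amino acid changes, and scores are invented for illustration, and `Variant` and `top_candidates` are hypothetical names.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Variant:
    gene: str
    change: str   # protein-level change, e.g. "R175H"
    score: float  # predicted pathogenicity, assumed to lie in [0, 1]

def top_candidates(variants: list[Variant], k: int) -> list[Variant]:
    """Return the k variants with the highest predicted pathogenicity."""
    return sorted(variants, key=lambda v: v.score, reverse=True)[:k]

# Invented example data, not real predictions.
observed = [
    Variant("GENE_A", "A12V", 0.08),
    Variant("GENE_A", "R175H", 0.97),
    Variant("GENE_B", "G44D", 0.61),
    Variant("GENE_B", "L90P", 0.88),
    Variant("GENE_C", "S7T", 0.15),
]

shortlist = top_candidates(observed, k=2)
for variant in shortlist:
    print(variant.gene, variant.change, variant.score)
```

A shortlist like this only decides which variants reach the bench first. The ranking itself is a prediction and does not replace the experimental follow-up.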
The predictive power of AlphaMissense can also aid in discovering new links between genes and diseases. By analyzing variants across the proteome, researchers can identify genes where pathogenic mutations appear concentrated, suggesting a role for that gene in a disorder. This can open new avenues of investigation and improve the diagnostic yield for complex genetic illnesses.
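One crude way to look for genes where predicted-pathogenic variants cluster is to compute, per gene, the fraction of observed variants scoring above a cutoff. The sketch below uses invented data and an assumed cutoff of 0.564. A real analysis would also need to account for gene length, background mutation rates, and comparison against control cohorts, none of which is modeled here.

```python
from collections import defaultdict

PATHOGENIC_CUTOFF = 0.564  # assumed cutoff; verify against your data source

def pathogenic_fraction_by_gene(records, cutoff=PATHOGENIC_CUTOFF):
    """Return {gene: fraction of its variants scoring above the cutoff}."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for gene, score in records:
        totals[gene] += 1
        if score > cutoff:
            hits[gene] += 1
    return {gene: hits[gene] / totals[gene] for gene in totals}

# Invented (gene, score) pairs, not real predictions.
records = [
    ("GENE_A", 0.91), ("GENE_A", 0.78), ("GENE_A", 0.12),
    ("GENE_B", 0.08), ("GENE_B", 0.61), ("GENE_B", 0.20), ("GENE_B", 0.15),
]

fractions = pathogenic_fraction_by_gene(records)
ranked = sorted(fractions.items(), key=lambda item: item[1], reverse=True)
for gene, fraction in ranked:
    print(f"{gene}: {fraction:.2f}")
```

Genes near the top of such a list are candidates for closer study, not established disease genes.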
Limitations and Future Perspective
AlphaFold and AlphaMissense are predictive tools, not definitive arbiters of biological function. Their scores are probabilities, not certainties, and require experimental validation before use in clinical decision-making. The models are powerful but have limitations based on their training data and the biological mechanisms they evaluate.
A primary limitation is the focus on how a mutation affects a protein’s static structure and stability. The models may not accurately predict consequences of mutations that alter protein dynamics, such as how a protein moves to perform its function. The system also cannot assess variants affecting protein quantity or those that disrupt interactions between proteins or with molecules like DNA and RNA.
Despite these constraints, the technology is rapidly evolving. Future versions will likely incorporate new data types for more comprehensive prediction models. Integrating AlphaMissense scores with clinical information, patient symptoms, and other experimental data will enhance their diagnostic utility. The improvement of these tools promises to make genetic information more actionable for researchers and clinicians.