ModelAngelo is a software tool that automatically builds atomic models of biological molecules from cryo-electron microscopy (cryo-EM) data. It takes a cryo-EM density map as input and produces an atomic model of the proteins it contains. Automating this intricate step shortens the path from raw data to structure and so accelerates the study of life’s fundamental mechanisms.
Understanding Cryo-Electron Microscopy
Cryo-electron microscopy, or cryo-EM, is a technique for visualizing biological molecules, such as proteins, at high resolution. Samples are frozen so rapidly that the surrounding water forms vitreous (glass-like) ice instead of crystals. This preserves the molecules close to their natural state and avoids the damage that ice crystals would cause, which allows for detailed imaging.
An electron microscope fires electrons through the frozen sample, and their interaction with the molecules produces a two-dimensional projection image. Because the molecules are frozen in many different orientations, thousands of such images show the molecule from different angles, and computer algorithms combine them to reconstruct a three-dimensional “density map.” This map shows the overall shape and density of the molecule, but it does not directly reveal the exact positions of individual atoms.
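To make the link between two-dimensional images and a three-dimensional map concrete, here is a minimal sketch in Python. It treats each image as a sum of density along the beam direction and then reverses the process with plain, unfiltered back-projection. Everything in it is a simplifying assumption for illustration: the grid is 3x3x3, only three axis-aligned views are used, and the function names `project` and `back_project` are invented here. Real reconstruction software has to estimate the orientation of every image and uses far more views and more careful mathematics.

```python
def project(volume, axis):
    """Sum a 3D density grid, stored as volume[z][y][x], along one axis."""
    nz, ny, nx = len(volume), len(volume[0]), len(volume[0][0])
    if axis == "z":  # beam along z, image indexed [y][x]
        return [[sum(volume[z][y][x] for z in range(nz)) for x in range(nx)]
                for y in range(ny)]
    if axis == "y":  # beam along y, image indexed [z][x]
        return [[sum(volume[z][y][x] for y in range(ny)) for x in range(nx)]
                for z in range(nz)]
    if axis == "x":  # beam along x, image indexed [z][y]
        return [[sum(volume[z][y][x] for x in range(nx)) for y in range(ny)]
                for z in range(nz)]
    raise ValueError("axis must be 'x', 'y' or 'z'")


def back_project(proj_z, proj_y, proj_x):
    """Smear each projection back along its beam direction and add them up."""
    nz, ny, nx = len(proj_y), len(proj_z), len(proj_z[0])
    return [[[proj_z[y][x] + proj_y[z][x] + proj_x[z][y]
              for x in range(nx)] for y in range(ny)] for z in range(nz)]


# Toy "molecule": a 3x3x3 grid with one dense voxel at z=1, y=2, x=0.
volume = [[[0.0] * 3 for _ in range(3)] for _ in range(3)]
volume[1][2][0] = 1.0

# Three 2D images, one per viewing direction.
images = [project(volume, axis) for axis in ("z", "y", "x")]
reconstruction = back_project(*images)

# The brightest voxel of the reconstruction marks where the views agree.
peak = max(((z, y, x) for z in range(3) for y in range(3) for x in range(3)),
           key=lambda c: reconstruction[c[0]][c[1]][c[2]])
print(peak)  # (1, 2, 0)
```

No single image pins down the dense voxel, but the three views agree at exactly one location, which is why combining many views recovers three-dimensional shape.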
The Challenge of Building Atomic Models
Converting cryo-EM density maps into atomic models has historically been complex and time-consuming. The density map provides a fuzzy outline, and interpreting it to pinpoint every atom requires specialized expertise. This process was traditionally manual, with researchers meticulously fitting atomic structures into the electron density.
This manual approach was labor-intensive and susceptible to human error, especially with large protein complexes. Building a complete atomic model could take weeks or even months for a single structure. Model building therefore became a bottleneck in structural biology research, limiting the number of new protein structures that could be determined.
How ModelAngelo Automates Structure Building
ModelAngelo automates atomic model building with machine learning, combining a convolutional neural network (CNN) with a graph neural network (GNN). The CNN first locates individual amino acid residues within the cryo-EM density map, producing a rough initial sketch of the protein’s backbone. These residues form a graph in which each amino acid is a node and the connections represent the protein chain.
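The graph idea can be sketched in a few lines of Python. The example below is an illustration under stated assumptions, not ModelAngelo’s actual graph construction: it takes a handful of made-up residue positions and links two residues when they sit roughly 3.8 angstroms apart, the typical spacing between neighbouring alpha-carbon atoms along a protein chain. The function name `build_residue_graph` and the tolerance value are hypothetical.

```python
import math

CA_CA_DISTANCE = 3.8  # typical spacing of neighbouring C-alpha atoms, in angstroms


def build_residue_graph(positions, tolerance=0.5):
    """Link residues whose separation is close to one chain step.

    Returns an adjacency list: node index -> sorted list of neighbour indices.
    """
    edges = {i: [] for i in range(len(positions))}
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            separation = math.dist(positions[i], positions[j])
            if abs(separation - CA_CA_DISTANCE) <= tolerance:
                edges[i].append(j)
                edges[j].append(i)
    return edges


# Made-up residue positions: three in a row and one far away.
positions = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (30.0, 0.0, 0.0)]
graph = build_residue_graph(positions)
print(graph)  # {0: [1], 1: [0, 2], 2: [1], 3: []}
```

The three nearby residues end up connected in a chain, while the distant one stays isolated. A real pipeline needs a more tolerant rule, since predicted positions are noisy and chains pack closely together.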
The GNN refines this initial sketch by integrating several types of information. It considers the cryo-EM density data, the known protein sequence (if available), and established rules about protein geometry. The GNN iteratively adjusts amino acid positions and orientations, improving the fit to the cryo-EM map while adhering to chemical and structural principles. During this refinement it also predicts the amino acid identity at each node.
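The balance between fitting the map and respecting geometry can be illustrated with a much simpler stand-in for the GNN. The Python sketch below is hand-written gradient descent on a toy energy, not a neural network and not ModelAngelo’s method. One term pulls each residue toward a position suggested by the density (the `targets`, which are made up), and a second term keeps neighbouring residues about 3.8 angstroms apart. The weight, step size, and iteration count are arbitrary assumptions.

```python
import math

BOND = 3.8  # typical C-alpha to C-alpha spacing, in angstroms


def energy(pos, targets, weight):
    """Misfit to the density targets plus a penalty on chain-step lengths."""
    fit = sum(math.dist(p, t) ** 2 for p, t in zip(pos, targets))
    geom = sum((math.dist(pos[i], pos[i + 1]) - BOND) ** 2
               for i in range(len(pos) - 1))
    return fit + weight * geom


def gradient(pos, targets, weight):
    """Analytic gradient of energy() with respect to every coordinate."""
    grad = [[2.0 * (p[k] - t[k]) for k in range(3)] for p, t in zip(pos, targets)]
    for i in range(len(pos) - 1):
        d = math.dist(pos[i], pos[i + 1])
        if d == 0.0:
            continue  # direction undefined for coincident points
        scale = 2.0 * weight * (d - BOND) / d
        for k in range(3):
            delta = pos[i + 1][k] - pos[i][k]
            grad[i][k] -= scale * delta
            grad[i + 1][k] += scale * delta
    return grad


def refine(pos, targets, weight=0.5, step=0.05, iterations=200):
    """Repeatedly nudge every residue downhill on the toy energy."""
    pos = [list(p) for p in pos]
    for _ in range(iterations):
        grad = gradient(pos, targets, weight)
        for p, g in zip(pos, grad):
            for k in range(3):
                p[k] -= step * g[k]
    return pos


# Made-up density targets and a perturbed starting chain.
targets = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
start = [(0.4, 0.3, 0.0), (3.5, -0.4, 0.2), (7.9, 0.2, -0.3)]

refined = refine(start, targets)
initial_energy = energy(start, targets, 0.5)
final_energy = energy(refined, targets, 0.5)
print(initial_energy > final_energy)  # True
```

In this toy the two terms agree, so the chain settles onto the targets. The interesting cases in practice are those where the density is ambiguous and the geometric rules decide between alternatives.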
ModelAngelo can build models even when the protein sequence is unknown. For each residue position it predicts a probability for each of the 20 standard amino acids. These probabilities are then used in a hidden Markov model (HMM) search that matches the model against known protein sequences and so identifies the protein without prior sequence information. This capability is particularly useful for studying novel proteins or samples whose composition is not known in advance, expanding the scope of cryo-EM studies.
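A simplified version of this matching step can be written in Python. The sketch below scores candidate sequences against per-position amino acid probabilities and picks the best ungapped placement. It is a deliberate simplification: a real HMM search also handles insertions and deletions and runs against large sequence databases. The probabilities, the candidate names, and the `FLOOR` value used for unlisted residues are all invented for illustration.

```python
import math

FLOOR = 1e-3  # assumed probability for residues not listed at a position


def window_score(profile, window):
    """Log-likelihood of a sequence window under per-position probabilities."""
    return sum(math.log(column.get(residue, FLOOR))
               for column, residue in zip(profile, window))


def best_match(profile, candidates):
    """Return (name, offset, score) of the best ungapped placement."""
    best = None
    width = len(profile)
    for name, sequence in candidates.items():
        # Slide the profile along the sequence; short sequences yield no windows.
        for offset in range(len(sequence) - width + 1):
            score = window_score(profile, sequence[offset:offset + width])
            if best is None or score > best[2]:
                best = (name, offset, score)
    return best


# Made-up predictions for four consecutive residues (one-letter codes).
profile = [
    {"M": 0.7, "L": 0.2},
    {"K": 0.6, "R": 0.3},
    {"V": 0.5, "I": 0.4},
    {"L": 0.8},
]
candidates = {
    "protein_A": "GGMKVLAA",
    "protein_B": "MRILGG",
    "protein_C": "AAAAAA",
}

name, offset, score = best_match(profile, candidates)
print(name, offset)  # protein_A 2
```

Working in log-probabilities turns the product over positions into a sum and avoids numerical underflow on long stretches, which is why profile-based searches are usually scored this way.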
Advancing Structural Biology
By replacing weeks of manual building with an automated run, ModelAngelo lets scientists determine protein structures much faster than manual methods allow. The speed-up is especially valuable in high-throughput studies, where many protein structures need to be analyzed.
The software produces models of similar quality to those built by human experts and can outperform them at identifying proteins whose sequences are unknown, which makes cryo-EM structure determination more objective and efficient. This matters in areas like drug development, where a target protein’s precise atomic structure is the starting point for designing new medicines, and in the study of disease mechanisms, where structures reveal how molecular machines malfunction. By removing much of the manual effort, ModelAngelo also makes cryo-EM more accessible and opens up challenging biological systems that were previously difficult to model by hand.