An artificial intelligence (AI) program has cracked a 50-year-old problem in biology: how to accurately determine a protein's unique three-dimensional folded structure from its amino acid sequence.
The breakthrough was made by AlphaFold, an AI program developed by DeepMind, an AI company owned by Google's parent company, Alphabet. It was announced at the 14th meeting of CASP, the Critical Assessment of protein Structure Prediction, a biennial event at which 100 teams of researchers from around the world present their results on using computational methods to solve protein structures.
A report on the achievement appears in the Nov. 30 issue of the journal Nature.
The goal is to compute the folded structure of particular proteins so that it closely matches the structure determined by "gold standard" experimental methods (X-ray crystallography, cryogenic electron microscopy or nuclear magnetic resonance). These traditional methods have played a central role in the molecular revolution but are time-consuming and require expensive equipment.
The AlphaFold program achieved a median score of 92.4 on the global distance test (GDT), which rates on a scale of 0 to 100 how closely the computed positions of a protein's amino acids match the experimentally determined positions. A score over 90 on this test is considered nearly equivalent to a structure determined by one of the "gold standard" techniques.
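To make the scoring idea concrete, here is a minimal Python sketch of the commonly used GDT_TS variant of the test, which averages the share of amino acid positions that fall within 1, 2, 4 and 8 angstroms of their experimental positions. It is a simplification, not the official CASP procedure: it assumes the two structures have already been superimposed, whereas the real assessment searches for the best superposition at each cutoff. The function name `gdt_ts` and the toy coordinates are hypothetical, chosen only for illustration.

```python
import math

# Distance cutoffs, in angstroms, used by the GDT_TS variant of the score.
CUTOFFS = (1.0, 2.0, 4.0, 8.0)


def gdt_ts(predicted, experimental):
    """Simplified GDT_TS for two already-superimposed structures.

    Each argument is a list of (x, y, z) coordinates, one per residue,
    in the same residue order. Returns a score from 0 to 100.
    """
    if not predicted or len(predicted) != len(experimental):
        raise ValueError("structures must be non-empty and the same length")

    # Distance between each predicted position and its experimental one.
    distances = [math.dist(p, e) for p, e in zip(predicted, experimental)]

    # Fraction of residues that fall within each cutoff.
    fractions = [
        sum(d <= cutoff for d in distances) / len(distances)
        for cutoff in CUTOFFS
    ]

    # Average the four fractions and express the result as a percentage.
    return 100.0 * sum(fractions) / len(CUTOFFS)


# Toy example (made-up coordinates): four residues whose predicted
# positions are off by 0.5, 1.5, 3.0 and 10.0 angstroms.
experimental = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
predicted = [(0.0, 0.5, 0.0), (3.8, 1.5, 0.0), (7.6, 3.0, 0.0), (11.4, 10.0, 0.0)]

print(gdt_ts(predicted, experimental))  # 56.25
```

In this toy case one residue is within 1 angstrom, two are within 2, and three are within both 4 and 8, so the score is the average of 25, 50, 75 and 75 percent. A perfect prediction scores 100.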
Over several months, CASP sends out target proteins to the various research teams. The researchers work on predicting the folding patterns before submitting their results to CASP. An independent group of scientists then reviews the results without knowing the identity of the teams.
Demis Hassabis, the founder and CEO of DeepMind, characterized CASP as the "Olympics of protein folding."
Proteins and protein folding
Proteins are complex molecules composed of chains of amino acids. The unique way the protein is folded determines what the protein can do. As an AlphaFold researcher succinctly put it, "Proteins are the fundamental building blocks that power everything living on this planet."
Being able to predict a protein's structure accurately from its genetic code alone can help scientists understand the role of proteins in diseases, target specific proteins for drug development, or identify proteins that can break down industrial waste.
Protein folding has been a daunting scientific problem for the last 50 years. The AlphaFold blog notes that Christian Anfinsen posed the problem in his acceptance speech for the 1972 Nobel Prize in Chemistry when he hypothesized that a protein's amino acid sequence should fully determine its structure.
The challenge is to find a way to predict a protein's unique 3D folded structure from the vast number of possibilities: 10^300, that's 1 followed by 300 zeros.
The co-founder and chair of CASP, professor John Moult, said of the breakthrough: "We have been stuck on this one problem – how do proteins fold up – for nearly 50 years. To see DeepMind produce a solution for this, having worked personally on this problem for so long and after so many stops and starts, wondering if we’d ever get there, is a very special moment."
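A short back-of-the-envelope calculation in Python gives a sense of why checking every possibility one by one is out of the question. The sampling rate used here, 10^13 structures per second, is a hypothetical figure chosen only for illustration and does not come from the article; the age of the universe is taken as roughly 13.8 billion years.

```python
# Number of possible folded structures quoted in the article.
CONFORMATIONS = 10 ** 300

# Hypothetical rate at which structures could be tried (an assumption).
SAMPLES_PER_SECOND = 10 ** 13

# Approximate age of the universe, about 13.8 billion years, in seconds.
SECONDS_PER_YEAR = 365 * 24 * 60 * 60
AGE_OF_UNIVERSE_SECONDS = 13_800_000_000 * SECONDS_PER_YEAR

# Time needed to try every structure once, in seconds.
search_seconds = CONFORMATIONS // SAMPLES_PER_SECOND

# The same time expressed as a multiple of the age of the universe.
universe_ages = search_seconds // AGE_OF_UNIVERSE_SECONDS

print(len(str(CONFORMATIONS)))  # 301 digits: a 1 followed by 300 zeros
print(len(str(universe_ages)))  # 270 digits
```

Even at that generous rate, an exhaustive search would take a number of universe lifetimes that is itself 270 digits long, which is why a method that predicts the structure directly matters.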
Moult is a computational biologist at the University of Maryland.
What's next for AlphaFold? The group is working on a peer-reviewed paper for publication. It also intends to apply what it has learned to other problems and to share its protein structure predictions with other research groups.