Breakthrough in Protein Structure Prediction

December 21, 2020

Knowing the three-dimensional structure of a particular protein is critical in determining its function and has implications for everything from treating human disease to developing herbicide-resistant plants. Researchers currently use experimental techniques such as X-ray crystallography, nuclear magnetic resonance, or cryo-electron microscopy to establish the structure. However, these techniques are time-consuming and can take years for each protein. To date, the structures of only about 170,000 proteins have been experimentally determined. Being able to accurately predict the three-dimensional structure of a protein based on its amino acid sequence has been a goal of computational biology since the 1960s. The challenge is incredibly difficult, at least in part due to the unfathomably large number of theoretically possible structures for a given sequence. Cyrus Levinthal, one of the early pioneers in the field, noted that attempting to arrive at the correct structure by a random search of all the possibilities would require a time longer than the age of the universe.
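To give a feel for the scale of that random-search estimate, here is a rough back-of-the-envelope sketch in Python. Every number in it is an illustrative assumption rather than a figure from this article: a chain of 100 residues, 3 possible conformations per residue, and a sampling rate of 10^13 conformations per second. The function name `random_search_years` is likewise hypothetical.

```python
# Back-of-the-envelope estimate of an exhaustive random conformational search.
# All default values below are illustrative assumptions, not measured data.

SECONDS_PER_YEAR = 365.25 * 24 * 3600   # about 3.16e7 seconds
AGE_OF_UNIVERSE_YEARS = 1.38e10         # approximate, for comparison only


def random_search_years(n_residues=100, states_per_residue=3,
                        samples_per_second=1e13):
    """Years needed to visit every conformation once at a fixed sampling rate."""
    conformations = states_per_residue ** n_residues  # grows exponentially with length
    seconds = conformations / samples_per_second
    return seconds / SECONDS_PER_YEAR


years = random_search_years()
print(f"Exhaustive search: about {years:.1e} years")
print(f"Ratio to the age of the universe: about {years / AGE_OF_UNIVERSE_YEARS:.1e}")
```

Under these assumptions the search time exceeds the age of the universe by many orders of magnitude, and because the count grows exponentially with chain length, the conclusion holds for any plausible choice of inputs.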

In 1994, the Critical Assessment of Techniques for Protein Structure Prediction (CASP) began as a biennial event to assess the performance of the latest computational methods. The participating groups are provided with the amino acid sequences of proteins whose structures are either soon to be solved experimentally or have already been solved but not yet publicly released. The computational predictions are compared to the experimental results to arrive at a global distance test (GDT) score. A score above 90 on a scale of 0 to 100 is considered on par with the experimental methods. While there were certainly improvements over the years, the highest GDT scores typically hovered around 40, and a true breakthrough in computational prediction still seemed a distant fantasy. That changed with the announcement of the results of this year’s CASP event: DeepMind’s AlphaFold achieved a median GDT score of 92.4.
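For readers curious how a GDT-style score is put together, the following is a simplified sketch, not the official CASP scoring code. It follows the commonly described GDT_TS form: the percentage of residues whose predicted position falls within 1, 2, 4, and 8 ångströms of the experimental position, averaged over those four cutoffs. It assumes the two structures have already been superimposed (the actual evaluation searches over many superpositions), and the function name `gdt_ts` and the toy coordinates are hypothetical.

```python
import math


def gdt_ts(predicted, experimental, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Simplified GDT_TS-style score on a 0-100 scale.

    Both inputs are equal-length lists of (x, y, z) coordinates in angstroms,
    one per residue, ASSUMED to be already superimposed. The real procedure
    also optimizes the superposition, which this sketch omits.
    """
    if not predicted or len(predicted) != len(experimental):
        raise ValueError("coordinate lists must be non-empty and of equal length")
    # Per-residue distance between predicted and experimental positions.
    distances = [math.dist(p, q) for p, q in zip(predicted, experimental)]
    # Fraction of residues within each distance cutoff.
    fractions = [sum(d <= c for d in distances) / len(distances) for c in cutoffs]
    # Average over the cutoffs, scaled to 0-100.
    return 100.0 * sum(fractions) / len(cutoffs)


# Toy four-residue example with made-up coordinates.
experimental = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
predicted = [(0.0, 0.5, 0.0), (3.8, 1.5, 0.0), (7.6, 3.0, 0.0), (11.4, 10.0, 0.0)]
print(gdt_ts(predicted, experimental))  # 56.25
```

In this toy case one residue is within 1 Å, two within 2 Å, and three within both 4 Å and 8 Å, so the score is the average of 25, 50, 75, and 75. A perfect prediction would score 100.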

AlphaFold is a deep learning system trained on the existing experimentally determined protein structures. DeepMind says the main neural network model uses evolutionarily related protein sequences as well as amino acid residue pairs, iteratively passing information between both representations to generate a structure. Now other groups are waiting to learn exactly how AlphaFold works. DeepMind’s press release notes that a peer-reviewed publication is in preparation and that they are exploring how to enable others to use their structure predictions. While there is always room for further improvement and refinement, the performance by AlphaFold at this year’s CASP is a remarkable achievement.

Brian D. Keppler, Ph.D., is a registered Patent Agent in the MVS Biotechnology and Chemical Practice Groups. To learn more, visit our MVS website, or contact Brian directly via email.
