What is the protein folding problem that has left researchers stuck for nearly 50 years?
Knowing the 3D shape of proteins is crucial to our understanding of various diseases and to vaccine development. However, these shapes are fantastically complex and difficult to predict. Researchers have spent years trying to determine the 3D structure of proteins.
Thanks to AI systems like AlphaFold, it is now much easier and faster to predict protein shapes. AlphaFold is currently leading the way in protein folding research and has been described as a “revolution in biology.”
To learn more about the protein folding problem and how AlphaFold was developed to accelerate our understanding of protein structures, keep reading.
What is the “protein folding problem”?
Known as one of the biggest challenges in biology, protein folding is a problem that researchers have been stuck on for nearly 50 years.
Researchers will spend years and even decades trying to determine a protein’s 3D shape from its amino-acid sequence. Some never succeed in figuring it out.
It is so difficult because the 3D shapes of proteins are fantastically complex. Proteins are made up of strings of amino acids, sometimes referred to as the “building blocks of life.”
These strings twist and fold into precise, delicate shapes that wrap around each other. Several folded chains can even combine into larger complexes. The protein’s shape defines exactly what the protein can and cannot do.
The problem is that there is an astronomical number of ways a protein could fold into its 3D structure. This is known as Levinthal’s paradox.
Cyrus Levinthal, a molecular biologist, published a paper in 1969 called “How to Fold Graciously.” He pointed out that an unfolded chain of amino acids has so many degrees of freedom that the molecule has an astronomical number of possible configurations. There are far too many to try one by one, yet real proteins fold quickly, which is the paradox.
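To get a feel for the scale of Levinthal's argument, here is a rough back-of-the-envelope sketch in Python. The numbers are illustrative assumptions, not figures from Levinthal's paper or from this article: a chain of 100 amino acids, 3 possible conformations per amino acid, and a search that tries 10 trillion conformations per second.

```python
# Back-of-the-envelope estimate of Levinthal's paradox.
# All default values are illustrative assumptions, not measured data.
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def levinthal_estimate(n_residues=100, states_per_residue=3,
                       samples_per_second=1e13):
    """Return (number of conformations, years needed to try them all)."""
    # Each residue independently picks one of its states,
    # so the counts multiply: states ** residues.
    conformations = states_per_residue ** n_residues
    years = conformations / samples_per_second / SECONDS_PER_YEAR
    return conformations, years


conformations, years = levinthal_estimate()
print(f"possible conformations: {conformations:.2e}")
print(f"years to try them all:  {years:.2e}")
```

Under these assumptions the chain has roughly 5 x 10^47 conformations, and trying them all would take on the order of 10^27 years, vastly longer than the age of the universe. A real protein, by contrast, folds in a fraction of a second to a few minutes, so it cannot be searching at random.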
There are an estimated 200 million known proteins, with about 30 million new ones discovered each year. Each one has a unique 3D shape which determines how it works.
Over the last 50 years, biologists have determined the exact 3D structure of only a tiny fraction of these known proteins.
Why is protein folding so important?
Researchers are so determined to figure out the 3D shape of proteins because doing so helps us better understand diseases.
More knowledge of how diseases work enables us to develop new medicines and vaccines.
How researchers can use technology to predict protein structures
To tackle the protein folding problem, scientists created a global competition called “Critical Assessment of protein Structure Prediction” (CASP). In this competition, research groups submit computer-based predictions, which are measured and compared against experimentally determined structures. The competition started in 1994 to improve computational methods for accurately predicting a protein’s 3D shape.
This is where DeepMind came in. DeepMind is an AI research lab owned by Alphabet, Google’s parent company. The lab had already made headlines for its deep learning systems AlphaGo and AlphaZero. AlphaGo beat world-leading Go champions, and AlphaZero went on to beat the strongest chess programs.
With protein folding still a big problem, researchers at DeepMind turned their attention away from board games to something that would make a real impact in the world.
That’s when DeepMind went to work creating AlphaFold, a deep learning computer system designed to finally solve the protein folding problem.
How did AlphaFold perform?
In 2018, AlphaFold entered the CASP competition for the first time, and it got off to a great start. AlphaFold managed to achieve the highest score for accurately predicting various protein structures. It scored 60 out of a possible 100 points.
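The “points” here most likely refer to CASP's main accuracy measure, the Global Distance Test (GDT), which runs from 0 to 100; the article itself does not name the metric, so treat that as an assumption. The sketch below is a simplified version of the idea, not CASP's actual scoring code: it assumes the predicted and experimental structures are already superimposed, and it takes one coordinate per amino acid. The function name gdt_ts and the toy coordinates are made up for illustration.

```python
import math


def gdt_ts(predicted, reference, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Simplified GDT-style score in [0, 100].

    predicted, reference: equal-length lists of (x, y, z) positions in
    angstroms, one per amino acid, assumed to be already superimposed.
    (The real measure also searches for the best superposition.)
    """
    if len(predicted) != len(reference) or not reference:
        raise ValueError("structures must be non-empty and of equal length")
    # Distance between each predicted position and its true position.
    distances = [math.dist(p, r) for p, r in zip(predicted, reference)]
    # Fraction of amino acids placed within each distance cutoff.
    fractions = [sum(d <= c for d in distances) / len(distances)
                 for c in cutoffs]
    # Average the four fractions and express as a percentage.
    return 100.0 * sum(fractions) / len(cutoffs)


# Toy example: four amino acids, predicted 0.5, 1.5, 3.0 and 10.0
# angstroms away from their true positions.
reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
predicted = [(0.0, 0.5, 0.0), (3.8, 1.5, 0.0), (7.6, 3.0, 0.0), (11.4, 10.0, 0.0)]
print(gdt_ts(predicted, reference))  # prints 56.25
```

In the toy example, one amino acid is within 1 angstrom, two are within 2, and three are within both 4 and 8, so the score is the average of 25%, 50%, 75%, and 75%. A perfect prediction would score 100.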
While impressive, AlphaFold researchers believed they could do better. They wanted to improve the accuracy and got to work on developing the neural network even further.
DeepMind trained AlphaFold on a data set of about 170,000 known protein structures and supercharged the algorithm. They added data about physics, geometry, and even evolutionary history to the network’s training model.
This enabled the algorithm to seek out and analyze any buried relationships or patterns in the structures. The system managed to determine highly accurate structures in a matter of days or even hours. It could predict a protein’s 3D shape down to the width of an atom.
The turning point for AlphaFold
The turning point came two years later when AlphaFold entered the CASP competition in November 2020. AlphaFold competed against other teams, including ones from Microsoft and Tencent, to predict protein structures that were considered moderately difficult.
The best performance of the other teams was 75 out of 100 points. AlphaFold, however, scored 90 out of 100, a performance so strong that it was called a “revolution in biology.”
This blew researchers away. One researcher who had studied the structure of a protein for ten years saw AlphaFold predict it in just half an hour. He said, “This will change medicine. It will change research. It will change bioengineering. It will change everything.”
Another researcher said, “I nearly fell off my chair when I saw these results.” Someone else commented, “It’s a breakthrough of the first order, certainly one of the most significant results of my lifetime.”
John Moult, a professor who helped to set up the CASP competition, described it as his dreams coming true. He said, “I always hoped I would live to see this day. But it wasn’t always obvious I was going to make it.”
The reaction to AlphaFold’s results shows just how important the protein folding problem has been to researchers. Experts suggest that AlphaFold’s work is not just a big story but one that rivals the discovery of DNA. It will be fascinating to see if DeepMind develops this neural network even further.