Rubani Suri—McMaster Health Sciences 2026
The meaning of the word “protein” changes throughout our lives. As children, we consume protein unknowingly in the form of nuggets or hamburgers. By adolescence, proteins appear on food guides and in biology classes as polypeptides and amino acids. Yet only with the recent development of AlphaFold2, an artificial intelligence system, has it become practical to picture almost any protein as what it really is: a complex three-dimensional network of amino acid residues.
To understand the significance of AlphaFold2, we must first understand the “protein folding problem.” A three-dimensional protein is more than a linear chain of amino acids: each residue carries a side chain, and these side chains interact with one another, changing the conformation of the whole protein. As a result, it is extremely difficult to determine the three-dimensional structure of a protein from its amino acid sequence alone (1).
That’s where AlphaFold2 comes in.
Using the power of Artificial Intelligence (AI), AlphaFold2 builds on the idea behind homology modeling: using evolutionary history to find proteins of known structure whose sequences are similar to that of the “target protein,” and using them to deduce the likely structure of the target (2). Drawing on comprehensive databases, AlphaFold2 predicts the structure of a target protein through the following steps (3):
- The input sequence (the amino acid sequence of the target protein) is provided.
- Multiple Sequence Alignments (MSAs), which line up the target with amino acid sequences that are evolutionarily related to it, are fed into AlphaFold2 so that its predictions can draw on evolutionary relatedness.
- Known structures from protein structure databases that resemble the target are also used as templates for the target protein structure.
- The input sequence is paired with itself in a matrix to produce an array of numbers that represents every possible pair of amino acid residues in the target protein.
- The MSA and pair representations are passed to the “Evoformer” module, which combines this data to analyze relationships between individual amino acids and to gain an understanding of how specific residues are likely to be arranged relative to one another.
- These predicted relationships are then passed to the Structure Module, which builds a geometric model of the protein.
- Within this model, the rotation and position of each amino acid residue are calculated, creating a three-dimensional protein model.
- Side chains are placed by predicting their ‘chi angles’ (torsion angles, that is, angles between intersecting planes) for each residue of the three-dimensional structure.
- The bond lengths and angles are finalized by running the structure through a relaxation step, which removes remaining inconsistencies such as clashing atoms.
- Accuracy is improved by recycling: the prediction is fed back through the network three more times.
- Along with the predicted structure, AlphaFold2 produces two confidence measures, which estimate the expected error of the prediction: a confidence score for each residue and a predicted error for each pair of residues.
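Two of the steps above are easier to picture with a small example: pairing the sequence with itself, and measuring an angle between intersecting planes. The Python sketch below is only an illustration under stated assumptions, not AlphaFold2's actual code. The function names, the four-residue fragment "MVLS", and the atom coordinates are all hypothetical. AlphaFold2 stores learned features for each pair of residues, whereas this toy matrix stores only the two residue letters and their separation along the chain.

```python
import math


def pair_matrix(sequence):
    """Toy pair matrix: entry [i][j] describes residues i and j of the chain.

    Each entry is (residue_i, residue_j, j - i). This tuple is a hypothetical
    stand-in for the learned features a real network would store per pair.
    """
    return [
        [(res_i, res_j, j - i) for j, res_j in enumerate(sequence)]
        for i, res_i in enumerate(sequence)
    ]


def _sub(u, v):
    return tuple(a - b for a, b in zip(u, v))


def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def _cross(u, v):
    return (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )


def dihedral_degrees(p0, p1, p2, p3):
    """Signed angle between the plane (p0, p1, p2) and the plane (p1, p2, p3)."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)  # the central bond shared by both planes
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    axis = tuple(c / norm for c in b1)
    # Remove the components along the central bond, leaving two vectors
    # that lie in the plane perpendicular to it.
    v = _sub(b0, tuple(_dot(b0, axis) * c for c in axis))
    w = _sub(b2, tuple(_dot(b2, axis) * c for c in axis))
    x = _dot(v, w)
    y = _dot(_cross(axis, v), w)
    return math.degrees(math.atan2(y, x))


if __name__ == "__main__":
    pairs = pair_matrix("MVLS")  # hypothetical four-residue fragment
    print(len(pairs), "x", len(pairs[0]), "pairs; entry [0][3] =", pairs[0][3])
    # Made-up coordinates for four consecutive side chain atoms.
    a, b, c, d = (1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 1.0, 1.0)
    print("torsion angle in degrees:", dihedral_degrees(a, b, c, d))
```

For many residues, the first chi angle is measured across the four atoms N, CA, CB and CG, so a real calculation would pass those atoms' coordinates in place of the made-up points used here.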
Figure 1. SOURCE: AlphaFold Protein Structure Database
Figure 1 depicts a structural prediction for a target protein once all the steps above are complete. The protein depicted in Figure 1 is hemoglobin, a globular transport protein found in erythrocytes.
Figure 2. SOURCE: AlphaFold Protein Structure Database
Figure 2 depicts the confidence scores for hemoglobin, as determined in the final step of the process described above.
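To show how such confidence scores can be read, here is a short Python sketch. The AlphaFold Protein Structure Database reports a confidence value for each residue, called pLDDT, on a scale from 0 to 100, and colours its models in four bands: very high (above 90), confident (70 to 90), low (50 to 70) and very low (below 50). The function names and the example scores below are hypothetical, and the handling of values that fall exactly on a boundary is an assumption.

```python
from collections import Counter

# Lower bounds of the top three bands; anything at or below 50 is "very low".
# Boundary handling (strictly greater than) is an assumption of this sketch.
BANDS = [(90.0, "very high"), (70.0, "confident"), (50.0, "low")]
LABELS = ["very high", "confident", "low", "very low"]


def confidence_band(plddt):
    """Return the confidence band for one per-residue score between 0 and 100."""
    if not 0.0 <= plddt <= 100.0:
        raise ValueError("pLDDT must lie between 0 and 100")
    for lower_bound, label in BANDS:
        if plddt > lower_bound:
            return label
    return "very low"


def summarize(scores):
    """Return the fraction of residues that fall into each band."""
    counts = Counter(confidence_band(s) for s in scores)
    return {label: counts[label] / len(scores) for label in LABELS}


if __name__ == "__main__":
    example_scores = [95.0, 92.0, 80.0, 40.0]  # made-up values, not real data
    for label, fraction in summarize(example_scores).items():
        print(f"{label}: {fraction:.0%}")
```

A summary like this gives a quick sense of how much of a predicted model can be trusted before looking at individual regions.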
Although still in its early stages, AlphaFold2 is a technological advancement with the capacity to revolutionize both the pharmaceutical and biochemical worlds. It has been especially groundbreaking for pharmaceutical companies, which are interested in predicting the structure of allosteric sites where small molecules can bind to produce cell responses such as inflammation, itching, and pain. Understanding the structure of these binding sites will allow drug developers to create specific inhibitors that prevent small molecules from binding and triggering a painful response (4). It also opens the door to “structure-based drug design” (4), a technique that is estimated to accelerate the research and development of drugs from “years to months” (4).
In conclusion, because AlphaFold2 protein structure data is publicly accessible, drug development companies have protein information readily available, which can accelerate drug development and improve efficacy. Through its continued success, AlphaFold2 has the potential to revolutionize the pharmacological world and make effective, fast-acting medications accessible around the world.
1. Singh J. The history of the protein folding problem: A seventy year symbiotic relationship between… [Internet]. Medium. Medium; 2020 [cited 2022 Nov 27]. Available from: https://medium.com/@jaguarsingh/the-history-of-the-protein-folding-problem-a-seventy-year-symbiotic-relationship-between
2. David A, Islam S, Tankhilevich E, Sternberg MJE. The AlphaFold database of protein structures: A biologist’s guide [Internet]. Journal of Molecular Biology. Academic Press; 2021 [cited 2022 Nov 27]. Available from: https://www.sciencedirect.com/science/article/pii/S0022283621005738
3. Callaway E. What’s next for AlphaFold and the AI protein-folding revolution [Internet]. Nature News. Nature Publishing Group; 2022 [cited 2022 Nov 27]. Available from: https://www.nature.com/articles/d41586-022-00997-5
4. Mullard A. What does AlphaFold mean for drug discovery? [Internet]. Nature News. Nature Publishing Group; 2021 [cited 2022 Nov 27]. Available from: https://www.nature.com/articles/d41573-021-00161-0