Researchers Teaching Artificial Intelligence about Frustration in Protein Folding

Published: 25 Oct. 2024 Source: Rice University

Scientists have found a new way to predict how proteins change their shape when they function, which is important for understanding how they work in living systems. While recent artificial intelligence (AI) technology has made it possible to predict what proteins look like in their resting state, figuring out how they move is still challenging because there is not enough direct data from experiments on protein motions to train the neural networks. Rice University's Peter Wolynes and his colleagues in China combined information about protein energy landscapes with deep-learning techniques to predict these movements.

Their method improves AlphaFold2 (AF2), a tool that predicts static protein structures, by teaching it to focus on "energetic frustration." Proteins have evolved to minimize energetic conflicts between their parts, so they can be funneled toward their static structure. Starting from predicted static ground-state structures, the new method generates alternative structures and pathways for protein motions by first finding and then progressively enhancing the energetic frustration features in the input multiple sequence alignments, which encode the protein's evolutionary development. The researchers tested their method on the protein adenylate kinase and found that the predicted movements matched experimental data. They also successfully predicted the functional movements of other proteins that change shape significantly. The study also examined how AF2 works, showing that combining physical knowledge of the energy landscape with AI not only helps predict how proteins move but also explains why the AI overpredicts structural integrity, yielding only the most stable structures.

Energy landscape theory suggests that while evolution has sculpted proteins' energy landscapes so that they can fold into their optimal structures, deviations from a perfectly funneled landscape, called local frustration, are essential for the functional movements of proteins. By pinpointing these frustrated regions, the researchers taught the AI to ignore those regions in guiding its predictions, thereby allowing the code to predict alternative protein structures and functional movements accurately. Using a frustration analysis tool developed within the energy landscape framework, the researchers identified frustrated, and therefore flexible, regions in proteins. This research underscores the importance of not abandoning physics-based methods in the post-AlphaFold era, where the emphasis has been on agnostic learning from experimental data without any theoretical input.
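
The idea of making a predictor progressively ignore frustrated regions of an alignment can be illustrated with a small Python sketch. This is not the researchers' code. It assumes the alignment is a list of equal-length strings with the query sequence first, that a per-residue frustration score is already available from some analysis tool, and that "ignoring" a region means replacing its alignment columns with gap characters. The function names `mask_frustrated_columns` and `progressive_masks` are hypothetical.

```python
from typing import Iterator, List, Sequence


def mask_frustrated_columns(
    msa: Sequence[str],
    frustration: Sequence[float],
    n_mask: int,
    mask_char: str = "-",
) -> List[str]:
    """Return a copy of the alignment with the n_mask most frustrated
    columns replaced by mask_char in every row except the query (row 0).

    Assumption: a higher score means a more frustrated position.
    """
    if any(len(row) != len(frustration) for row in msa):
        raise ValueError("every row must have one residue per frustration score")

    # Rank column indices from most to least frustrated.
    ranked = sorted(range(len(frustration)), key=lambda i: frustration[i], reverse=True)
    to_mask = set(ranked[:n_mask])

    masked = []
    for row_index, row in enumerate(msa):
        if row_index == 0:
            # Keep the query sequence intact so the target protein is unchanged.
            masked.append(row)
            continue
        masked.append(
            "".join(mask_char if i in to_mask else ch for i, ch in enumerate(row))
        )
    return masked


def progressive_masks(
    msa: Sequence[str],
    frustration: Sequence[float],
    steps: Sequence[int],
) -> Iterator[List[str]]:
    """Yield alignments with an increasing number of masked columns,
    one alignment per entry in steps."""
    for n_mask in steps:
        yield mask_frustrated_columns(msa, frustration, n_mask)


# Toy data, invented for illustration only.
toy_msa = ["MKVLA", "MRVLG", "MKILA"]
toy_frustration = [0.1, 0.9, 0.2, 0.8, 0.3]
series = list(progressive_masks(toy_msa, toy_frustration, steps=[0, 1, 2]))
```

In a real workflow, each alignment in `series` would be passed to a structure predictor in turn, and the resulting structures compared to look for a pathway away from the ground state. That step is omitted here because it depends on the prediction setup.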