The Center of Molecular and Biomolecular Informatics (CMBI) is leading research on bridging artificial intelligence (AI) and 3D modelling of protein structures. On 3 December 2021, Li Xue et al., theme Cancer development and immune defence, published DeepRank, a deep learning framework for data mining 3D protein-protein structures, in Nature Communications and on GitHub. DeepRank makes deep learning (DL) accessible to a broad range of biochemists and life scientists.
What can it be used for?
DeepRank provides an easy-to-use DL platform to which researchers can bring their own research questions related to 3D protein structures. DeepRank takes 3D protein-protein structures as input and outputs a prediction. What that output represents is defined by the user's research question and training data. For example: How do two proteins interact with each other in 3D space, and where do they interact (i.e., which part of one protein interacts with which part of the other)? Is a mutation disease-causing or not? Is the protein-protein interaction observed in an X-ray experiment a biological interaction or a crystal artefact? And many more.
Image: illustration of DeepRank
Protein functions are encoded in their 3D shapes. Over the past decades, a number of imaging techniques have been developed (e.g., X-ray, NMR, Cryo-EM), and a large number of experimentally determined 3D protein structures have accumulated. A human expert with 30+ years of training could painstakingly examine them by eye, as the spatial arrangement of atom pairs and their interactions sheds light on the secrets of biological life. However, human inspection is not efficient, and the relationship between the 3D structures of diverse proteins and their functions is too intricate for even human experts to master.
The use of quantitative statistical approaches to approximate or simulate human perception was envisioned and pioneered by the psychologist Frank Rosenblatt in 1958. Half a century later, this dream came true when deep learning achieved human-level accuracy in 2D image perception. However, such breakthroughs did not translate naturally to molecular biology.
DeepRank aims to facilitate this translation. By removing the daunting data preprocessing phases for millions of structures, DeepRank allows a user to easily train a 3D convolutional neural network (3D-CNN) to scan protein structures for desired patterns and make predictions. In the paper, the authors showcase the effectiveness of DeepRank on two distinct applications in structural biology.
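To give a feel for the kind of preprocessing such a framework takes off the user's hands, here is a minimal sketch of one common way to turn atoms into 3D-CNN input: spreading a per-atom feature onto a regular 3D grid with Gaussian weights. This is an illustration only and not DeepRank's actual API; the function name `map_atoms_to_grid`, the grid size, the spacing, the Gaussian width, and the toy coordinates and charges are all assumptions made for this example.

```python
import numpy as np


def map_atoms_to_grid(coords, values, center, n_points=10, spacing=1.0, sigma=1.0):
    """Spread per-atom feature values onto a cubic grid using Gaussian weights.

    Hypothetical helper for illustration; not part of any real library.
    """
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    center = np.asarray(center, dtype=float)

    # Grid point positions along one axis, symmetric around zero.
    axis = (np.arange(n_points) - (n_points - 1) / 2.0) * spacing
    gx, gy, gz = np.meshgrid(
        axis + center[0], axis + center[1], axis + center[2], indexing="ij"
    )

    grid = np.zeros((n_points, n_points, n_points))
    for (x, y, z), value in zip(coords, values):
        # Squared distance from this atom to every grid point.
        d2 = (gx - x) ** 2 + (gy - y) ** 2 + (gz - z) ** 2
        grid += value * np.exp(-d2 / (2.0 * sigma**2))
    return grid


# Toy input: two atoms with made-up partial charges (assumed values).
coords = [[0.5, 0.5, 0.5], [-1.5, 0.5, 0.5]]
charges = [1.0, -0.5]
grid = map_atoms_to_grid(coords, charges, center=[0.0, 0.0, 0.0])
print(grid.shape)
```

In a setup like this, each feature (charge, atom type, and so on) would typically become one channel of the 3D volume that the network scans, much as colour channels do in a 2D image.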
Image: Xue's lab logo showing their research theme: AI-boosted 3D modeling
The potential applications are wide-ranging. Xue’s lab is using DeepRank to aid cancer vaccine design. CMBI is extending DeepRank to predict the pathogenicity of human genetic variants. In collaboration with Utrecht University, DeepRank is also being extended with graph neural networks, a trending DL technique that can be applied to, for example, protein interaction networks.
We envision that DeepRank will stimulate community efforts to exploit deep learning for tackling long-standing challenges in the life sciences.
This work is the result of a collaboration with the Netherlands eScience Center and Utrecht University.