DeepPFP: a multi-task-aware architecture for protein function prediction

Document type: Foreign-language journal

First author: Bo, Xiaochen

Authors: Bo, Xiaochen; Xue, Jiguo; Ni, Ming; Wang, Han; Sun, Jinghong; Gao, Jingyang; Ren, Zilin; Chen, Yongbing

Author affiliations:

Keywords: protein function prediction; SARS-CoV-2; deep learning; meta learning

Journal: BRIEFINGS IN BIOINFORMATICS (Impact factor: 7.7; five-year impact factor: 8.7)

ISSN: 1467-5463

Year, volume, issue: 2025, vol. 26, no. 1

Pages:

Indexed in: SCI

Abstract: Deriving protein function from protein sequences poses a significant challenge due to the intricate relationship between sequence and function. Deep learning has made remarkable strides in predicting sequence-function relationships. However, models tailored for specific tasks or protein types encounter difficulties when using transfer learning across domains. This is attributed to the fact that protein function relies heavily on structural characteristics rather than mere sequence information. Consequently, there is a pressing need for a model capable of capturing shared features among diverse sequence-function mapping tasks to address the generalization issue. In this study, we explore the potential of Model-Agnostic Meta-Learning combined with a protein language model called Evolutionary Scale Modeling to tackle this challenge. Our approach involves training the architecture on five out-of-domain deep mutational scanning (DMS) datasets and evaluating its performance across four key dimensions. Our findings demonstrate that the proposed architecture exhibits satisfactory performance in terms of generalization and employs an effective few-shot learning strategy. Specifically, compared to the best results, the Pearson's correlation coefficient (PCC) in the final stage increased by ~0.31%. Furthermore, we leverage the trained architecture to predict binding affinity scores of the DMS dataset of SARS-CoV-2 using transfer learning. Notably, training on a subset of the Ube4b dataset with 500 samples resulted in an improvement of 0.11 in the PCC. These results underscore the potential of our conceptual architecture as a promising methodology for multi-task protein function prediction.
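
The abstract reports its results as Pearson's correlation coefficient (PCC) between predicted and measured scores. As a reminder of what that metric measures, here is a minimal pure-Python sketch. The function name `pearson_correlation` and the sample values are hypothetical illustrations and are not taken from the paper or its code.

```python
import math


def pearson_correlation(xs, ys):
    """Return Pearson's correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("PCC is undefined when either sequence is constant")
    return cov / math.sqrt(var_x * var_y)


# Made-up scores, for illustration only (not data from the study).
measured = [0.10, 0.40, 0.35, 0.80]
predicted = [0.20, 0.50, 0.30, 0.90]
print(pearson_correlation(measured, predicted))
```

A PCC of 1 means the predictions are a perfect increasing linear function of the measurements, and 0 means no linear relationship, so the reported gain of 0.11 is an absolute change on this scale from -1 to 1.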

Classification number:
