DualF-PBR: Dual-Extracting Protein Sequence Features for Predicting Plant Resistance Proteins

Document type: Foreign-language journal

First author: Fang, Hui

Authors: Fang, Hui; Chen, Danyang; Tang, Chunyan; Zhong, Cheng; Li, Min

Author affiliations:

Keywords: Feature extraction; Immune system; Protein sequence; Encoding; Neural networks; Amino acids; Data mining; Vectors; Pathogens; Bioinformatics; Plant resistance protein; Prediction; extracting features; disease-resistant breeding

Journal: IEEE TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS

ISSN:

Year, volume, issue: 2025, vol. 22, no. 4

Pages:

Indexed in: SCI

Abstract: Plant resistance proteins have evolved during growth and development to cope with complex environmental changes and pathogen infection. Predicting plant resistance proteins is of great significance for further exploring the mechanisms of plant disease resistance against viruses. In this paper, we propose a method for predicting plant resistance proteins by dual feature extraction. The dual-extracted features combine features obtained from a self-attention neural network model with features obtained by detecting sequence structure information, yielding 2381-dimensional protein sequence features. We use the Least Absolute Shrinkage and Selection Operator (LASSO) algorithm to eliminate redundant features from the extracted 2381-dimensional features, leaving 53 key features. These 53 key features are fed into a Light Gradient Boosting Machine (LightGBM) model to predict plant resistance proteins. Experimental results of five-fold cross-validation on real datasets demonstrate that the proposed method outperforms existing methods overall in accuracy, sensitivity, specificity, Matthews correlation coefficient, F1 score, and area under the curve (AUC) on slightly imbalanced datasets. This work will aid in screening plant resistance genes and proteins and will promote disease-resistant breeding for plants.
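The threshold-based metrics named in the abstract (accuracy, sensitivity, specificity, Matthews correlation coefficient, and F1 score) all follow from the binary confusion matrix. The sketch below is a minimal illustration using their standard definitions, not the authors' evaluation code. The function name `binary_metrics` and the toy labels are hypothetical, and class 1 is assumed to denote a resistance protein. AUC is omitted because it requires continuous prediction scores rather than hard labels.

```python
import math


def binary_metrics(y_true, y_pred):
    """Compute standard binary classification metrics from 0/1 labels.

    Hypothetical helper for illustration; label 1 is assumed to be the
    positive class (a plant resistance protein).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # recall on positives
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # recall on negatives
    precision = tp / (tp + fp) if (tp + fp) else 0.0

    # F1 is the harmonic mean of precision and sensitivity.
    pr_sum = precision + sensitivity
    f1 = 2 * precision * sensitivity / pr_sum if pr_sum else 0.0

    # MCC uses all four cells, so it stays informative under class imbalance.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
        "mcc": mcc,
    }


# Toy, slightly imbalanced example (made-up labels, not data from the paper).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
metrics = binary_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Reporting MCC alongside accuracy is a common choice for imbalanced data, since accuracy alone can look high when a classifier favors the majority class.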

Classification number:
