Informative gene selection and the direct classification of tumors based on relative simplicity

Document type: Foreign-language journal

First author: Yuan, Zheming

Authors: Yuan, Zheming;Chen, Yuan;Li, Lanzhi;Zhang, Hongyan;Yuan, Zheming;Wang, Lifeng

Author affiliations:

Keywords: Microarray expression data;Gene selection;Direct classify;Relative simplicity;Binary-discriminative informative genes;Paired votes

Journal: BMC BIOINFORMATICS (impact factor: 3.169; five-year impact factor: 3.629)

ISSN: 1471-2105

Year/volume/issue: 2016, vol. 17

Pages:

Indexed in: SCI

Abstract: Background: Selecting a parsimonious set of informative genes to build a classifier that generalizes well is the most important task in the analysis of tumor microarray expression data. Many existing gene-pair evaluation methods use only one strategy, either vertical comparison or horizontal comparison, and therefore cannot highlight the diverse patterns of gene pairs, while individual-gene-ranking methods ignore redundancy and synergy among genes. Results: Here we propose a novel score measure named relative simplicity (RS). We evaluated gene pairs by integrating vertical comparison with horizontal comparison, and then built an RS-based direct classifier (RS-based DC) on a set of informative genes capable of binary discrimination, using a paired-votes strategy. Nine multi-class gene expression datasets involving human cancers were used to validate the performance of the new method. Compared with nine reference models, RS-based DC achieved the highest average independent test accuracy (91.40 %), the best generalization performance, and the smallest average number of informative genes (20.56). Compared with four reference feature selection methods, RS also achieved the highest average test accuracy with three classifiers (Naive Bayes, k-Nearest Neighbor and Support Vector Machine), and only RS improved the performance of SVM. Conclusions: Diverse patterns of gene pairs can be highlighted more fully when vertical comparison is integrated with horizontal comparison. The DC core classifier can effectively control over-fitting. The RS-based feature selection method combined with the DC classifier can lead to more robust selection of informative genes and more robust classification accuracy.

Classification number:
