A hybrid variable selection strategy based on continuous shrinkage of variable space in multivariate calibration
Document type: Foreign-language journal
Authors: Yun, Yong-Huan 1 ; Bin, Jun 3 ; Liu, Dong-Li 1 ; Xu, Lin 2 ; Yan, Ting-Liang 2 ; Cao, Dong-Sheng 4 ; Xu, Qing-Song 5
Author affiliations: 1.Hainan Univ, Coll Food Sci & Technol, Haikou 570228, Hainan, Peoples R China
2.Chinese Acad Trop Agr Sci, Inst Environm & Plant Protect, Haikou 571101, Hainan, Peoples R China
3.Guizhou Univ, Coll Tobacco Sci, Guiyang 550025, Guizhou, Peoples R China
4.Cent S Univ, Xiangya Sch Pharmaceut Sci, Changsha 410013, Hunan, Peoples R China
5.Cent S Univ, Sch Math & Stat, Changsha 410083, Hunan, Peoples R China
Keywords: Variable selection; Near-infrared spectroscopy; Multivariate calibration; Variable combination population analysis; Iteratively retains informative variables; Genetic algorithm
Journal: ANALYTICA CHIMICA ACTA (Impact factor: 6.558; 5-year impact factor: 6.228)
ISSN: 0003-2670
Year/volume/issue: 2019, vol. 1058
Pages:
Indexed in: SCI
Abstract: When analyzing high-dimensional near-infrared (NIR) spectral datasets, variable selection is critical to improving models' predictive abilities. However, some methods have limitations when dealing with a large number of variables, such as a high risk of overfitting, long run times, or heavy computational demands. In this study, we propose a hybrid variable selection strategy based on the continuous shrinkage of variable space, which is the core idea of variable combination population analysis (VCPA). In the first step, the VCPA-based hybrid strategy continuously shrinks the variable space from large to small and optimizes it using a modified VCPA. In the second step, it employs iteratively retaining informative variables (IRIV) and a genetic algorithm (GA) to carry out further optimization. It takes full advantage of VCPA, GA, and IRIV, and compensates for their drawbacks when faced with large numbers of variables. Three NIR datasets and three variable selection methods, including two widely used methods (competitive adaptive reweighted sampling, CARS, and genetic algorithm-interval partial least squares, GA-iPLS) and one hybrid method (variable importance in projection coupled with genetic algorithm, VIP-GA), were used to investigate the improvement offered by the VCPA-based hybrid strategy. The results show that VCPA-GA and VCPA-IRIV significantly improve the model's prediction performance compared with the other methods, indicating that the modified VCPA step is a very efficient way to filter out uninformative variables and that the VCPA-based hybrid strategy is a promising strategy for variable selection in NIR. The MATLAB source code of VCPA-GA and VCPA-IRIV can be freely downloaded from: https://cn.mathworks.com/matlabcentral/profile/authors/5526470-yonghuan-yun. (C) 2019 Elsevier B.V. All rights reserved.
- Related literature
Other papers by the authors
-
Molecular Cloning and Functional Analysis of ScHAK10 Gene Promoter from Sugarcane (Saccharum officinarum L.)
Authors: Luo, Hai-Bin; Huang, Cheng-Mei; Cao, Hui-Qing; Wu, Xing-Jian; Ye, Li-Ping; Wei, Yuan-Wen; Xu, Lin; Wu, Kai-Chao; Deng, Zhi-Nian; Yi, Xiao-Ping
Keywords: Saccharum officinarum L.; ScHAK10 promoter; Abiotic stress; Promoter analysis; Cis-acting element
-
Environmental damages, cumulative exergy demand, and economic assessment of Panus giganteus farming with the application of solar technology
Authors: Cheng, Hanting; Zhou, Xiaohui; Yang, Yang; Xu, Lin; Ding, Ye; Yan, Tinglian; Li, Qinfen
Keywords: Life-cycle assessment; Cumulative exergy demand; Economic analysis; Mushroom; Photovoltaic
-
Integrative analysis of genome and transcriptome reveal the genetic basis of high temperature tolerance in pleurotus giganteus (Berk. Karun & Hyde)
Authors: Yang, Yang; Pian, Yongru; Li, Jingyi; Xu, Lin; Li, Qinfen; Dai, Yueting; Lu, Zhu
Keywords: Zhudugu; High temperature stress; Genome; Transcriptome analysis; Heat shock protein; Heat signal transduction; qPCR
-
Potential of Near-Infrared Spectroscopy (NIRS) for Efficient Classification Based on Postharvest Storage Time, Cultivar and Maturity in Coconut Water
Authors: Shen, Xiaojun; Li, Xin; Deng, Fuming; Niu, Xiaoqing; Wang, Yuanyuan; Kan, Jintao; Wei, Jingyi; Chen, Fusheng; Wang, Tao; Zhang, Weimin; Yun, Yong-Huan
Keywords: Cocos nucifera L; liquid endosperm; dwarfs; non-destructive analysis; discrimination
-
Revealing informative metabolites with random variable combination based on model population analysis for metabolomics data
Authors: Yun, Yong-Huan; Zhang, Jiachao; Chen, Haiming; Chen, Wenxue; Zhong, Qiuping; Zhang, Weimin; Chen, Weijun
Keywords: Metabolomics; Variable selection; Biomarker discovery; Informative metabolites; Variable combination; Model population analysis
-
Three-step hybrid strategy towards efficiently selecting variables in multivariate calibration of near-infrared spectra
Authors: Yu, Hai-Dong; Yun, Yong-Huan; Zhang, Weimin; Chen, Haiming; Liu, Dongli; Zhong, Qiuping; Chen, Wenxue; Chen, Weijun
Keywords: Variable selection; Near-infrared spectra; Multivariate calibration; Hybrid strategy; Variable space
-
Comparative Metabolomic Analysis of Dendrobium officinale under Different Cultivation Substrates
Authors: Zuo, Si-Min; Yu, Hai-Dong; Zhang, Weimin; Zhong, Qiuping; Chen, Wenxue; Chen, Weijun; Yun, Yong-Huan; Chen, Haiming
Keywords: Dendrobium officinale; metabolomics; differential metabolites; cultivation substrates