Multi-random ensemble on Partial Least Squares regression to predict wheat yield and its losses across water and nitrogen stress with hyperspectral remote sensing

Document type: Foreign-language journal

First author: Mao, Bohan

Authors: Mao, Bohan;Sun, Xiaoxiao;Li, Hao;Cheng, Qian;Duan, Fuyi;Li, Yafeng;Li, Zongpeng;Zhai, Weiguang;Ding, Fan;Chen, Zhen;Chen, Li

Author affiliations:

Keywords: Hyperspectral; Machine learning; Model transfer; Yield prediction; Breeding

Journal: COMPUTERS AND ELECTRONICS IN AGRICULTURE (impact factor: 7.7; five-year impact factor: 8.4)

ISSN: 0168-1699

Year/volume/issue: 2024, vol. 222

Pages:

Indexed in: SCI

Abstract: Integrating regression techniques with remote sensing has proved to be an advantageous approach for estimating crop yield in various plant species. This study collected canopy hyperspectral data at multiple growth stages under water and nitrogen stress conditions, combined them with machine learning to predict wheat yield, and evaluated how model performance degraded across stress conditions. The performance of the Random Forest Regression (RFR), Partial Least Squares Regression (PLSR), and Multi-random Ensemble on PLSR (MRE-PLSR) algorithms was quantified using the Pearson correlation coefficient (PCC) and mean absolute error (MAE). Each dataset of canopy hyperspectral data and yield was paired with a dataset from the same location and growth stage but under different stress conditions, forming a combination for validating model performance. Across all combinations, PLSR predicted more accurately than RFR, and MRE-PLSR further improved PCC by an average of 14.5% compared to PLSR. In the combinations where the wheat growth environments differed the most between the training and testing sets, MRE-PLSR improved PCC significantly, by up to 37.5%. Without a fixed random seed, the algorithm was run 100 times on different computers, and performance remained stable across all combinations, supporting the replicability of this study. The study then validated the transferability of MRE-PLSR. One dataset was designated as the target dataset, and a small number of transfer samples were randomly extracted from another dataset from the same region. These samples were used to update a model trained on a mixture of two datasets from another region. The results indicate that the updated model fits the measured yield better than the original model from the other location, with an average reduction of 37 t/hm² in MAE. The proposed method provides a promising solution for predicting wheat yield and its losses.
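
The two evaluation metrics named in the abstract, PCC and MAE, can be illustrated with a minimal sketch. This is not the authors' code: the function names `pearson_correlation` and `mean_absolute_error` and the yield values are hypothetical, chosen only to show how the two numbers are obtained from measured and predicted yields.

```python
import math


def pearson_correlation(measured, predicted):
    """Pearson correlation coefficient (PCC) of two equal-length sequences."""
    if len(measured) != len(predicted) or len(measured) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    n = len(measured)
    mean_m = sum(measured) / n
    mean_p = sum(predicted) / n
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((m - mean_m) * (p - mean_p) for m, p in zip(measured, predicted))
    var_m = sum((m - mean_m) ** 2 for m in measured)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_m == 0 or var_p == 0:
        raise ValueError("PCC is undefined when either sequence is constant")
    return cov / math.sqrt(var_m * var_p)


def mean_absolute_error(measured, predicted):
    """Mean absolute error (MAE), in the same unit as the inputs."""
    if len(measured) != len(predicted) or len(measured) == 0:
        raise ValueError("need two equal-length, non-empty sequences")
    return sum(abs(m - p) for m, p in zip(measured, predicted)) / len(measured)


# Made-up yields in t/hm², for illustration only (not data from the paper).
measured = [5.0, 6.0, 7.0, 8.0]
predicted = [5.5, 5.5, 7.5, 7.5]

pcc = pearson_correlation(measured, predicted)
mae = mean_absolute_error(measured, predicted)
print(f"PCC = {pcc:.3f}, MAE = {mae:.2f} t/hm²")
```

A higher PCC means predictions track the ranking and spread of measured yields more closely, while a lower MAE means smaller average deviation in yield units, so the two metrics are complementary rather than interchangeable.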

Classification number:
