High-performance prediction of soil organic carbon using automatic hyperparameter optimization method in the Yellow River Delta of China

Document type: Foreign-language journal article

First author: Song, Yingqiang

Authors: Song, Yingqiang; Wang, Feng; Yang, Weihao; Liang, Ruilin; Zhan, Dexi; Xiang, Meiyan; Yang, Xiaohang; Xu, Rui; Lu, Miao

Author affiliations:

Keywords: Hyperparameter; Machine learning; Deep learning; Soil organic carbon; Farmland

Journal: COMPUTERS AND ELECTRONICS IN AGRICULTURE (Impact factor: 8.9; five-year impact factor: 9.3)

ISSN: 0168-1699

Year and volume: 2025, vol. 236

Pages:

Indexed in: SCI

Abstract: Using machine learning (ML) and deep learning (DL) models to predict the spatial variability of soil organic carbon (SOC) is crucial for advancing carbon emission reduction strategies. However, inadequate hyperparameter tuning remains a key limitation, reducing model fitting performance and prediction accuracy. Notably, high-performance models enabled by automatic hyperparameter optimization (AHPO) represent a novel approach to explaining the complex relationships between environmental factors and SOC. In this study, we analyzed the prediction performance of ML models, such as gradient boosting decision tree (GBDT) and extreme gradient boosting (XGB), and DL models, including deep forest (DF) and convolutional neural network (CNN). These models were optimized using nature-inspired algorithms (grey wolf optimization (GWO) and hunter-prey optimization (HPO)) and mathematical-approximation algorithms (Bayesian optimization (BO) and tree-structured Parzen estimator (TPE)). Furthermore, we derived the linear and nonlinear driving effects of environmental factors (soil, vegetation, texture, climate, and terrain) on SOC. We also identified direct and indirect response pathways using SHapley Additive exPlanations (SHAP), variogram decomposition (VD), hierarchical partitioning (HP), and structural equation model (SEM). Our results show that prediction models optimized with mathematical-approximation algorithms, such as BO-DF (R² = 0.76) and TPE-DF (R² = 0.82), demonstrated the strongest nonlinear fitting ability between environmental factors and SOC. AHPO algorithms significantly improved the prediction performance of DL models, with R² values for the four optimization methods increasing from 0.72 to 0.82. The generalization verification results indicate that the TPE-optimized model demonstrates strong robustness and achieves the highest accuracy (R² > 0.7) for SOC prediction.
The AHPO prediction model's hyperparameter combination achieves a balance between similarity and distinctiveness, where key performance-determining hyperparameters exhibit significant variation (i.e., non-similarity), enabling high-performance SOC predictions. The spatial mapping using the TPE-DF model revealed that areas with high SOC content are primarily concentrated in the southern and northeastern regions of the study area. Moreover, when the model's prediction accuracy (R²) exceeds 0.75, SHAP analysis identifies SoilAN, SoilAP, SoilAK, TMP, and PRE as the most influential environmental factors driving nonlinear changes in SOC. Similarly, VD and HP analyses highlight a synergistic linear contribution of soil and climate factors, accounting for 99.1% of the variability in SOC. Interestingly, the path analysis further indicates that regional climate warming leads to surface soil desiccation and salinization, which significantly alters the SOC decomposition environment. High salt stress negatively affects microorganisms and crop root activity, ultimately enhancing SOC accumulation in surface soil. Overall, AHPO-empowered ML and DL methods exhibit strong feasibility for analyzing the response relationship between environmental factors and SOC. Therefore, these methods provide robust support for high-performance and high-precision SOC monitoring across spatial scales.
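
Illustrative note: the abstract names the tree-structured Parzen estimator (TPE) as the best-performing optimizer but does not describe its mechanics. The following is a minimal one-dimensional sketch of the general TPE idea, not the authors' implementation. It assumes a fixed Gaussian kernel width, a single continuous hyperparameter, and a hypothetical `toy_loss` function standing in for a model's validation error; the names `kde`, `tpe_minimize`, and `toy_loss` are invented for this example.

```python
import math
import random


def kde(x, centers, bandwidth):
    """Parzen (Gaussian kernel) density estimate at x built from the given centers."""
    norm = bandwidth * math.sqrt(2.0 * math.pi)
    total = sum(math.exp(-0.5 * ((x - c) / bandwidth) ** 2) for c in centers)
    return total / (len(centers) * norm)


def tpe_minimize(objective, low, high, n_trials=60, n_startup=10,
                 gamma=0.25, n_candidates=24, seed=0):
    """Toy TPE loop: returns (best_loss, best_x) after n_trials evaluations."""
    rng = random.Random(seed)
    bandwidth = (high - low) / 10.0  # fixed kernel width (a simplifying assumption)
    trials = []  # list of (loss, x) pairs observed so far

    for i in range(n_trials):
        if i < n_startup:
            # Warm-up phase: plain random search.
            x = rng.uniform(low, high)
        else:
            # Split past trials into "good" (lowest losses) and "bad" (the rest).
            ranked = sorted(trials)
            n_good = max(1, int(gamma * len(ranked)))
            good = [v for _, v in ranked[:n_good]]
            bad = [v for _, v in ranked[n_good:]]

            # Draw candidates from the "good" density l(x), clipped to the bounds.
            candidates = [
                min(high, max(low, rng.gauss(rng.choice(good), bandwidth)))
                for _ in range(n_candidates)
            ]

            # Keep the candidate with the largest ratio l(x) / g(x).
            x = max(
                candidates,
                key=lambda c: kde(c, good, bandwidth) / (kde(c, bad, bandwidth) + 1e-12),
            )
        trials.append((objective(x), x))

    return min(trials)


def toy_loss(x):
    """Hypothetical stand-in for a validation loss; minimum at x = 2."""
    return (x - 2.0) ** 2


best_loss, best_x = tpe_minimize(toy_loss, low=-5.0, high=5.0)
print(f"best x = {best_x:.3f}, loss = {best_loss:.4f}")
```

In practice, a real SOC model would replace `toy_loss` with a cross-validated error over several hyperparameters, and an established library would normally be used rather than a hand-written loop.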

Classification number:
