Predicting Energy-Based CO2 Emissions in the United States Using Machine Learning: A Path Toward Mitigating Climate Change

Document type: Foreign-language journal article

First author: Tian, Longfei

Authors: Tian, Longfei; Zhang, Zhen; Xie, Yinghui; He, Zhiru; Yuan, Chen; Jing, Ran; Zhang, Kun

Author affiliations:

Keywords: climate change; carbon dioxide emissions; life cycle assessment; decision tree; random forest; multiple linear regression; K-nearest neighbors; gradient boosting; support vector regression

Journal: SUSTAINABILITY (impact factor: 3.3; five-year impact factor: 3.6)

ISSN:

Year, volume, issue: 2025, Vol. 17, Issue 7

Pages:

Indexed in: SCI

Abstract: Climate change is one of the most pressing global challenges, threatening ecosystems, human populations, and weather patterns over time. It is primarily driven by human activities such as fossil fuel combustion for energy production, and its impacts include rising sea levels and soil salinization. The resulting greenhouse gas (GHG) emissions, particularly carbon dioxide (CO2) emissions, amplify the greenhouse effect and accelerate global warming, underscoring the urgent need for effective mitigation strategies. This study investigates the performance and outcomes of various machine learning regression models for predicting CO2 emissions. Decision tree, random forest, multiple linear regression, k-nearest neighbors (KNN), gradient boosting, and support vector regression (SVR) models were compared using R2, mean absolute error, mean squared error, root-mean-squared error, and cross-validation scores. The biggest source of CO2 emissions was coal (46.11%), followed by natural gas (25.49%) and electricity (26.70%). Random forest and gradient boosting both performed well, but multiple linear regression had the highest prediction accuracy among the machine learning models (R2 = 0.98 training, 0.99 testing). SVR and KNN demonstrated lower accuracies, whereas the decision tree displayed overfitting. Sensitivity analysis showed that the decision tree, random forest, multiple linear regression, and gradient boosting models were extremely sensitive to coal, natural gas, and petroleum (transportation sector). Random forest and gradient boosting were the most sensitive to coal usage, whereas KNN and SVR maintained excellent R2 scores (0.94-0.98) but were less sensitive to changes in the input variables. This analysis provides insights into the agreement and discrepancies between predicted and actual CO2 emissions, highlighting the models' effectiveness and potential limitations.

Classification number:
