Comparison of simulated PM2.5 concentration for spatio-temporal variations and their simulation efficiency by multiple models

臧加伟; 王情; 高祥伟; 许怀悦

doi:10.13421/j.cnki.hjwsxzz.2023.01.003

Abstract:
Objective: To compare the spatio-temporal variations and simulation performance of PM2.5 concentration datasets produced by modelling approaches commonly used in China and internationally.
Methods: Nine nationwide PM2.5 concentration simulation datasets that were published or shared by Chinese and international researchers between 2013 and 2020 were collected. The spatial and temporal distribution patterns of the nine datasets were compared using statistical analysis and ArcGIS mapping. PyCharm was used to run a regression evaluation of the four datasets simulated by daily-value models.
Results: The models differed in the level and range of simulated values in some local areas, but their overall spatial distributions were similar: concentrations were higher in the central and eastern regions and lower in the west. Except for the GBD dataset, the PM2.5 concentrations in the other eight datasets showed an overall decreasing trend, and all eight shared the same seasonal pattern: highest in winter, intermediate in spring and autumn, and lowest in summer. Among the daily-value models, the random forest model performed best, with R² = 0.76 and a relatively low root mean square error (RMSE, 21.96). Among the monthly-value models, the space-time extremely randomized trees model performed best, with R² = 0.98 and a relatively low RMSE (3.26).
Conclusion: The PM2.5 concentrations simulated by the different models show similar spatio-temporal distributions. Nonlinear machine learning models outperform atmospheric chemistry models and linear regression models. Future work could combine the strengths of nonlinear and ensemble machine learning models and use an ensemble model to simulate PM2.5 concentrations, which may further improve the spatio-temporal resolution and simulation performance.
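
The abstract reports R² and RMSE but does not state how they were computed. As a minimal sketch, assuming R² is the coefficient of determination between observed and simulated concentrations (the authors may instead have used the squared correlation from a fitted regression line), the two metrics can be written as below. The function names and sample values are hypothetical and are not taken from the study.

```python
import math


def rmse(observed, simulated):
    """Root mean square error between paired observed and simulated values."""
    if len(observed) != len(simulated) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - s) ** 2 for o, s in zip(observed, simulated)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r_squared(observed, simulated):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    if len(observed) != len(simulated) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_observed = sum(observed) / len(observed)
    ss_res = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    ss_tot = sum((o - mean_observed) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values have zero variance")
    return 1.0 - ss_res / ss_tot


# Hypothetical paired values, for illustration only.
station_observed = [10.0, 20.0, 30.0, 40.0]
model_simulated = [12.0, 18.0, 33.0, 39.0]

print(f"RMSE = {rmse(station_observed, model_simulated):.2f}")
print(f"R2   = {r_squared(station_observed, model_simulated):.3f}")
```

RMSE is in the same units as the concentrations themselves, so RMSE values are directly comparable only between models evaluated at the same temporal resolution, which is why the daily and monthly models are ranked separately.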
