LIU Jie, HAO Shu-xin, WAN Hong-yan, LIU Yue, XU Dong-qun. Comparison of three machine learning models for air quality level prediction: a case study of Baoding[J]. Journal of Environmental Hygiene, 2024, 14(3): 264-269,272. DOI: 10.13421/j.cnki.hjwsxzz.2024.03.013

    Comparison of three machine learning models for air quality level prediction: a case study of Baoding

    • Objective To construct models that predict the air quality level in Baoding, China for each of the next three days using the support vector machine (SVM), random forest (RF), and multilayer perceptron (MLP) independently, and to select the optimal model of the three by tuning parameters and comparing the prediction results.
      Methods Based on the daily average concentration monitoring data of air pollutants and concurrent meteorological data in Baoding from 2014 to 2022, SVM, RF, and MLP models were constructed to forecast the air quality level for each of the next three days using the data of the previous four days, and the importance of feature variables was assessed. The model parameters were fine-tuned, and 10-fold cross-validation was performed. The performance of the models was evaluated using indicators including the accuracy rate and the area under the curve (AUC).
      Results For the SVM model, the accuracy rates for the next three days were 69.8%, 63.5%, and 62.3%, respectively, and the AUC values were 0.774, 0.708, and 0.707, respectively. For the RF model, the accuracy rates for the next three days were 75.9%, 68.2%, and 67.1%, respectively, and the AUC values were 0.84, 0.74, and 0.72, respectively. For the MLP model, the accuracy rates for the next three days were 73.2%, 66.4%, and 65.7%, respectively, and the AUC values were 0.83, 0.74, and 0.73, respectively. The RF model showed the best overall performance. The importance of air quality feature variables was higher than that of meteorological feature variables.
      Conclusion Through comparison, the RF machine learning model can effectively predict the air pollution level for the next day and provide early warnings of air quality levels.
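
      The data preparation described in Methods (four previous days as input, each of the next three days as a target, 10-fold cross-validation, accuracy as one indicator) can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the function names `make_windows`, `k_fold_indices`, and `accuracy` are hypothetical, the feature layout is assumed, and the abstract does not state whether the folds were shuffled, so contiguous folds are used here only for simplicity.

```python
from typing import List, Sequence, Tuple


def make_windows(
    features: Sequence[Sequence[float]],
    levels: Sequence[int],
    lookback: int = 4,   # previous four days, as stated in Methods
    horizon: int = 3,    # next three days, as stated in Methods
) -> Tuple[List[List[float]], List[List[int]]]:
    """Pair each flattened `lookback`-day feature window with the
    air quality levels of the following `horizon` days."""
    X, Y = [], []
    for t in range(lookback, len(features) - horizon + 1):
        # Concatenate the daily feature vectors of days t-lookback .. t-1.
        window = [v for day in features[t - lookback:t] for v in day]
        X.append(window)
        # Targets: levels on days t, t+1, ..., t+horizon-1.
        Y.append(list(levels[t:t + horizon]))
    return X, Y


def k_fold_indices(n: int, k: int = 10) -> List[Tuple[List[int], List[int]]]:
    """Split range(n) into k contiguous, near-equal folds and return
    (train_indices, test_indices) for each fold. Assumption: no shuffling."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        folds.append((train, test))
        start += size
    return folds


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of days whose predicted level equals the observed level."""
    return sum(a == b for a, b in zip(y_true, y_pred)) / len(y_true)
```

      Under these assumptions, one classifier per forecast day would be trained on `X` against the matching column of `Y`, and `accuracy` would be averaged over the ten test folds; SVM, RF, and MLP implementations would come from whichever library is chosen.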