Objective To establish an asthma risk prediction model by integrating four machine learning algorithms, and to provide a basis for health-related weather forecasting services and public protective measures.
Methods Daily medical records of asthma patients from 2012 to 2018 were collected from a Grade A tertiary hospital in Tianjin, together with meteorological, environmental, and pollen data for the same period. Principal component analysis was used to select the optimal predictive factors, and the Stacking ensemble learning method was used to integrate four machine learning algorithms: Decision Tree, Random Forest, XGBoost, and LightGBM. Model performance was then optimized by tuning the risk-level threshold and the time lag and by accounting for seasonality.
Results Among the four sub-models, Random Forest predicted better than Decision Tree, XGBoost, and LightGBM. Integrating the four sub-models improved forecasting ability for the easy occurrence and multiple occurrence risk grades by about 13% compared with the Random Forest model alone. Applying a time lag of 2-3 days and building separate models for different seasons further improved the model's predictive ability.
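How a time lag and seasonal modeling enter the training data can be shown with a small sketch in plain Python. The helper names, the toy pollen and visit series, and the month-based (meteorological, Northern Hemisphere) season definition are all hypothetical and are not taken from the study.

```python
from datetime import date, timedelta

# Assumed month-to-season mapping (meteorological seasons).
SEASON_BY_MONTH = {
    12: "winter", 1: "winter", 2: "winter",
    3: "spring", 4: "spring", 5: "spring",
    6: "summer", 7: "summer", 8: "summer",
    9: "autumn", 10: "autumn", 11: "autumn",
}


def lagged_pairs(exposure, visits, lag_days):
    """Pair each day's visit count with the exposure seen lag_days earlier."""
    pairs = []
    for day, count in sorted(visits.items()):
        earlier = day - timedelta(days=lag_days)
        if earlier in exposure:  # drop days whose lagged exposure is missing
            pairs.append((day, exposure[earlier], count))
    return pairs


def split_by_season(pairs):
    """Group (exposure, visits) samples by the season of the visit day."""
    groups = {}
    for day, x, y in pairs:
        groups.setdefault(SEASON_BY_MONTH[day.month], []).append((x, y))
    return groups


# Hypothetical toy series: ten days spanning the winter/spring boundary.
start = date(2018, 2, 24)
pollen = {start + timedelta(days=i): 10 * i for i in range(10)}
visits = {start + timedelta(days=i): 5 + i for i in range(10)}

pairs = lagged_pairs(pollen, visits, lag_days=2)
by_season = split_by_season(pairs)
print({season: len(samples) for season, samples in by_season.items()})
```

Each seasonal group could then be passed to its own model, and repeating the procedure for several candidate lags is one simple way to compare lag lengths.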
Conclusion A multi-model integration method that comprehensively considers meteorological, environmental, and pollen factors can be applied to operational meteorological forecasting and services for asthma.