A Visibility-Based Historical PM2.5 Estimation for Four Decades (1981–2022) Using Machine Learning in Thailand: Trends, Meteorological Normalization, and Influencing Factors Using SHAP Analysis

Bibliographic Details
Main Authors: Nishit Aman, Sirima Panyametheekul, Ittipol Pawarmart, Sumridh Sudhibrabha, Kasemsan Manomaiphiboon
Format: Article
Language: English
Published: Springer 2025-03-01
Series: Aerosol and Air Quality Research
Online Access: https://doi.org/10.1007/s44408-025-00007-z
Description
Summary:
Introduction: PM2.5 pollution is a significant environmental and health concern in Thailand, with levels intensifying during the dry season. However, the lack of long-term PM2.5 data limits understanding of historical trends and meteorological influences.
Objective: This study aims to reconstruct historical PM2.5 data from 1981 to 2022 and to analyze the influence of various contributing factors across six key provinces in Thailand: Chiang Mai (CM), Lampang (LP), Khon Kaen (KK), Bangkok (BK), Chonburi (CB), and Songkhla (SK).
Methods: A Light Gradient Boosting Machine (LightGBM) model was developed using meteorological and aerosol-related variables from the Thai Meteorological Department and MERRA-2. The model was trained on PM2.5 data spanning 2012–2022, depending on availability for each province. Model performance was evaluated at diurnal, monthly, and annual scales, and the model was then used to reconstruct historical PM2.5 data. SHAP analysis was used to identify the predictor variables most important to the PM2.5 predictions.
Results: The LightGBM model accurately predicted PM2.5 across all provinces, performing better for daily prediction than for hourly prediction. Model accuracy was higher during clean hours than during haze hours. Observed and predicted PM2.5 agreed well at diurnal, monthly, and annual time scales. CM shows a non-significant PM2.5 trend, limiting insights into meteorological effects, while LP exhibits significant decreases in both PM2.5 and PM2.5_emis, indicating positive weather impacts on air quality. In contrast, provinces such as KK, BK, and CB display worsening meteorological influences, with non-significant or increasing PM2.5 trends despite declines in PM2.5_emis. In SK, removing meteorological effects reveals a decreasing PM2.5 trend, underscoring the critical role of meteorology. SHAP analysis identified visibility, gridded PM2.5, and specific humidity at 2 m as common and important predictor variables across all provinces, along with additional variables that differed from province to province.
Conclusion: The LightGBM model effectively reconstructs historical PM2.5 levels and provides insight into meteorological influences on air quality. Policy implications based on the findings are also provided.
ISSN: 1680-8584, 2071-1409