This study used four individual machine learning (ML) models (random forest, adaptive boosting, gradient boosting, and extreme gradient boosting) and a stacked ensemble model (SEM) to estimate PM2.5 over Greater Bangkok (GBK) during the dry seasons of 2018-2022. Aerosol optical depth (AOD) from the Fengyun-4A satellite was the main predictor variable. The other predictor variables were meteorological variables, fire hotspots, vegetation index, terrain elevation, and population density. Surface PM2.5 from 17 air quality monitoring stations was used for model development and evaluation. Satellite AOD agrees reasonably well with AOD from two AERONET stations in the study area in terms of correlation coefficient (r), mean bias (MB), mean error (ME), and root mean square error (RMSE). Among the individual models, adaptive boosting performed best, with r = 0.75, MB = 0.55 µg m⁻³, ME = 9.1 µg m⁻³, and RMSE = 12.9 µg m⁻³. The SEM, which combines all four individual models, outperformed every one of them, with r = 0.84, zero MB, ME = 7.2 µg m⁻³, and RMSE = 10.4 µg m⁻³. In two additional cases, haze hours and clean hours, the SEM is best overall, while adaptive boosting is superior to the other individual ML models. Model performance is lower for haze hours, suggesting that elevated PM2.5 is difficult to predict. The SEM was thus chosen to map PM2.5 and exposure intensity over GBK. Every model reproduces the observed diurnal and monthly patterns well. PM2.5 tends to be relatively high at 08-10 LT and declines in later hours, corresponding to higher traffic emissions in the morning and to daytime meteorological conditions more favorable to the dilution of air pollutants, respectively. PM2.5 intensifies in the winter but decreases in March and April.
During these two months, areas outside Bangkok tend to have higher PM2.5 than areas within Bangkok, possibly because of active summertime biomass burning in those areas, which are less urbanized and have more agricultural land. Relatively high exposure intensity is confined to Bangkok, likely because of its much denser population. The findings indicate significant potential for leveraging Fengyun-4A satellite data and ML to advance space-based air quality monitoring for Thailand and other data-scarce regions in Southeast Asia. A satellite-based PM2.5 dataset could support the formulation of effective air quality management strategies in GBK.
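
The four evaluation statistics quoted above (r, MB, ME, RMSE) can be illustrated with a minimal sketch. It is not the authors' code: the function name `evaluation_metrics` and the sample values are hypothetical, and ME is assumed here to mean the mean absolute error, which the abstract does not state explicitly.

```python
import math


def evaluation_metrics(observed, predicted):
    """Return r, MB, ME, and RMSE for paired observed/predicted values.

    Assumption: ME is taken as the mean absolute error; MB is the mean of
    (predicted - observed), so a positive MB means overestimation.
    """
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length series with at least 2 values")

    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]

    mb = sum(errors) / n                               # mean bias
    me = sum(abs(e) for e in errors) / n               # mean (absolute) error
    rmse = math.sqrt(sum(e * e for e in errors) / n)   # root mean square error

    # Pearson correlation coefficient between observations and predictions
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_o == 0 or var_p == 0:
        raise ValueError("correlation is undefined for a constant series")
    r = cov / math.sqrt(var_o * var_p)

    return {"r": r, "MB": mb, "ME": me, "RMSE": rmse}


# Hypothetical hourly PM2.5 values in µg m⁻³, for illustration only
obs = [10.0, 20.0, 30.0, 40.0]
pred = [12.0, 18.0, 33.0, 37.0]
print(evaluation_metrics(obs, pred))
```

In this toy case the over- and underestimates cancel, so MB is zero while ME and RMSE are not. This is why a model with zero MB, as reported for the SEM, can still carry non-zero ME and RMSE.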