Predictive Model with Machine Learning for Environmental Variables and PM<sub>2.5</sub> in Huachac, Junín, Perú

Bibliographic Details
Main Authors: Emery Olarte, Jhonatan Gutierrez, Gwayne Roque, Juan J. Soria, Hugo Fernandez, Jackson Edgardo Pérez Carpio, Orlando Poma
Format: Article
Language: English
Published: MDPI AG 2025-03-01
Series: Atmosphere
Online Access: https://www.mdpi.com/2073-4433/16/3/323
Description
Summary: PM<sub>2.5</sub> pollution is increasing and causes health problems. The objective of this study was to model the behavior of PM<sub>2.5</sub>AQI (air quality index) using four machine learning (ML) predictive models: linear regression, lasso, ridge, and elastic net. A total of 16,543 records from the Huachac area of Junín, Peru, were used, with humidity (%) and temperature (°C) as regressors. Methods: Exploratory data analysis (EDA) and ML predictive models were applied. Results: PM<sub>2.5</sub>AQI is high in winter and spring, with averages of 52.6 and 36.9, respectively, and low in summer; the maximum occurs in September (spring) and the minimum in February (summer). The regression models yielded the metrics used to choose the best model for predicting PM<sub>2.5</sub>AQI. Comparison with other research highlights the robustness of the chosen ML models and underlines the potential of ML for PM<sub>2.5</sub>AQI prediction. Conclusions: The selected predictive model has <i>α</i> = 0.1111111 and <i>λ</i> = 0.150025 and is given by PM<sub>2.5</sub>AQI = 83.0846522 − 10.302222000 (Humidity) − 0.1268124 (Temperature). The model has an adjusted R<sup>2</sup> of 0.1483206 and an RMSE of 25.36203, and it supports decision making on environmental care.
ISSN: 2073-4433
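
The equation reported in the summary can be evaluated directly. The following is a minimal sketch, not the authors' code: the three coefficients are copied verbatim from the summary, while the function names `predict_pm25_aqi` and `rmse` are hypothetical. The summary does not say whether humidity and temperature were rescaled before fitting, so the inputs are assumed to be on whatever scale the authors used.

```python
import math

# Coefficients copied verbatim from the summary (not re-estimated here).
INTERCEPT = 83.0846522
COEF_HUMIDITY = -10.302222000
COEF_TEMPERATURE = -0.1268124


def predict_pm25_aqi(humidity: float, temperature: float) -> float:
    """Evaluate the reported linear equation for one observation.

    Assumption: the inputs are on the same scale the authors used when
    fitting; the summary does not state whether they were standardized.
    """
    return INTERCEPT + COEF_HUMIDITY * humidity + COEF_TEMPERATURE * temperature


def rmse(observed, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


if __name__ == "__main__":
    # Illustrative inputs only; these are not values from the study.
    print(predict_pm25_aqi(humidity=1.0, temperature=10.0))
```

Given a set of observed PM<sub>2.5</sub>AQI values and the corresponding predictions, `rmse` computes the same kind of error metric that the summary reports as 25.36203; reproducing that figure would require the original 16,543 records.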