A Hybrid Machine Learning Approach for High-Accuracy Energy Consumption Prediction Using Indoor Environmental Quality Sensors
| Main Authors: | , , , , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG 2025-08-01 |
| Series: | Energies |
| Subjects: | |
| Online Access: | https://www.mdpi.com/1996-1073/18/15/4164 |
| Summary: | Accurate forecasting of energy consumption in buildings is essential for achieving energy efficiency and reducing carbon emissions. However, many existing models rely on limited input variables and overlook the complex influence of indoor environmental quality (IEQ). In this study, we assess the performance of hybrid machine learning ensembles for predicting hourly energy demand in a smart office environment using high-frequency IEQ sensor data. Environmental variables including carbon dioxide concentration (CO<sub>2</sub>), particulate matter (PM<sub>2.5</sub>), total volatile organic compounds (TVOCs), noise levels, humidity, and temperature were recorded over a four-month period. We evaluated two ensemble configurations combining support vector regression (SVR) with either Random Forest or LightGBM as base learners and Ridge regression as a meta-learner, alongside single-model baselines such as SVR and artificial neural networks (ANN). The SVR combined with Random Forest and Ridge regression demonstrated the highest predictive performance, achieving a mean absolute error (MAE) of 1.20, a mean absolute percentage error (MAPE) of 8.92%, and a coefficient of determination (R<sup>2</sup>) of 0.82. Feature importance analysis using SHAP values, together with non-parametric statistical testing, identified TVOCs, humidity, and PM<sub>2.5</sub> as the most influential predictors of energy use. These findings highlight the value of integrating high-resolution IEQ data into predictive frameworks and demonstrate that such data can significantly improve forecasting accuracy. This effect is attributed to the direct link between these IEQ variables and the activation of energy-intensive systems; fluctuations in humidity drive HVAC energy use for dehumidification, while elevated pollutant levels (TVOCs, PM<sub>2.5</sub>) trigger increased ventilation to maintain indoor air quality, thus raising the total energy load. |
|---|---|
| ISSN: | 1996-1073 |
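
The stacked configuration described in the summary (SVR and Random Forest as base learners, Ridge regression as meta-learner, scored with MAE, MAPE, and R<sup>2</sup>) can be sketched with scikit-learn as follows. This is a minimal illustration, not the authors' implementation: the data are synthetic placeholders generated on the spot, and the feature ranges, coefficients, hyperparameters, and train/test split are all assumptions made for the example, so the resulting scores say nothing about the study's results.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor, StackingRegressor
from sklearn.linear_model import Ridge
from sklearn.metrics import (
    mean_absolute_error,
    mean_absolute_percentage_error,
    r2_score,
)
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic stand-in for hourly IEQ readings (NOT the study's data).
rng = np.random.default_rng(0)
n = 400
co2 = rng.normal(600, 100, n)       # CO2 concentration, assumed ppm
pm25 = rng.uniform(5, 35, n)        # PM2.5, assumed ug/m^3
tvoc = rng.uniform(100, 600, n)     # TVOCs, assumed ppb
noise_db = rng.normal(50, 5, n)     # noise level, assumed dB
humidity = rng.uniform(30, 70, n)   # relative humidity, assumed %
temp = rng.normal(23, 1.5, n)       # temperature, assumed deg C
X = np.column_stack([co2, pm25, tvoc, noise_db, humidity, temp])

# Hypothetical energy signal driven by TVOCs, humidity, and PM2.5,
# with invented coefficients chosen only to keep the target positive.
y = 5 + 0.01 * tvoc + 0.1 * humidity + 0.1 * pm25 + rng.normal(0, 0.5, n)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Base learners: SVR (scaled inputs) and Random Forest.
# Meta-learner: Ridge regression fitted on out-of-fold base predictions.
stack = StackingRegressor(
    estimators=[
        ("svr", make_pipeline(StandardScaler(), SVR())),
        ("rf", RandomForestRegressor(n_estimators=100, random_state=0)),
    ],
    final_estimator=Ridge(alpha=1.0),
    cv=5,
)
stack.fit(X_train, y_train)
y_pred = stack.predict(X_test)

mae = mean_absolute_error(y_test, y_pred)
mape = mean_absolute_percentage_error(y_test, y_pred)  # fraction, not percent
r2 = r2_score(y_test, y_pred)
print(f"MAE={mae:.2f}  MAPE={100 * mape:.2f}%  R2={r2:.2f}")
```

`StackingRegressor` trains the meta-learner on cross-validated predictions of the base learners, which keeps the Ridge model from simply memorising base-learner fits on the training set. Swapping the `"rf"` entry for a LightGBM regressor would give the second configuration mentioned in the summary, assuming the `lightgbm` package is installed.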