Examination of empirical and Machine Learning methods for regression of missing or invalid solar radiation data using routine meteorological data as predictors
Sensors are prone to malfunction, leading to blank or erroneous measurements that cannot be ignored in most practical applications. Therefore, data users are always looking for efficient methods to substitute missing values with accurate estimations. Traditionally, empirical methods have been used f...
Main Authors: | Konstantinos X Soulis, Evangelos E Nikitakis, Aikaterini N Katsogiannou, Dionissios P Kalivas |
---|---|
Format: | Article |
Language: | English |
Published: | AIMS Press, 2024-12-01 |
Series: | AIMS Geosciences |
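Subjects: | machine learning; solar radiation; recurrent neural networks; random forest; empirical methods; regression |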
Online Access: | https://www.aimspress.com/article/doi/10.3934/geosci.2024044 |
_version_ | 1832590241352908800 |
---|---|
author | Konstantinos X Soulis Evangelos E Nikitakis Aikaterini N Katsogiannou Dionissios P Kalivas |
collection | DOAJ |
description | Sensors are prone to malfunction, leading to blank or erroneous measurements that cannot be ignored in most practical applications. Data users therefore need efficient methods for substituting missing values with accurate estimates. Traditionally, empirical methods have been used for this purpose, but with the increasing accessibility and effectiveness of Machine Learning (ML) methods, it is plausible that the former will be replaced by the latter. In this study, we aimed to provide some insight into this question, using the network of meteorological stations installed and operated by the GIS Research Unit of the Agricultural University of Athens in Nemea, Greece, as a test site for the estimation of daily average solar radiation. Routine weather parameters from ten stations over a period spanning 1,548 days were collected, curated, and used for the training, calibration, and validation of different iterations of two empirical equations and three iterations each of Random Forest (RF) and Recurrent Neural Network (RNN) models. The results indicated that although ML methods, and especially RNNs, are in general more accurate than their empirical counterparts, the investment in technical knowledge, time, and processing capacity that their implementation requires means they are not a panacea; the choice of the best method is therefore case-specific. Future research directions could include the examination of more location-specific models or the integration of readily available spatiotemporal indicators to increase model generalization. |
format | Article |
id | doaj-art-281aaa82fc074d498200bbbfa7c94a19 |
institution | Kabale University |
issn | 2471-2132 |
language | English |
publishDate | 2024-12-01 |
publisher | AIMS Press |
record_format | Article |
series | AIMS Geosciences |
spelling | doaj-art-281aaa82fc074d498200bbbfa7c94a19; 2025-01-24T01:13:56Z; eng; AIMS Press; AIMS Geosciences; ISSN 2471-2132; 2024-12-01; vol. 10, no. 4, pp. 939-964; DOI 10.3934/geosci.2024044; Konstantinos X Soulis, Evangelos E Nikitakis, Aikaterini N Katsogiannou, Dionissios P Kalivas (all authors: GIS Research Unit, Laboratory of Soil Science and Agricultural Chemistry, Department of Natural Resources Management and Agricultural Engineering, Agricultural University of Athens, Iera Odos 75, Athens, 11855, Greece); https://www.aimspress.com/article/doi/10.3934/geosci.2024044 |
title | Examination of empirical and Machine Learning methods for regression of missing or invalid solar radiation data using routine meteorological data as predictors |
topic | machine learning solar radiation recurrent neural networks random forest empirical methods regression |
url | https://www.aimspress.com/article/doi/10.3934/geosci.2024044 |