Examination of empirical and Machine Learning methods for regression of missing or invalid solar radiation data using routine meteorological data as predictors

Sensors are prone to malfunction, leading to blank or erroneous measurements that cannot be ignored in most practical applications. Therefore, data users are always looking for efficient methods to substitute missing values with accurate estimations. Traditionally, empirical methods have been used f...

Bibliographic Details
Main Authors: Konstantinos X Soulis, Evangelos E Nikitakis, Aikaterini N Katsogiannou, Dionissios P Kalivas
Format: Article
Language: English
Published: AIMS Press 2024-12-01
Series: AIMS Geosciences
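Subjects: machine learning, solar radiation, recurrent neural networks, random forest, empirical methods, regression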
Online Access: https://www.aimspress.com/article/doi/10.3934/geosci.2024044
author Konstantinos X Soulis
Evangelos E Nikitakis
Aikaterini N Katsogiannou
Dionissios P Kalivas
collection DOAJ
description Sensors are prone to malfunction, producing blank or erroneous measurements that cannot be ignored in most practical applications. Data users therefore need efficient methods for replacing missing values with accurate estimates. Traditionally, empirical methods have been used for this purpose, but with the increasing accessibility and effectiveness of Machine Learning (ML) methods, it is plausible that the former will be replaced by the latter. In this study, we aimed to provide some insight into this question, using the network of meteorological stations installed and operated by the GIS Research Unit of the Agricultural University of Athens in Nemea, Greece, as a test site for estimating daily average solar radiation. Routine weather parameters from ten stations over a period of 1,548 days were collected, curated, and used to train, calibrate, and validate different iterations of two empirical equations and three iterations each of Random Forest (RF) and Recurrent Neural Network (RNN) models. The results indicated that while ML methods, and especially RNNs, are in general more accurate than their empirical counterparts, the investment in technical knowledge, time, and processing capacity that their implementation requires means they are not a panacea; the choice of the best method therefore depends on the case. Future research directions could include the examination of more location-specific models or the integration of readily available spatiotemporal indicators to increase model generalization.
format Article
id doaj-art-281aaa82fc074d498200bbbfa7c94a19
institution Kabale University
issn 2471-2132
language English
publishDate 2024-12-01
publisher AIMS Press
record_format Article
series AIMS Geosciences
spelling doaj-art-281aaa82fc074d498200bbbfa7c94a19
2025-01-24T01:13:56Z
eng
AIMS Press
AIMS Geosciences
2471-2132
2024-12-01
volume 10, issue 4, pages 939-964
10.3934/geosci.2024044
Affiliation (all four authors): GIS Research Unit, Laboratory of Soil Science and Agricultural Chemistry, Department of Natural Resources Management and Agricultural Engineering, Agricultural University of Athens, Iera Odos 75, Athens, 11855, Greece
title Examination of empirical and Machine Learning methods for regression of missing or invalid solar radiation data using routine meteorological data as predictors
topic machine learning
solar radiation
recurrent neural networks
random forest
empirical methods
regression
url https://www.aimspress.com/article/doi/10.3934/geosci.2024044