Automated curation of spatial metadata in environmental monitoring data

Bibliographic Details
Main Authors: İlhan Mutlu, Jörg Hackermüller, Jana Schor
Format: Article
Language: English
Published: Elsevier 2025-05-01
Series: Ecological Informatics
Online Access: http://www.sciencedirect.com/science/article/pii/S1574954125000470
Description
Summary: Spatial data accuracy in environmental monitoring is crucial for practical large-scale data analytics and the development of AI models. In this context, spatial data is metadata and faces the same challenges as any other metadata, such as missing values, false or contradictory information, formatting problems in text and numbers, and the use of different languages. These issues severely limit the usability of the data. With this study, we provide an automatic approach, CleanGeoStreamR, to resolve as many of these issues as possible for a spatially annotated environmental monitoring database. We substantially increased the quality of the spatial metadata and, therefore, the quantity of data points that can be used in large-scale data analytics and AI applications. Further, our goal is to raise awareness of the issues related to spatial metadata and to promote the implementation of our concepts in other environmental monitoring data sources. A better understanding of these issues and the availability of automatic approaches like the presented method will substantially contribute to making environmental monitoring data FAIR and enhance its usability in the era of Big Data and AI.
ISSN: 1574-9541