Hourly surface nitrogen dioxide retrieval from GEMS tropospheric vertical column densities: benefit of using time-contiguous input features for machine learning models

Bibliographic Details
Main Authors: J. Gödeke, A. Richter, K. Lange, P. Maaß, H. Hong, H. Lee, J. Park
Format: Article
Language: English
Published: Copernicus Publications 2025-08-01
Series: Atmospheric Measurement Techniques
Online Access: https://amt.copernicus.org/articles/18/3747/2025/amt-18-3747-2025.pdf
Description
Summary: Launched in 2020, the Korean Geostationary Environmental Monitoring Spectrometer (GEMS) is the first geostationary satellite mission for observing trace gas concentrations in the Earth's atmosphere. Observations are made over Asia. Geostationary orbits allow for hourly measurements, which give a much higher temporal resolution than the daily measurements taken from low-Earth orbits, such as those of the TROPOspheric Monitoring Instrument (TROPOMI) or the Ozone Monitoring Instrument (OMI). This work estimates the hourly concentration of surface nitrogen dioxide (NO₂) from GEMS tropospheric NO₂ vertical column densities (VCDs) and additional meteorological features, which serve as inputs for random forests and linear regression models. With several measurements per day, machine learning models can use not only current observations but also those from previous hours as inputs. We demonstrate that using these time-contiguous inputs leads to reliable improvements in all considered performance measures, such as Pearson correlation or mean square error. For random forests, the average performance gains are between 4.5 % and 7.5 %, depending on the performance measure. For linear regression models, average performance gains are between 7 % and 15 %. For performance evaluation, spatial cross-validation with surface in situ measurements is used to measure how well the trained models perform at locations from which they have not received any training data. In other words, we inspect the models' ability to generalize to unseen locations. Additionally, we investigate the influence of tropospheric NO₂ VCDs on the performance. The region of our study is South Korea.
ISSN: 1867-1381, 1867-8548
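Illustration: the summary describes two ideas that a short sketch may make more concrete, namely feeding a model the current observation together with those from previous hours (time-contiguous inputs) and evaluating it with spatial cross-validation, where whole stations are held out. The following is a minimal sketch under stated assumptions, not the authors' implementation: the record layout, the function names `add_lagged_features` and `spatial_folds`, the integer hour index, and all numbers in `demo` are hypothetical and chosen only for illustration.

```python
def add_lagged_features(records, n_lags=1):
    """Build time-contiguous feature vectors [vcd(t), vcd(t-1), ..., vcd(t-n_lags)].

    `records` is a list of dicts with the (hypothetical) keys
    "station", "hour" (integer hour index), "vcd" and "surface_no2".
    A sample is kept only if every one of the previous `n_lags` hours
    is available for the same station.
    """
    by_key = {(r["station"], r["hour"]): r for r in records}
    samples = []
    for r in records:
        window = [by_key.get((r["station"], r["hour"] - k)) for k in range(n_lags + 1)]
        if any(w is None for w in window):
            continue  # a gap in the hourly sequence: skip this sample
        samples.append({
            "station": r["station"],
            "hour": r["hour"],
            "features": [w["vcd"] for w in window],
            "target": r["surface_no2"],
        })
    return samples


def spatial_folds(samples, n_folds=3):
    """Yield (train, test) splits in which no station appears on both sides."""
    stations = sorted({s["station"] for s in samples})
    for i in range(n_folds):
        held_out = set(stations[i::n_folds])
        train = [s for s in samples if s["station"] not in held_out]
        test = [s for s in samples if s["station"] in held_out]
        yield train, test


# Made-up data: three stations, four hourly records each,
# with hour 1 missing at station "C".
demo = [
    {"station": s, "hour": h, "vcd": 1.0 + h + i, "surface_no2": 10.0 + 2 * h}
    for i, s in enumerate(["A", "B", "C"])
    for h in range(4)
    if not (s == "C" and h == 1)
]

samples = add_lagged_features(demo, n_lags=1)
for train, test in spatial_folds(samples, n_folds=3):
    held_out = sorted({s["station"] for s in test})
    print(f"held-out stations {held_out}: {len(train)} train / {len(test)} test samples")
```

Any regressor, for example a random forest or a linear model, could then be fitted on the `features` of each `train` split and scored on the corresponding `test` split. Splitting by station rather than by individual sample is what makes the score reflect performance at unseen locations.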