Assessing feature importance for forecasting soil moisture in subarctic regions using gridded historical and forecasted climate data

Bibliographic Details
Main Authors: Mojtaba Saboori, Kedar Surendranath Ghag, Anandharuban Panchanathan, Epari Ritesh Patro, Ali Torabi Haghighi
Format: Article
Language: English
Published: Elsevier 2025-06-01
Series: Geoderma
Online Access: http://www.sciencedirect.com/science/article/pii/S0016706125001429
Description
Summary: Continuous monitoring of soil moisture (SM) is essential in precision agriculture for effective irrigation management. However, SM forecasting in subarctic environments remains relatively unexplored. In this study, we forecast SM at a 30-centimeter soil depth over a 7-day period using a Random Forest (RF) model. Two scenarios were evaluated: (a) relying solely on historical data (HIST), and (b) using forecasted environmental data along with recent SM measurements to predict SM levels iteratively, integrating next-day forecasts with current SM data (FORENV). The input features included daily gridded climate data (air temperature, Tair; relative humidity, RH; wind speed, WS; precipitation, P; and reference evapotranspiration, ET0), soil-vegetation (SV) features (gridded soil temperature, Tsoil, and Normalized Difference Vegetation Index, NDVI), and lagged SM values. These data were gathered from six sites under different land covers in a subarctic region (Tyrnava, Finland) over approximately two growing seasons (July or August 2022–September 2023), yielding about 430 daily observations per site. The analysis showed that FORENV outperformed HIST for up to four days into the forecast horizon, highlighting the value of including forecasted variables for improved accuracy during these initial lead times. Longer lead times proved more site-dependent, influenced by the stability of historical SM correlations. Pearson correlation and RF-based stepwise forward feature selection revealed that using only lagged SM data, or combining it with SV features, yielded the most accurate forecasts. For instance, at t + 7 and across all case studies combined, models incorporating LaggedSM_SV achieved the lowest RMSE (0.019 m³ m⁻³) and highest R² (0.67), followed by All_inputs (RMSE: 0.022 m³ m⁻³, R²: 0.61), and LaggedSM (RMSE: 0.025 m³ m⁻³, R²: 0.46).
Daily P and RH exhibited consistently low correlations with subsurface SM, likely due to near-saturated soil conditions in many subarctic sites that buffer infiltration and reduce immediate sensitivity to these parameters. Overall, our results demonstrate that robust SM forecasts can be achieved even with limited data, making this approach particularly valuable in subarctic regions with near-saturated soil conditions or other areas where climate and soil-vegetation data may be sparse.
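The iterative FORENV scheme described in the summary, where each next-day prediction is fed back as the newest lagged SM value, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the three-day lag window, and the toy one-step model are all hypothetical, and a simple drying rule stands in for a trained Random Forest. The R² shown is the coefficient of determination, which is an assumption about the metric reported in the record.

```python
import math
from typing import Callable, Dict, List, Sequence


def recursive_forecast(
    one_step_model: Callable[[List[float], Dict[str, float]], float],
    recent_sm: Sequence[float],
    forecast_env: Sequence[Dict[str, float]],
    n_lags: int = 3,  # hypothetical lag window, not taken from the article
) -> List[float]:
    """Roll a one-step-ahead model forward over the forecast horizon."""
    history = list(recent_sm[-n_lags:])
    predictions = []
    for env_next_day in forecast_env:
        lags = history[-n_lags:]
        sm_next = one_step_model(lags, env_next_day)
        predictions.append(sm_next)
        history.append(sm_next)  # the prediction becomes the newest lag
    return predictions


def toy_model(lags: List[float], env: Dict[str, float]) -> float:
    """Hypothetical stand-in for a trained RF: persistence minus ET0 drying."""
    return lags[-1] - 0.001 * env["et0"]


def rmse(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean squared error, in the units of the inputs."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r2(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination (assumed definition of R²)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Seven days of made-up forecasted ET0 (mm/day) and three recent SM readings.
env_forecast = [{"et0": 2.0} for _ in range(7)]
sm_forecast = recursive_forecast(toy_model, [0.30, 0.31, 0.32], env_forecast)
```

In practice the `one_step_model` argument would wrap a fitted regressor's prediction call. Because every step reuses earlier predictions, errors can accumulate with lead time, which is one plausible reason an iterative scheme may lose its advantage at longer horizons.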
ISSN: 1872-6259