Integrating Genetic Algorithm and Geographically Weighted Approaches into Machine Learning Improves Soil pH Prediction in China

Bibliographic Details
Main Authors: Wantao Zhang, Jingyi Ji, Binbin Li, Xiao Deng, Mingxiang Xu
Format: Article
Language: English
Published: MDPI AG 2025-03-01
Series: Remote Sensing
Online Access: https://www.mdpi.com/2072-4292/17/6/1086
Description
Summary: Accurate soil pH prediction is critical for soil management and environmental protection. Machine learning (ML) models are widely used to predict soil pH, but they often do not fully account for the spatial heterogeneity of the relationship between soil and environmental variables, which limits their predictive capability, especially over large regions with complex soil landscapes. To address this, the study used soil pH data from 4335 surface soil sampling points (0–20 cm) from the China Soil System Survey, combined with multi-source environmental covariates. It integrates Geographically Weighted Regression (GWR) with three ML models (Random Forest, Cubist, and XGBoost) to develop three geographically weighted machine learning models optimized by Genetic Algorithms. Compared with GWR and the traditional ML models, the R² of the geographically weighted random forest (GWRF), geographically weighted Cubist (GWCubist), and geographically weighted extreme gradient boosting (GWXGBoost) models increased by 1.98% to 14.29%, while the RMSE decreased by 1.81% to 11.98%. Among the three models, GWRF performed best and effectively reduced uncertainty in soil pH mapping. Mean Annual Precipitation and the Normalized Difference Vegetation Index were the two key environmental variables for predicting soil pH, and both had a significant negative effect on its spatial distribution. These findings provide a scientific basis for soil health management and for large-scale soil modeling programs.
ISSN: 2072-4292
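
Illustrative sketch: The summary describes geographically weighted models tuned by a Genetic Algorithm, but this record does not say how the authors couple GWR with the ML models or what the Genetic Algorithm optimizes. The Python sketch below is therefore only an assumption-laden illustration of the general idea, not the paper's method. It fits a local weighted linear regression (one covariate, Gaussian distance kernel) around each location and uses a very small genetic algorithm to pick the kernel bandwidth by leave-one-out RMSE. All function names, parameter values, and the synthetic data are hypothetical and are not soil measurements.

```python
import math
import random


def gaussian_weights(target, coords, bandwidth):
    """Gaussian kernel: samples close to `target` get weights near 1."""
    return [math.exp(-0.5 * (math.dist(target, c) / bandwidth) ** 2) for c in coords]


def gw_predict(target, x_target, coords, xs, ys, bandwidth):
    """Fit y = a + b*x by weighted least squares around `target`, then predict."""
    w = gaussian_weights(target, coords, bandwidth)
    sw = sum(w)
    if sw == 0.0:  # all weights underflowed: fall back to the plain mean
        return sum(ys) / len(ys)
    mx = sum(wi * xi for wi, xi in zip(w, xs)) / sw
    my = sum(wi * yi for wi, yi in zip(w, ys)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, xs))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, xs, ys))
    slope = sxy / sxx if sxx > 1e-12 else 0.0
    return my + slope * (x_target - mx)


def loo_rmse(coords, xs, ys, bandwidth):
    """Leave-one-out RMSE of the local model for one bandwidth."""
    n = len(ys)
    squared_error = 0.0
    for i in range(n):
        rest = [j for j in range(n) if j != i]
        pred = gw_predict(
            coords[i], xs[i],
            [coords[j] for j in rest], [xs[j] for j in rest], [ys[j] for j in rest],
            bandwidth,
        )
        squared_error += (pred - ys[i]) ** 2
    return math.sqrt(squared_error / n)


def ga_bandwidth(coords, xs, ys, low, high, pop_size=12, generations=15, seed=0):
    """Tiny genetic algorithm over a single gene (the bandwidth)."""
    rng = random.Random(seed)

    def fitness(h):
        return loo_rmse(coords, xs, ys, h)  # lower is better

    pop = [rng.uniform(low, high) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness)
        parents = pop[: pop_size // 2]  # selection; parents survive (elitism)
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # crossover (average of two parents) plus Gaussian mutation
            child = 0.5 * (a + b) + rng.gauss(0.0, 0.1 * (high - low))
            children.append(min(high, max(low, child)))  # clamp to bounds
        pop = parents + children
    return min(pop, key=fitness)


# Synthetic demo data (hypothetical): the slope of y on x drifts from west to east,
# which is the kind of spatially varying relationship a global model cannot capture.
rng = random.Random(42)
coords = [(float(i), float(j)) for i in range(5) for j in range(5)]
xs = [rng.uniform(0.0, 1.0) for _ in coords]
ys = [7.0 + (0.4 * u - 1.0) * x for (u, _), x in zip(coords, xs)]

best_h = ga_bandwidth(coords, xs, ys, low=0.5, high=10.0)
print(f"selected bandwidth: {best_h:.2f}, LOO RMSE: {loo_rmse(coords, xs, ys, best_h):.3f}")
```

In this sketch a small bandwidth lets the fitted slope follow local conditions, while a large one approaches a single global regression. A geographically weighted Random Forest, Cubist, or XGBoost model would replace the local linear fit with a tree-based learner, and the genetic algorithm would typically search over more parameters than one bandwidth. Both of those points are general observations rather than details taken from this record.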