Integrating Genetic Algorithm and Geographically Weighted Approaches into Machine Learning Improves Soil pH Prediction in China
Accurate soil pH prediction is critical for soil management and ecological environmental protection. Machine learning (ML) models have been widely applied in the field of soil pH prediction. However, when using these models, the spatial heterogeneity of the relationship between soil and environmenta...
| Main Authors: | Wantao Zhang, Jingyi Ji, Binbin Li, Xiao Deng, Mingxiang Xu |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-03-01 |
| Series: | Remote Sensing |
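| Subjects: | soil pH; geographically weighted machine learning; genetic algorithm; uncertainty; digital soil mapping |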
| Online Access: | https://www.mdpi.com/2072-4292/17/6/1086 |
| _version_ | 1849339768913002496 |
|---|---|
| author | Wantao Zhang, Jingyi Ji, Binbin Li, Xiao Deng, Mingxiang Xu |
| author_facet | Wantao Zhang, Jingyi Ji, Binbin Li, Xiao Deng, Mingxiang Xu |
| author_sort | Wantao Zhang |
| collection | DOAJ |
| description | Accurate soil pH prediction is critical for soil management and for ecological and environmental protection. Machine learning (ML) models are widely used to predict soil pH. However, these models often do not fully account for the spatial heterogeneity of the relationship between soil and environmental variables, which limits their predictive capability, especially across large regions with complex soil landscapes. To address this, the study used soil pH data from 4335 surface soil points (0–20 cm) obtained from the China Soil System Survey, combined with multi-source environmental covariates. It integrates Geographically Weighted Regression (GWR) with three ML models (Random Forest, Cubist, and XGBoost) to develop three geographically weighted machine learning models, optimized by Genetic Algorithms, that improve the prediction of soil pH. Compared to GWR and traditional ML models, the R² of the geographically weighted random forest (GWRF), geographically weighted Cubist (GWCubist), and geographically weighted extreme gradient boosting (GWXGBoost) models increased by 1.98% to 14.29%, while the RMSE decreased by 1.81% to 11.98%. Among the three models, GWRF performed best and effectively reduced uncertainty in soil pH mapping. Mean Annual Precipitation and the Normalized Difference Vegetation Index are two key environmental variables for predicting soil pH, and both have a significant negative effect on its spatial distribution. These findings provide a scientific basis for effective soil health management and for implementing large-scale soil modeling programs. |
| format | Article |
| id | doaj-art-3bec9f9cf7cd4acdaf270af00a2e4fec |
| institution | Kabale University |
| issn | 2072-4292 |
| language | English |
| publishDate | 2025-03-01 |
| publisher | MDPI AG |
| record_format | Article |
| series | Remote Sensing |
| spelling | doaj-art-3bec9f9cf7cd4acdaf270af00a2e4fec; 2025-08-20T03:44:03Z; eng; MDPI AG; Remote Sensing; ISSN 2072-4292; 2025-03-01; vol. 17, no. 6, art. 1086; DOI 10.3390/rs17061086; Integrating Genetic Algorithm and Geographically Weighted Approaches into Machine Learning Improves Soil pH Prediction in China; Wantao Zhang (0), Jingyi Ji (1), Binbin Li (2), Xiao Deng (3), Mingxiang Xu (4); affiliations: (0, 2, 4) College of Soil and Water Conservation Science and Engineering, Northwest A&F University, Yangling 712100, China; (1, 3) State Key Laboratory of Soil Erosion and Dryland Farming on the Loess Plateau, The Research Center of Soil and Water Conservation and Ecological Environment, Chinese Academy of Sciences and Ministry of Education, Yangling 712100, China; https://www.mdpi.com/2072-4292/17/6/1086; keywords: soil pH, geographically weighted machine learning, genetic algorithm, uncertainty, digital soil mapping |
| title | Integrating Genetic Algorithm and Geographically Weighted Approaches into Machine Learning Improves Soil pH Prediction in China |
| topic | soil pH; geographically weighted machine learning; genetic algorithm; uncertainty; digital soil mapping |
| url | https://www.mdpi.com/2072-4292/17/6/1086 |
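
The description above builds on Geographically Weighted Regression, where a separate model is fitted around each prediction location and nearby samples count more than distant ones. The following is a minimal sketch of that idea only, not the authors' implementation: it assumes a Gaussian distance kernel, a fixed bandwidth, and a plain linear local model, whereas the article replaces the local model with Random Forest, Cubist, or XGBoost and tunes settings with a genetic algorithm. The function names (`gaussian_weights`, `gwr_predict`, `rmse`, `r2`) and the synthetic data are hypothetical and chosen for illustration.

```python
import numpy as np


def gaussian_weights(coords, target, bandwidth):
    """Gaussian kernel weights that decay with distance from `target` (assumed kernel)."""
    d = np.linalg.norm(coords - target, axis=1)
    return np.exp(-0.5 * (d / bandwidth) ** 2)


def gwr_predict(coords, X, y, target_coord, target_x, bandwidth):
    """Fit a weighted least-squares model centred on one location and predict there."""
    w = gaussian_weights(coords, target_coord, bandwidth)
    A = np.column_stack([np.ones(len(X)), X])  # design matrix with intercept column
    sw = np.sqrt(w)
    # Scaling rows by sqrt(w) turns ordinary least squares into
    # weighted least squares: minimise sum_i w_i * (y_i - A_i . beta)^2
    beta, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
    x_row = np.concatenate([[1.0], np.atleast_1d(target_x)])
    return float(x_row @ beta)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def r2(y_true, y_pred):
    """Coefficient of determination."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


# Synthetic demo: the slope of the covariate changes from west to east,
# which a single global linear model cannot capture.
rng = np.random.default_rng(0)
coords = rng.uniform(0, 10, size=(200, 2))
x = rng.normal(size=200)
y = 6.5 + (0.2 * coords[:, 0] - 1.0) * x + rng.normal(scale=0.05, size=200)

local_preds = np.array([
    gwr_predict(coords, x, y, coords[i], x[i], bandwidth=1.5)
    for i in range(len(y))
])
print("local-fit RMSE:", rmse(y, local_preds), "R2:", r2(y, local_preds))
```

The demo scores in-sample fits for brevity; a real evaluation would use held-out points, and the bandwidth is the kind of setting a search procedure such as a genetic algorithm could tune.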