A novel potential outlier recognition approach considering local heterogeneity enhancement to improve the quality of soil datasets

Soil datasets, including soil sample data and soil map products, often contain outliers that can lead to inaccurate modeling and analysis of various soil-related issues. Existing methods for identifying potential outliers in soil datasets rely on simple statistical approaches and tend to overlook th...

Bibliographic Details
Main Authors: Yongji Wang, Mingjun Yang, Meizi Wang, Jiayang Lv, Shuhao Yuan, Shaoqi Li, Zihan Wang, Jipeng Zhang, Qingwen Qi, Yanjun Ye
Format: Article
Language: English
Published: Elsevier 2025-02-01
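Series: Geoderma
Subjects: Digital soil mapping (DSM); Quality of soil datasets; Soil samples; Soil map product; Outlier recognition; Local heterogeneity enhancement
Online Access: http://www.sciencedirect.com/science/article/pii/S0016706125000382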
author Yongji Wang
Mingjun Yang
Meizi Wang
Jiayang Lv
Shuhao Yuan
Shaoqi Li
Zihan Wang
Jipeng Zhang
Qingwen Qi
Yanjun Ye
collection DOAJ
description Soil datasets, including soil sample data and soil map products, often contain outliers that can lead to inaccurate modeling and analysis of various soil-related issues. Existing methods for identifying potential outliers in soil datasets rely on simple statistical approaches and tend to overlook the geographical characteristics of the soil. Local indicators of spatial association (LISA) can address this limitation by examining the local spatial structures inherent in soil data. However, distinguishing some outliers remains challenging because of the varying levels of heterogeneity across different soil regions. In this paper, we present a novel method for recognizing potential outliers through local heterogeneity enhancement, which is aimed at improving the quality of soil datasets. In this method, stratified soil variations are first balanced to mitigate the effects of spatial discrepancies in different soil regions. Second, local heterogeneity enhancement is conducted to modify the outlier scores associated with abnormal soils exhibiting low heterogeneity. Third, a frequency histogram of outlier scores is applied to determine a suitable threshold at which to recognize potential abnormal values in soil datasets. To validate the proposed method, it was compared with the LISA and box-plot methods. Simulation data and soil data were adopted in the experiment, incorporating two types of irregular points and spatially continuous surfaces. The comparative experiments demonstrated that the proposed method more effectively identifies potential outliers by analyzing and balancing the local spatial structure of the soil than traditional methods do. It can be concluded that local heterogeneity enhancement is beneficial for recognizing potential outliers in soil datasets.
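The two baselines named in the description, the box-plot rule and local indicators of spatial association (LISA), can be sketched as follows. This is a minimal illustration and not the authors' proposed method: the function names, the choice of local Moran's I as the LISA statistic, the k-nearest-neighbour weights, and the `n_neighbors` and `k` defaults are all assumptions made for the example.

```python
import numpy as np


def boxplot_outliers(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR] (Tukey's box-plot rule).

    This rule ignores sample locations entirely.
    """
    values = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    return (values < q1 - k * iqr) | (values > q3 + k * iqr)


def local_morans_i(coords, values, n_neighbors=4):
    """Local Moran's I with row-standardised k-nearest-neighbour weights.

    Strongly negative scores mark a value that differs from its spatial
    neighbours (a candidate spatial outlier).
    """
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    z = values - values.mean()          # deviations from the global mean
    m2 = (z ** 2).mean()                # second moment used for scaling
    # Pairwise Euclidean distances; a point is never its own neighbour.
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    np.fill_diagonal(dist, np.inf)
    nbrs = np.argsort(dist, axis=1)[:, :n_neighbors]
    lag = z[nbrs].mean(axis=1)          # row-standardised spatial lag
    return z * lag / m2


# Hypothetical data: a uniform 5 x 5 grid with one anomalous centre value.
coords = np.array([(i, j) for i in range(5) for j in range(5)], dtype=float)
vals = np.full(25, 10.0)
vals[12] = 50.0

flags = boxplot_outliers(vals)
scores = local_morans_i(coords, vals)
print(np.flatnonzero(flags), int(np.argmin(scores)))
```

The box-plot rule uses only the value distribution, whereas the local Moran score depends on each sample's neighbours, which is the property the description credits LISA with. Neither function balances variation between soil regions or enhances local heterogeneity; those steps belong to the proposed method and are not reproduced here.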
format Article
id doaj-art-0a64170724dc4d15bb723e7305a945ec
institution Kabale University
issn 1872-6259
language English
publishDate 2025-02-01
publisher Elsevier
record_format Article
series Geoderma
spelling doaj-art-0a64170724dc4d15bb723e7305a945ec
Timestamp: 2025-02-05T04:30:59Z
Language: eng
Publisher: Elsevier
Journal: Geoderma (ISSN 1872-6259)
Published: 2025-02-01, volume 454, article 117200
Authors and affiliations:
Yongji Wang: School of Geoscience and Technology, Zhengzhou University, Zhengzhou 450001, China
Mingjun Yang: School of Geoscience and Technology, Zhengzhou University, Zhengzhou 450001, China
Meizi Wang (corresponding author): College of Plant Protection, Henan Agricultural University, 450002 Zhengzhou, China
Jiayang Lv: School of Geoscience and Technology, Zhengzhou University, Zhengzhou 450001, China
Shuhao Yuan: School of Geoscience and Technology, Zhengzhou University, Zhengzhou 450001, China
Shaoqi Li: School of Geoscience and Technology, Zhengzhou University, Zhengzhou 450001, China
Zihan Wang: School of Geoscience and Technology, Zhengzhou University, Zhengzhou 450001, China
Jipeng Zhang: School of Geoscience and Technology, Zhengzhou University, Zhengzhou 450001, China
Qingwen Qi: State Key Laboratory of Resources and Environmental Information System, Institute of Geographical Sciences and Natural Resources Research, Chinese Academy of Sciences, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China
Yanjun Ye: School of Earth Science and Engineering, Hebei University of Engineering, 056038 Handan, China
title A novel potential outlier recognition approach considering local heterogeneity enhancement to improve the quality of soil datasets
topic Digital soil mapping (DSM)
Quality of soil datasets
Soil samples
Soil map product
Outlier recognition
Local heterogeneity enhancement
url http://www.sciencedirect.com/science/article/pii/S0016706125000382