Automatic training sample collection utilizing multi-source land cover products and time-series Sentinel-2 images

Bibliographic Details
Main Authors: Yanzhao Wang, Yonghua Sun, Xuyue Cao, Yihan Wang, Wangkuan Zhang, Xinglu Cheng, Ruozeng Wang, Jinkun Zong
Format: Article
Language: English
Published: Taylor & Francis Group 2024-12-01
Series: GIScience & Remote Sensing
Online Access: https://www.tandfonline.com/doi/10.1080/15481603.2024.2352957
Description
Summary: Collecting reliable training samples plays a crucial role in improving the accuracy of land cover (LC) mapping products, which are essential foundational data for global environmental and climate change research. However, the process is labor-intensive and time-consuming, as it relies heavily on human interpretation. This article proposes an automatic training sample collection approach (ATSC) that utilizes multi-source LC products and time-series Sentinel-2 images. First, a preliminary sample dataset was generated by fusing multiple LC products with the weighted majority voting (WMV) algorithm. Second, a locally selective combination in parallel outlier ensembles (LSCP) anomaly detection algorithm was applied to filter abnormal samples. The results revealed that (1) the China Land Cover Dataset (CLCD) had the highest overall accuracy (73.22%), and the ESRI Land Cover (ESRI) had the lowest (59.93%). Tree cover, built area, and water showed high accuracy across all products, while shrubland and wetland generally had low accuracy. (2) The average accuracy of the preliminary training samples for the four study areas was 95.62%. However, abnormal samples remained, arising from classification errors, LC changes within a year, and spectral anomalies. (3) Using the LSCP algorithm, 70.10% of the abnormal samples were removed, resulting in a final training sample accuracy that exceeded 97.87% in each region. The ATSC approach provides higher-quality training samples for LC classification and facilitates large-scale LC mapping initiatives.
ISSN: 1548-1603, 1943-7226
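
The weighted majority voting (WMV) fusion step named in the summary can be illustrated with a minimal sketch. This is not the authors' implementation: the record does not say how the weights were derived, so the sketch assumes each product is weighted by its overall accuracy. The CLCD and ESRI weights reuse the accuracies quoted in the summary; `PRODUCT_X`, its weight, and the per-pixel labels are hypothetical placeholders.

```python
from collections import defaultdict


def weighted_majority_vote(labels, weights):
    """Fuse the per-product labels of one pixel.

    labels  : dict mapping product name -> class label at this pixel
    weights : dict mapping product name -> voting weight (assumed here to be
              the product's overall accuracy; the source does not confirm this)

    Returns (winning_label, share), where share is the fraction of the total
    weight that backed the winner. Ties go to the label encountered first.
    """
    totals = defaultdict(float)
    for product, label in labels.items():
        totals[label] += weights[product]
    winner = max(totals, key=totals.get)
    total_weight = sum(weights[product] for product in labels)
    return winner, totals[winner] / total_weight


# CLCD and ESRI weights are the overall accuracies quoted in the summary.
# PRODUCT_X and its weight are hypothetical, for illustration only.
weights = {"CLCD": 0.7322, "ESRI": 0.5993, "PRODUCT_X": 0.65}

# Hypothetical labels that the three products assign to two pixels.
pixel_a = {"CLCD": "water", "ESRI": "wetland", "PRODUCT_X": "water"}
pixel_b = {"CLCD": "tree cover", "ESRI": "shrubland", "PRODUCT_X": "shrubland"}

label_a, share_a = weighted_majority_vote(pixel_a, weights)
label_b, share_b = weighted_majority_vote(pixel_b, weights)
print(label_a, round(share_a, 3))
print(label_b, round(share_b, 3))
```

In `pixel_b` the two lower-weighted products together outvote the highest-weighted one, which is the behavior that separates weighted voting from simply trusting the single best product. The returned share could also serve as an agreement score for deciding which pixels to keep as preliminary samples, though that use is an assumption rather than something the record describes.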