A Novel Bias-Adjusted Estimator Based on Synthetic Confusion Matrix (BAESCM) for Subregion Area Estimation
| Main Authors: | , , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-03-01 |
| Series: | Remote Sensing |
| Subjects: | |
| Online Access: | https://www.mdpi.com/2072-4292/17/7/1145 |
| Summary: | Accurate area estimation of specific land cover/use types in administrative or natural units is crucial for various applications. However, land cover areas derived directly from remote sensing classification maps via pixel counting often exhibit non-negligible bias. Thus, various design-based area estimators (e.g., the bias-adjusted estimator, the model-assisted difference estimator, and the model-assisted ratio estimator derived from the confusion matrix), which combine the information of ground truth samples and the classification map, have been applied to provide more accurate area estimates and uncertainty inference. These estimators work well for estimating areas in a region with sufficient ground truth samples, but they encounter challenges when estimating areas in multiple subregions where the samples within each subregion are limited. To overcome this limitation, we propose a novel Bias-Adjusted Estimator based on the Synthetic Confusion Matrix (BAESCM) for estimating land cover areas in subregions by downscaling the global sample information to the subregion scale. First, several clusters are generated from remote sensing data through the K-means method (with the number of clusters being much smaller than the number of subregions). Then, a confusion matrix is estimated for each cluster from the samples falling in that cluster. Assuming that the classification error distribution within each cluster remains consistent across different subregions, the confusion matrix of a subregion can be synthesized as a weighted sum of the cluster confusion matrices, with the weights given by the cluster abundances in the subregion. Finally, the classification bias at the subregion scale is estimated from the synthetic confusion matrix, and the area counted from the classification map is corrected accordingly. Moreover, we introduce a semi-empirical method for inferring the confidence intervals of the estimated areas, considering both the sampling variance due to sampling randomness and the downscaling variance due to the heterogeneity in classification error distribution within each cluster. We tested our method through simulated experiments for county-level area estimation of soybean crops in Nebraska, USA. The results show that the root mean square errors (RMSEs) of the subregion area estimates using BAESCM are reduced by 21–64% compared to estimates based on pixel counting from the classification map. Additionally, the true coverages of the confidence intervals estimated by our method approximately matched their nominal coverages. Compared with traditional design-based estimators, the proposed BAESCM achieves better estimation accuracy for subregion areas when the sample size is limited. Therefore, the proposed method is particularly recommended for studies of subregion land cover areas when ground truth samples are inadequate. |
|---|---|
| ISSN: | 2072-4292 |
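
The synthesis step described in the summary can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`cluster_confusion`, `synthetic_confusion`, `bias_adjusted_area`), the toy sample labels, and the area figures are all hypothetical, and the sketch assumes simple random sampling within each cluster and a bias defined as the map-class proportion minus the reference-class proportion. The K-means clustering and the confidence interval procedure are not shown.

```python
import numpy as np


def cluster_confusion(map_labels, ref_labels, n_classes):
    """Confusion matrix in proportions for one cluster.

    Rows index the map (classified) class, columns the reference class.
    Assumes the samples are a simple random sample of the cluster.
    """
    counts = np.zeros((n_classes, n_classes))
    np.add.at(counts, (np.asarray(map_labels), np.asarray(ref_labels)), 1)
    return counts / counts.sum()


def synthetic_confusion(cluster_matrices, abundances):
    """Weighted sum of cluster matrices; weights are the cluster
    abundances (area fractions summing to 1) within one subregion."""
    w = np.asarray(abundances, dtype=float)
    return np.tensordot(w, np.asarray(cluster_matrices), axes=1)


def bias_adjusted_area(map_area, subregion_area, synth, cls):
    """Subtract the estimated bias from the pixel-counted area.

    Bias is taken as (row sum - column sum) for class `cls`, i.e. the
    mapped proportion minus the reference proportion (an assumption).
    """
    bias = (synth[cls, :].sum() - synth[:, cls].sum()) * subregion_area
    return map_area - bias


# Hypothetical samples: class 1 = soybean, class 0 = other.
p_a = cluster_confusion([1, 1, 1, 1, 0, 0, 0, 0, 0, 0],
                        [1, 1, 1, 0, 0, 0, 0, 0, 0, 0], n_classes=2)
p_b = cluster_confusion([1, 1, 0, 0, 0, 0, 0, 0, 0, 0],
                        [1, 1, 1, 1, 0, 0, 0, 0, 0, 0], n_classes=2)

# A subregion made up of 75% cluster A and 25% cluster B.
synth = synthetic_confusion([p_a, p_b], [0.75, 0.25])

# 350 ha of soybean counted on the map in a 1000 ha subregion.
corrected = bias_adjusted_area(350.0, 1000.0, synth, cls=1)
print(round(corrected, 3))
```

In this toy case the map overestimates soybean in cluster A and underestimates it in cluster B, so the synthesized bias for the subregion is a small net overestimate and the corrected area comes out at roughly 325 ha. Because the weights come from the cluster map rather than from samples, a subregion with no ground truth samples at all can still receive an estimate.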