A Survey of Sampling Methods for Hyperspectral Remote Sensing: Addressing Bias Induced by Random Sampling
Identified as early as 2000, the challenges involved in developing and assessing remote sensing models with small datasets remain, with one key issue persisting: the misuse of random sampling to generate training and testing data. This practice often introduces a high degree of correlation between t...
| Main Authors: | Kevin T. Decker, Brett J. Borghetti |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-04-01 |
| Series: | Remote Sensing |
| Subjects: | sampling algorithm; generalization; model assessment; correlation; remote sensing |
| Online Access: | https://www.mdpi.com/2072-4292/17/8/1373 |
| _version_ | 1850180669904257024 |
|---|---|
| author | Kevin T. Decker; Brett J. Borghetti |
| author_facet | Kevin T. Decker; Brett J. Borghetti |
| author_sort | Kevin T. Decker |
| collection | DOAJ |
| description | Identified as early as 2000, the challenges involved in developing and assessing remote sensing models with small datasets remain, with one key issue persisting: the misuse of random sampling to generate training and testing data. This practice often introduces a high degree of correlation between the sets, leading to an overestimation of model generalizability. Despite the early recognition of this problem, few researchers have investigated its nuances or developed effective sampling techniques to address it. Our survey highlights that mitigation strategies to reduce this bias remain underutilized in practice, distorting the interpretation and comparison of results across the field. In this work, we introduce a set of desirable characteristics to evaluate sampling algorithms, with a primary focus on their tendency to induce correlation between training and test data, while also accounting for other relevant factors. Using these characteristics, we survey 146 articles, identify 16 unique sampling algorithms, and evaluate them. Our evaluation reveals two broad archetypes of sampling techniques that effectively mitigate correlation and are suitable for model development. |
| format | Article |
| id | doaj-art-c83d7ecc6c9c438997e0e99ce1ed0ca1 |
| institution | OA Journals |
| issn | 2072-4292 |
| language | English |
| publishDate | 2025-04-01 |
| publisher | MDPI AG |
| record_format | Article |
| series | Remote Sensing |
| spelling | doaj-art-c83d7ecc6c9c438997e0e99ce1ed0ca1; 2025-08-20T02:18:04Z; eng; MDPI AG; Remote Sensing; ISSN 2072-4292; 2025-04-01; vol. 17, no. 8, art. 1373; doi:10.3390/rs17081373; A Survey of Sampling Methods for Hyperspectral Remote Sensing: Addressing Bias Induced by Random Sampling; Kevin T. Decker and Brett J. Borghetti (both: Air Force Institute of Technology, Department of Electrical and Computer Engineering, 2950 Hobson Way, Wright-Patterson AFB, OH 45433, USA); https://www.mdpi.com/2072-4292/17/8/1373; sampling algorithm; generalization; model assessment; correlation; remote sensing |
| title | A Survey of Sampling Methods for Hyperspectral Remote Sensing: Addressing Bias Induced by Random Sampling |
| topic | sampling algorithm; generalization; model assessment; correlation; remote sensing |
| url | https://www.mdpi.com/2072-4292/17/8/1373 |