Scaling down annotation needs: The capacity of self-supervised learning on diatom classification
| Main Authors: | , , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Elsevier, 2025-04-01 |
| Series: | iScience |
| Subjects: | |
| Online Access: | http://www.sciencedirect.com/science/article/pii/S2589004225004973 |
| Summary: | In the life sciences, diatoms are essential biomarkers for assessing environmental health. Recent advances in deep learning have transformed the traditionally laborious process of classifying diatoms by light microscopy. However, commonly used supervised learning methods require annotated data, which in turn demands the expertise of seasoned professionals. This study introduces self-supervised learning to tackle the challenge of scarce annotation in diatom classification. First, our results reveal that self-supervised pre-trained models make considerably more effective use of the available annotated data, with benefits increasing as the dataset size decreases. Second, fine-tuning our models with a very small labeled dataset (e.g., 50 samples per class) yields macro-average accuracy comparable to fully supervised levels, thereby reducing the reliance on taxonomic experts by approximately 96.0%. Moreover, extending the pre-training phase to 1600 epochs further reduced the dependency on annotations, achieving comparable accuracy with merely 30 samples per class. |
| ISSN: | 2589-0042 |