How many dollars are in the sea? Estimating sand dollar (Echinarachnius parma) abundance using an iteratively trained convolutional neural network and generalized additive models


Bibliographic Details
Main Authors: Sara Vanaki, Deborah R. Hart, Jui-Han Chang
Format: Article
Language: English
Published: Elsevier 2025-12-01
Series: Ecological Informatics
Online Access: http://www.sciencedirect.com/science/article/pii/S1574954125003206
Description
Summary: The collection of imagery for population estimation is rapidly increasing. However, the number of images collected often exceeds the capacity to annotate these images manually. An alternative to human annotations is the use of computer vision to automatically count target organisms. Although the abilities of automated detectors have improved considerably through the use of convolutional neural networks (CNNs), several obstacles remain before these detectors can be used for population estimates. First, CNNs often require substantial training data in the form of labeled images, which are often not readily available. We propose an iterative procedure in which a CNN is trained on a limited set of images and then applied to a new set of images. The new annotations are corrected and added to the training set to train an improved CNN. Crucially, this correction phase is efficient because one only has to make a binary decision per annotation instead of segmenting each individual instance. This procedure is repeated until a CNN with satisfactory performance is obtained. Second, output from the CNN needs to be translated into population estimates. This is difficult because even well-trained CNNs make errors, and the confidence levels output by CNNs do not represent true probabilities. We employ Generalized Additive Models (GAMs) that use the confidence levels combined with covariates to estimate the probabilities that a detection is correct as well as the number of targets in an image. The image-based estimates can then be extrapolated to estimate population abundance and density. We apply these ideas to a set of 316,750 underwater images collected in the Mid-Atlantic Bight in 2015, in order to estimate the abundance and densities of sand dollars (Echinarachnius parma) in this area. We estimate that this area contains about 566 billion sand dollars, corresponding to a mean density of 16.0/m².
ISSN: 1574-9541
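
The two headline figures in the summary (about 566 billion sand dollars and a mean density of 16.0/m²) jointly imply an approximate area, since abundance equals mean density times area. The short Python sketch below only performs that arithmetic. The function name `implied_area_km2` is hypothetical, the summary itself does not state the area, and both inputs are rounded, so the result is a rough consistency check rather than a figure from the article.

```python
# Figures quoted in the summary (both are rounded in the source).
TOTAL_SAND_DOLLARS = 566e9   # "about 566 billion sand dollars"
MEAN_DENSITY_PER_M2 = 16.0   # "a mean density of 16.0/m2"


def implied_area_km2(abundance: float, density_per_m2: float) -> float:
    """Return the area in km^2 implied by abundance = density * area.

    Hypothetical helper for illustration; not part of the article's method.
    """
    if density_per_m2 <= 0:
        raise ValueError("density must be positive")
    area_m2 = abundance / density_per_m2
    return area_m2 / 1e6  # 1 km^2 = 1e6 m^2


area_km2 = implied_area_km2(TOTAL_SAND_DOLLARS, MEAN_DENSITY_PER_M2)
print(f"Implied area: {area_km2:,.0f} km^2")
```

Because the abundance is quoted only to three significant figures, the implied area should be read with similar precision.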