IchthyNet: An Ensemble Method for the Classification of In Situ Marine Zooplankton Shadowgraph Images

This study explores the use of machine learning for the automated classification of the ten most abundant groups of marine organisms (in the size range of 5–12 cm) plus marine snow found in the ecosystem of the U.S. east coast. Images used in this process were collected using a shadowgraph imaging s...

Bibliographic Details
Main Authors: Brittney Slocum, Bradley Penta
Format: Article
Language: English
Published: MDPI AG 2025-01-01
Series: Oceans
Subjects: plankton; machine-learning; shadowgraph; ocean ecosystems; neural-networks
Online Access: https://www.mdpi.com/2673-1924/6/1/7
collection DOAJ
description This study explores the use of machine learning for the automated classification of the ten most abundant groups of marine organisms (in the size range of 5–12 cm) plus marine snow found in the ecosystem of the U.S. east coast. Images used in this process were collected using a shadowgraph imaging system on a towed, undulating platform capable of collecting continuous imagery over large spatiotemporal scales. As a large quantity (29,818,917) of images was collected, the task of locating and identifying all imaged organisms could not be efficiently achieved by human analysis alone. Several tows of data were collected off the coast of Delaware Bay. The resulting images were then cleaned, segmented into regions of interest (ROIs), and fed through three convolutional neural networks (CNNs): VGG-16, ResNet-50, and a custom model created to find more high-level features in this dataset. These three models were used in a Random Forest Classifier-based ensemble approach to reach the best identification fidelity. The networks were trained on a training set of 187,000 ROIs augmented with random rotations and pixel intensity thresholding to increase data variability and evaluated against two datasets. While the performance of each individual model is examined, the best approach is to use the ensemble, which performed with an F1-score of 98% and an area under the curve (AUC) of 99% on both test datasets while its accuracy, precision, and recall fluctuated between 97% and 98%.
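The description reports accuracy, precision, recall, and F1-score for a multi-class problem (ten organism groups plus marine snow). As a rough illustration of how such figures relate to one another, the sketch below computes them from true and predicted labels. It is not the authors' evaluation code: the record does not say how per-class scores were averaged, so macro averaging is an assumption here, and the class names and toy labels are hypothetical.

```python
from collections import Counter


def macro_scores(y_true, y_pred):
    """Return (accuracy, macro precision, macro recall, macro F1).

    Macro averaging (unweighted mean over classes) is an assumption;
    the catalog record does not state the averaging scheme used.
    """
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    precisions, recalls, f1s = [], [], []
    for c in labels:
        p_den = tp[c] + fp[c]
        r_den = tp[c] + fn[c]
        prec = tp[c] / p_den if p_den else 0.0
        rec = tp[c] / r_den if r_den else 0.0
        f1 = 2 * prec * rec / (prec + rec) if (prec + rec) else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)

    n = len(labels)
    accuracy = sum(tp.values()) / len(y_true)
    return accuracy, sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


# Hypothetical toy labels; "copepod" and "larva" are placeholder class names.
y_true = ["copepod", "copepod", "snow", "snow", "larva", "larva"]
y_pred = ["copepod", "snow", "snow", "snow", "larva", "larva"]
accuracy, precision, recall, f1 = macro_scores(y_true, y_pred)
print(f"accuracy={accuracy:.3f} precision={precision:.3f} "
      f"recall={recall:.3f} f1={f1:.3f}")
```

Because F1 is the harmonic mean of precision and recall per class, a macro F1 can sit slightly above or below the macro precision and recall, which may be why the reported values differ by a percentage point.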
id doaj-art-c9a2e908c1c6463aa12498d3f76861ed
institution DOAJ
issn 2673-1924
doi 10.3390/oceans6010007
citation Oceans, vol. 6, no. 1, article 7 (2025-01-01)
affiliation U.S. Naval Research Laboratory, Stennis Space Center, MS 39529, USA (both authors)
topic plankton
machine-learning
shadowgraph
ocean ecosystems
neural-networks