IchthyNet: An Ensemble Method for the Classification of In Situ Marine Zooplankton Shadowgraph Images
This study explores the use of machine learning for the automated classification of the ten most abundant groups of marine organisms (in the size range of 5–12 cm) plus marine snow found in the ecosystem of the U.S. east coast. Images used in this process were collected using a shadowgraph imaging s...
| Main Authors: | Brittney Slocum, Bradley Penta |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-01-01 |
| Series: | Oceans |
| Subjects: | plankton, machine-learning, shadowgraph, ocean ecosystems, neural-networks |
| Online Access: | https://www.mdpi.com/2673-1924/6/1/7 |
| _version_ | 1850091317181284352 |
|---|---|
| author | Brittney Slocum; Bradley Penta |
| collection | DOAJ |
| description | This study explores the use of machine learning for the automated classification of the ten most abundant groups of marine organisms (in the size range of 5–12 cm) plus marine snow found in the ecosystem of the U.S. east coast. Images used in this process were collected using a shadowgraph imaging system on a towed, undulating platform capable of collecting continuous imagery over large spatiotemporal scales. As a large quantity (29,818,917) of images was collected, the task of locating and identifying all imaged organisms could not be efficiently achieved by human analysis alone. Several tows of data were collected off the coast of Delaware Bay. The resulting images were then cleaned, segmented into regions of interest (ROIs), and fed through three convolutional neural networks (CNNs): VGG-16, ResNet-50, and a custom model created to find more high-level features in this dataset. These three models were used in a Random Forest Classifier-based ensemble approach to reach the best identification fidelity. The networks were trained on a training set of 187,000 ROIs augmented with random rotations and pixel intensity thresholding to increase data variability and evaluated against two datasets. While the performance of each individual model is examined, the best approach is to use the ensemble, which performed with an F1-score of 98% and an area under the curve (AUC) of 99% on both test datasets while its accuracy, precision, and recall fluctuated between 97% and 98%. |
| format | Article |
| id | doaj-art-c9a2e908c1c6463aa12498d3f76861ed |
| institution | DOAJ |
| issn | 2673-1924 |
| language | English |
| publishDate | 2025-01-01 |
| publisher | MDPI AG |
| record_format | Article |
| series | Oceans |
| spelling | doaj-art-c9a2e908c1c6463aa12498d3f76861ed; 2025-08-20T02:42:24Z; eng; MDPI AG; Oceans; ISSN 2673-1924; 2025-01-01; vol. 6, no. 1, art. 7; DOI 10.3390/oceans6010007; IchthyNet: An Ensemble Method for the Classification of In Situ Marine Zooplankton Shadowgraph Images; Brittney Slocum (U.S. Naval Research Laboratory, Stennis Space Center, MS 39529, USA); Bradley Penta (U.S. Naval Research Laboratory, Stennis Space Center, MS 39529, USA); https://www.mdpi.com/2673-1924/6/1/7; plankton; machine-learning; shadowgraph; ocean ecosystems; neural-networks |
| title | IchthyNet: An Ensemble Method for the Classification of In Situ Marine Zooplankton Shadowgraph Images |
| topic | plankton; machine-learning; shadowgraph; ocean ecosystems; neural-networks |
| url | https://www.mdpi.com/2673-1924/6/1/7 |
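
The description field above mentions a Random Forest Classifier-based ensemble over three CNNs. The record does not say what the Random Forest receives as input, so the following is only a minimal sketch of one common arrangement (stacking), assuming each base network emits a class-probability vector per ROI and the three vectors are concatenated as features. The CNNs are replaced by a hypothetical `fake_cnn_probs` stand-in that produces synthetic softmax outputs; the noise levels, sample counts, and forest size are illustrative assumptions, not values from the article.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score

N_CLASSES = 11  # ten organism groups plus marine snow, per the description


def fake_cnn_probs(labels, noise, rng):
    """Hypothetical stand-in for one CNN: noisy softmax rows, one per ROI."""
    one_hot = np.eye(N_CLASSES)[labels]
    logits = one_hot + noise * rng.standard_normal((labels.size, N_CLASSES))
    exp = np.exp(logits - logits.max(axis=1, keepdims=True))
    return exp / exp.sum(axis=1, keepdims=True)


def stack_features(labels, rng):
    """Concatenate three base-model probability vectors into one feature row."""
    # Three assumed noise levels mimic three base models of differing quality.
    return np.hstack([fake_cnn_probs(labels, n, rng) for n in (0.4, 0.5, 0.6)])


rng = np.random.default_rng(0)
y_train = rng.integers(0, N_CLASSES, size=2000)  # synthetic labels only
y_test = rng.integers(0, N_CLASSES, size=500)
X_train = stack_features(y_train, rng)
X_test = stack_features(y_test, rng)

# The meta-classifier learns how to combine the base models' outputs.
meta = RandomForestClassifier(n_estimators=100, random_state=0)
meta.fit(X_train, y_train)
pred = meta.predict(X_test)

macro_f1 = f1_score(y_test, pred, average="macro")
print(f"macro F1 on synthetic data: {macro_f1:.3f}")
```

With real models, the stand-in would be replaced by the probability outputs of VGG-16, ResNet-50, and the custom network on held-out ROIs. The score printed here reflects only the synthetic data and says nothing about the figures reported in the article.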