Recognition and classification techniques of marine mammal calls based on LSTM and expanded causal convolution

Marine mammal calls play a vital role in navigation, localization, and communication. Effectively classifying these calls is essential for ecological monitoring, species conservation, and military biomimetic applications. However, traditional machine learning methods struggle to capture complex acou...

Bibliographic Details
Main Authors: Wanlu Cheng, Hao Chen, Jiaming Jiang, Shuang Li, Jingjing Wang, Yanping Zhou
Format: Article
Language: English
Published: Frontiers Media S.A. 2025-05-01
Series: Frontiers in Marine Science
Subjects:
Online Access: https://www.frontiersin.org/articles/10.3389/fmars.2025.1603090/full
collection DOAJ
description Marine mammal calls play a vital role in navigation, localization, and communication. Effectively classifying these calls is essential for ecological monitoring, species conservation, and military biomimetic applications. However, traditional machine learning methods struggle to capture complex acoustic patterns, while most existing deep learning approaches rely solely on frequency-domain features and require large datasets, which limits their performance on small-scale marine mammal datasets. To address these challenges, we propose a hybrid architecture combining a time-attention Long Short-Term Memory (LSTM) network and a multi-scale dilated causal convolutional network. The model comprises three modules: (1) a frequency-domain feature extraction module employing dilated causal convolutions at multiple scales to capture multi-resolution spectral information from Mel spectrograms; (2) a time-domain feature extraction module that inputs Mel-frequency cepstral coefficients (MFCCs) into an LSTM enhanced with a time-attention mechanism to highlight key temporal features; and (3) a classification module leveraging transfer learning, where a pre-trained neural network is fine-tuned on real marine mammal call data to improve performance. Extensive experiments were conducted on vocalizations from four marine mammal species. Our proposed method outperformed existing baseline models across four evaluation metrics: accuracy, precision, recall, and F1 score, with improvements of 3%, 7%, 2%, and 4%, respectively. The results confirm the effectiveness of combining frequency- and time-domain features along with attention mechanisms and transfer learning. This hybrid approach enhances the accuracy and robustness of marine mammal call classification, especially under limited data conditions.
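The two feature-extraction ideas named in the description can be sketched in a few lines. This is a minimal illustration in plain Python, not the authors' implementation: the function names, the kernel values, the dilation set (1, 2, 4), and the use of precomputed attention scores are all assumptions made for the example. The record does not state the paper's actual layer sizes, dilations, or scoring function.

```python
import math


def dilated_causal_conv1d(x, kernel, dilation=1):
    """Causal convolution: y[t] = sum_k kernel[k] * x[t - k * dilation].

    Samples before t = 0 are treated as zero, so y[t] never depends on
    inputs later than t. (Hypothetical helper for illustration.)
    """
    out = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:
                acc += w * x[idx]
        out.append(acc)
    return out


def multi_scale_features(x, kernel, dilations=(1, 2, 4)):
    """Apply the same kernel at several dilations (assumed set) and
    return one output sequence per scale."""
    return [dilated_causal_conv1d(x, kernel, d) for d in dilations]


def attention_pool(hidden_states, scores):
    """Softmax the per-time-step scores and return the weights together
    with the weighted average of the hidden-state vectors.

    In a trained model the scores would come from a learned layer; here
    they are passed in directly (an assumption for the sketch).
    """
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(hidden_states[0])
    pooled = [
        sum(w * h[i] for w, h in zip(weights, hidden_states))
        for i in range(dim)
    ]
    return weights, pooled
```

With a larger dilation, the same two-tap kernel reaches further back in time without adding parameters, which is the usual motivation for stacking several dilation rates. The attention step reduces a sequence of LSTM-style hidden states to one vector that favors the highest-scoring time steps.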
format Article
id doaj-art-71def9fcf78a4c1f8101869e12eebb5e
institution DOAJ
issn 2296-7745
language English
publishDate 2025-05-01
publisher Frontiers Media S.A.
record_format Article
series Frontiers in Marine Science
spelling doaj-art-71def9fcf78a4c1f8101869e12eebb5e
timestamp 2025-08-20T03:13:29Z
volume 12
doi 10.3389/fmars.2025.1603090
article_number 1603090
affiliations Wanlu Cheng: School of Information Science and Technology, Qingdao University of Science and Technology, Qingdao, China; Shandong Key Laboratory of Deep Sea Equipment Intelligent Networking, Qingdao, China
Hao Chen: School of Mechanical Engineering, Ilmenau University of Technology, Ilmenau, Germany
Jiaming Jiang: School of Information Science and Technology, Qingdao University of Science and Technology, Qingdao, China; Shandong Key Laboratory of Deep Sea Equipment Intelligent Networking, Qingdao, China
Shuang Li: School of Information Science and Technology, Qingdao University of Science and Technology, Qingdao, China; Shandong Key Laboratory of Deep Sea Equipment Intelligent Networking, Qingdao, China
Jingjing Wang: School of Information Science and Technology, Qingdao University of Science and Technology, Qingdao, China; Shandong Key Laboratory of Deep Sea Equipment Intelligent Networking, Qingdao, China
Yanping Zhou: School of Information Science and Technology, Qingdao University of Science and Technology, Qingdao, China
title Recognition and classification techniques of marine mammal calls based on LSTM and expanded causal convolution
topic marine mammals
marine mammal call recognition and classification
transfer learning
LSTM
expansive causal convolutional networks
url https://www.frontiersin.org/articles/10.3389/fmars.2025.1603090/full