Recognition and classification techniques of marine mammal calls based on LSTM and expanded causal convolution
Marine mammal calls play a vital role in navigation, localization, and communication. Effectively classifying these calls is essential for ecological monitoring, species conservation, and military biomimetic applications. However, traditional machine learning methods struggle to capture complex acou...
| Main Authors: | Wanlu Cheng, Hao Chen, Jiaming Jiang, Shuang Li, Jingjing Wang, Yanping Zhou |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Frontiers Media S.A., 2025-05-01 |
| Series: | Frontiers in Marine Science |
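| Subjects: | marine mammals; marine mammal call recognition and classification; transfer learning; LSTM; expansive causal convolutional networks |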
| Online Access: | https://www.frontiersin.org/articles/10.3389/fmars.2025.1603090/full |
| _version_ | 1849715186144903168 |
|---|---|
| author | Wanlu Cheng, Hao Chen, Jiaming Jiang, Shuang Li, Jingjing Wang, Yanping Zhou |
| author_facet | Wanlu Cheng, Hao Chen, Jiaming Jiang, Shuang Li, Jingjing Wang, Yanping Zhou |
| author_sort | Wanlu Cheng |
| collection | DOAJ |
| description | Marine mammal calls play a vital role in navigation, localization, and communication. Effectively classifying these calls is essential for ecological monitoring, species conservation, and military biomimetic applications. However, traditional machine learning methods struggle to capture complex acoustic patterns, while most existing deep learning approaches rely solely on frequency-domain features and require large datasets, which limits their performance on small-scale marine mammal datasets. To address these challenges, we propose a hybrid architecture combining a time-attention Long Short-Term Memory (LSTM) network and a multi-scale dilated causal convolutional network. The model comprises three modules: (1) a frequency-domain feature extraction module employing dilated causal convolutions at multiple scales to capture multi-resolution spectral information from Mel spectrograms; (2) a time-domain feature extraction module that inputs Mel-frequency cepstral coefficients (MFCCs) into an LSTM enhanced with a time-attention mechanism to highlight key temporal features; and (3) a classification module leveraging transfer learning, where a pre-trained neural network is fine-tuned on real marine mammal call data to improve performance. Extensive experiments were conducted on vocalizations from four marine mammal species. Our proposed method outperformed existing baseline models across four evaluation metrics: accuracy, precision, recall, and F1 score, with improvements of 3%, 7%, 2%, and 4%, respectively. The results confirm the effectiveness of combining frequency- and time-domain features along with attention mechanisms and transfer learning. This hybrid approach enhances the accuracy and robustness of marine mammal call classification, especially under limited data conditions. |
| format | Article |
| id | doaj-art-71def9fcf78a4c1f8101869e12eebb5e |
| institution | DOAJ |
| issn | 2296-7745 |
| language | English |
| publishDate | 2025-05-01 |
| publisher | Frontiers Media S.A. |
| record_format | Article |
| series | Frontiers in Marine Science |
| spelling | doaj-art-71def9fcf78a4c1f8101869e12eebb5e; 2025-08-20T03:13:29Z; eng; Frontiers Media S.A.; Frontiers in Marine Science; 2296-7745; 2025-05-01; 12; 10.3389/fmars.2025.1603090; 1603090; Recognition and classification techniques of marine mammal calls based on LSTM and expanded causal convolution; Wanlu Cheng [0, 1]; Hao Chen [2]; Jiaming Jiang [3, 4]; Shuang Li [5, 6]; Jingjing Wang [7, 8]; Yanping Zhou [9]; [0] School of Information Science and Technology, Qingdao University of Science and Technology, Qingdao, China; [1] Shandong Key Laboratory of Deep Sea Equipment Intelligent Networking, Qingdao, China; [2] School of Mechanical Engineering, Ilmenau University of Technology, Ilmenau, Germany; [3] School of Information Science and Technology, Qingdao University of Science and Technology, Qingdao, China; [4] Shandong Key Laboratory of Deep Sea Equipment Intelligent Networking, Qingdao, China; [5] School of Information Science and Technology, Qingdao University of Science and Technology, Qingdao, China; [6] Shandong Key Laboratory of Deep Sea Equipment Intelligent Networking, Qingdao, China; [7] School of Information Science and Technology, Qingdao University of Science and Technology, Qingdao, China; [8] Shandong Key Laboratory of Deep Sea Equipment Intelligent Networking, Qingdao, China; [9] School of Information Science and Technology, Qingdao University of Science and Technology, Qingdao, China; https://www.frontiersin.org/articles/10.3389/fmars.2025.1603090/full; marine mammals; marine mammal call recognition and classification; transfer learning; LSTM; expansive causal convolutional networks |
| title | Recognition and classification techniques of marine mammal calls based on LSTM and expanded causal convolution |
| topic | marine mammals; marine mammal call recognition and classification; transfer learning; LSTM; expansive causal convolutional networks |
| url | https://www.frontiersin.org/articles/10.3389/fmars.2025.1603090/full |
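
Two of the building blocks named in the description, dilated causal convolution at multiple scales and attention-weighted pooling over time, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the kernel values, and the dilation rates `(1, 2, 4)` are hypothetical, the convolution runs over a single 1-D sequence rather than a full Mel spectrogram, and the LSTM itself is omitted, so the attention step is applied to an arbitrary matrix standing in for LSTM hidden states.

```python
import numpy as np


def dilated_causal_conv1d(x, kernel, dilation):
    """Causal 1-D convolution: out[t] uses only x[t], x[t-d], x[t-2d], ...

    x: (T,) input sequence; kernel: (K,) weights, kernel[0] applies to x[t].
    """
    T, K = len(x), len(kernel)
    pad = (K - 1) * dilation
    # Left-pad with zeros so that no future sample is ever read.
    padded = np.concatenate([np.zeros(pad), x])
    out = np.zeros(T)
    for k in range(K):
        start = pad - k * dilation  # tap k looks back k * dilation steps
        out += kernel[k] * padded[start:start + T]
    return out


def multi_scale_features(x, kernel, dilations=(1, 2, 4)):
    """Stack the outputs of several dilation rates: shape (len(dilations), T)."""
    return np.stack([dilated_causal_conv1d(x, kernel, d) for d in dilations])


def time_attention_pool(hidden, w):
    """Softmax-weighted sum over time steps.

    hidden: (T, H) sequence of feature vectors (assumed to come from an LSTM);
    w: (H,) scoring vector (hypothetical; learned in a real model).
    Returns the pooled (H,) vector and the (T,) attention weights.
    """
    scores = hidden @ w
    weights = np.exp(scores - scores.max())  # subtract max for numerical stability
    weights /= weights.sum()
    return weights @ hidden, weights


if __name__ == "__main__":
    x = np.array([1.0, 2.0, 3.0, 4.0])
    kernel = np.array([1.0, 1.0])
    # With this kernel and dilation d, each output is x[t] + x[t-d].
    print(multi_scale_features(x, kernel))
    hidden = np.arange(6.0).reshape(3, 2)
    print(time_attention_pool(hidden, np.zeros(2)))
```

Larger dilation rates let the same small kernel cover a longer history, which is the sense in which stacking several rates yields multi-resolution features. The attention weights sum to one, so the pooled vector is a weighted average of the time steps.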