A Multi-Scale Feature Fusion Hybrid Convolution Attention Model for Birdsong Recognition

Bibliographic Details
Main Authors: Lianglian Gu, Guangzhi Di, Danju Lv, Yan Zhang, Yueyun Yu, Wei Li, Ziqian Wang
Format: Article
Language: English
Published: MDPI AG 2025-04-01
Series: Applied Sciences
Online Access: https://www.mdpi.com/2076-3417/15/8/4595
Description
Summary: Birdsong is a valuable indicator of biodiversity and carries ecological significance. Although feature extraction has demonstrated satisfactory performance in birdsong classification, single-scale feature extraction methods may not fully capture the complexity of birdsong, potentially leading to suboptimal classification outcomes. Integrating multi-scale feature extraction and fusion enables a model to better handle scale variations, enhancing its adaptability across different scales. To this end, we propose a multi-scale hybrid convolutional attention mechanism model (MUSCA). The method combines depthwise separable convolution and traditional convolution for feature extraction and incorporates self-attention and spatial attention mechanisms to refine spatial and channel features, thereby improving the effectiveness of multi-scale feature extraction. To further enhance multi-scale feature fusion, a layer-by-layer alignment feature fusion method is developed to establish a deeper correlation, thereby improving classification accuracy and robustness. Using this method, we identified 20 bird species on three types of spectrogram (wavelet spectrogram, log-Mel spectrogram and log-spectrogram), with recognition rates of 93.79%, 96.97% and 95.44%, respectively. Compared with the ResNet18 model, these rates are higher by 3.26%, 1.88% and 3.09%, respectively. The results indicate that the proposed MUSCA method is competitive with recent and state-of-the-art methods.
ISSN: 2076-3417
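
Note on the summary: the abstract pairs depthwise separable convolution with traditional convolution but does not say why or give layer sizes. One commonly cited motivation is weight count, which the minimal sketch below illustrates. The channel widths (64 in, 128 out) and the kernel sizes (3, 5, 7) are hypothetical values chosen for illustration, not taken from the article, and bias terms are ignored.

```python
# Weight counts for one 2D convolution layer, ignoring bias terms.
# All sizes used below are illustrative assumptions, not MUSCA's settings.

def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """A standard k x k convolution learns one k x k filter per (input, output) channel pair."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Depthwise step: one k x k filter per input channel.
    Pointwise step: a 1 x 1 convolution that mixes channels."""
    depthwise = k * k * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise


if __name__ == "__main__":
    c_in, c_out = 64, 128      # hypothetical channel widths
    for k in (3, 5, 7):        # hypothetical multi-scale kernel sizes
        std = standard_conv_params(c_in, c_out, k)
        sep = depthwise_separable_params(c_in, c_out, k)
        print(f"k={k}: standard={std}, separable={sep}, ratio={std / sep:.1f}")
```

Under these assumptions, the saving grows with kernel size, which may be one reason to favour the separable form for larger-scale branches in a multi-scale design. How MUSCA actually divides the two convolution types is described only in the full article.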