Transformer Models improve the acoustic recognition of buzz-pollinating bee species

Bibliographic Details
Main Authors: Alef Iury Siqueira Ferreira, Nádia Felix Felipe da Silva, Fernanda Neiva Mesquita, Thierson Couto Rosa, Stephen L. Buchmann, José Neiva Mesquita-Neto
Format: Article
Language: English
Published: Elsevier 2025-05-01
Series: Ecological Informatics
Subjects: Buzz-pollinated crops; Ecosystem services; Crop pollination; Deep learning
Online Access: http://www.sciencedirect.com/science/article/pii/S1574954125000196
author Alef Iury Siqueira Ferreira
Nádia Felix Felipe da Silva
Fernanda Neiva Mesquita
Thierson Couto Rosa
Stephen L. Buchmann
José Neiva Mesquita-Neto
collection DOAJ
description Buzz-pollinated crops, such as tomatoes, potatoes, kiwifruit, and blueberries, are among the highest-yielding agricultural products. The flowers of these cultivated plants have a specialized morphology with poricidal anthers that require vibration to achieve a full seed set. At least 446 bee species, in 82 genera, use floral sonication (buzz pollination) to collect pollen grains as food. Identifying and classifying these diverse, often look-alike bee species poses a challenge for taxonomists. Automated classification systems based on audible bee floral buzzes have been investigated to meet this need. Recently, convolutional neural network (CNN) models have demonstrated superior performance in recognizing and distinguishing bee-buzzing sounds compared to classical machine-learning (ML) classifiers. Nonetheless, the performance of CNNs remains unsatisfactory and can be improved. Therefore, we applied a novel transformer-based neural network architecture to the task of acoustic recognition of blueberry-pollinating bee species. We further compared the performance of the Audio Spectrogram Transformer (AST) model and its variants, including Self-Supervised AST (SSAST) and Masked Autoencoding AST (MAE-AST), to that of strong baseline CNN models based on previous work, on the task of bee species recognition. We also employed data augmentation techniques and evaluated these models with a data set of bee sounds recorded during visits to blueberry flowers in Chile (518 audio samples of 15 bee species). Our results revealed that transformer-based neural networks combined with pre-training and data augmentation outperformed CNN models (maximum F1-score: 64.5% ± 2; accuracy: 82.2% ± 0.8). These attention-based neural network architectures assigned bee buzzing sounds to their respective taxonomic categories more reliably than prior deep learning models. However, transformer approaches face challenges related to small dataset size and class imbalance, similar to CNNs and classical ML algorithms. Combining pre-training with data augmentation is crucial to increase the diversity and robustness of training data sets for the acoustic recognition of bee species. We document the potential of transformer architectures to improve the performance of audible bee species identification, offering promising new avenues for bioacoustic research and pollination ecology.
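The gap between the reported accuracy (82.2%) and F1-score (64.5%) is what one would expect on a class-imbalanced data set if the F1-score is averaged per species. The abstract does not say which averaging was used, so the sketch below assumes a macro-averaged F1. The species labels and predictions are hypothetical toy data, not taken from the study; the point is only to show how a classifier can score high accuracy while a rare class it never predicts drags the macro F1 down.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (assumed averaging scheme)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy data: one common species and two rare ones.
y_true = ["species_a"] * 8 + ["species_b", "species_c"]
# The rare species_b sample is misclassified as the common species_a.
y_pred = ["species_a"] * 8 + ["species_a", "species_c"]

print(f"accuracy = {accuracy(y_true, y_pred):.3f}")
print(f"macro F1 = {macro_f1(y_true, y_pred):.3f}")
```

Every class counts equally in the macro average, so a single missed rare species costs far more in F1 than in accuracy. This is one reason both metrics are worth reporting for small, imbalanced bioacoustic data sets.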
format Article
id doaj-art-4b9e3922ee694d8b83c7b85cf04e064e
institution OA Journals
issn 1574-9541
language English
publishDate 2025-05-01
publisher Elsevier
record_format Article
series Ecological Informatics
spelling doaj-art-4b9e3922ee694d8b83c7b85cf04e064e
Timestamp: 2025-08-20T02:03:46Z
Language: eng
Publisher: Elsevier
Journal: Ecological Informatics
ISSN: 1574-9541
Published: 2025-05-01
Volume: 86
Article number: 103010
DOI: 10.1016/j.ecoinf.2025.103010
Title: Transformer Models improve the acoustic recognition of buzz-pollinating bee species
Authors and affiliations:
Alef Iury Siqueira Ferreira, Nádia Felix Felipe da Silva, Fernanda Neiva Mesquita, Thierson Couto Rosa: Universidade Federal de Goiás, Instituto de Informática, Goiânia, 74690-900, Goiás, Brazil
Stephen L. Buchmann: Departments of Entomology and Ecology and Evolutionary Biology, University of Arizona, Tucson, 85721, AZ, USA
José Neiva Mesquita-Neto: Laboratorio Ecología de Abejas, Departamento de Biología y Química, Facultad de Ciencias Básicas, Universidad Católica del Maule, Talca, 3480112, Maule, Chile. Correspondence to: Avenida San Miguel 3696, Talca, Región del Maule, Chile.
Online Access: http://www.sciencedirect.com/science/article/pii/S1574954125000196
Keywords: Buzz-pollinated crops; Ecosystem services; Crop pollination; Deep learning
title Transformer Models improve the acoustic recognition of buzz-pollinating bee species
topic Buzz-pollinated crops
Ecosystem services
Crop pollination
Deep learning
url http://www.sciencedirect.com/science/article/pii/S1574954125000196