Biomimetic Computing for Efficient Spoken Language Identification
Spoken Language Identification (SLID)-based applications have become increasingly important in everyday life, driven by advancements in artificial intelligence and machine learning. Multilingual countries utilize the SLID method to facilitate speech detection. This is accomplished by determining the...
| Main Authors: | Gaurav Kumar, Saurabh Bhardwaj |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-05-01 |
| Series: | Biomimetics |
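| Subjects: | spoken language identification; dung beetle optimization; long short-term memory; Bayesian optimization; deep learning; multi-class spoken language identification |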
| Online Access: | https://www.mdpi.com/2313-7673/10/5/316 |
| _version_ | 1849327520276545536 |
|---|---|
| author | Gaurav Kumar; Saurabh Bhardwaj |
| author_facet | Gaurav Kumar; Saurabh Bhardwaj |
| author_sort | Gaurav Kumar |
| collection | DOAJ |
| description | Applications based on Spoken Language Identification (SLID) have become increasingly important in everyday life, driven by advances in artificial intelligence and machine learning. Multilingual countries use SLID to facilitate speech detection by determining the language of spoken segments with language recognizers. In multilingual datasets, however, languages that share a common origin are difficult to classify accurately with automatic techniques. A further challenge is the large variance in speech signals caused by differences in speaker, content, acoustic setting, and language, by changes in voice modulation with age and gender, and by variations in speech patterns. In this study, we introduce the DBODL-MSLIS approach, which integrates biomimetic optimization techniques inspired by natural intelligence to enhance language classification. The proposed method employs Dung Beetle Optimization (DBO) with deep learning, simulating the beetle’s foraging behavior to optimize feature selection and classification performance. Speech preprocessing comprises pre-emphasis, windowing, and frame blocking, followed by feature extraction using pitch, energy, the Discrete Wavelet Transform (DWT), and the zero-crossing rate (ZCR). The DBO algorithm then selects features, removing redundant ones to improve efficiency and accuracy. Spoken languages are classified using a long short-term memory (LSTM) network in conjunction with Bayesian optimization (BO). The DBODL-MSLIS technique was experimentally validated on the IIIT Spoken Language dataset, yielding an average accuracy of 95.54% and an F-score of 84.31%. It surpasses various other state-of-the-art models, such as SVM, MLP, LDA, DLA-ASLISS, HMHFS-IISLFAS, GA base fusion, and VGG-16. We also compared its accuracy against state-of-the-art biomimetic computing models such as GA, PSO, GWO, DE, and ACO. While ACO achieved up to 89.45% accuracy, our Bayesian optimization with LSTM outperformed all others, reaching a peak accuracy of 95.55%, which demonstrates its effectiveness for spoken language identification. The technique shows promising potential for practical applications in multilingual voice processing. |
| format | Article |
| id | doaj-art-e44d06f2ec80496e96a3b2b28d2b146a |
| institution | Kabale University |
| issn | 2313-7673 |
| language | English |
| publishDate | 2025-05-01 |
| publisher | MDPI AG |
| record_format | Article |
| series | Biomimetics |
| spelling | doaj-art-e44d06f2ec80496e96a3b2b28d2b146a; 2025-08-20T03:47:52Z; eng; MDPI AG; Biomimetics; 2313-7673; 2025-05-01; 10(5), 316; 10.3390/biomimetics10050316; Biomimetic Computing for Efficient Spoken Language Identification; Gaurav Kumar (0), Saurabh Bhardwaj (1); (0) Department of Electrical and Instrumentation Engineering, Thapar Institute of Engineering and Technology, Patiala 147001, India; (1) Department of Electrical and Instrumentation Engineering, Thapar Institute of Engineering and Technology, Patiala 147001, India; https://www.mdpi.com/2313-7673/10/5/316; spoken language identification; dung beetle optimization; long short-term memory; Bayesian optimization; deep learning; multi-class spoken language identification |
| spellingShingle | Gaurav Kumar; Saurabh Bhardwaj; Biomimetic Computing for Efficient Spoken Language Identification; Biomimetics; spoken language identification; dung beetle optimization; long short-term memory; Bayesian optimization; deep learning; multi-class spoken language identification |
| title | Biomimetic Computing for Efficient Spoken Language Identification |
| title_full | Biomimetic Computing for Efficient Spoken Language Identification |
| title_fullStr | Biomimetic Computing for Efficient Spoken Language Identification |
| title_full_unstemmed | Biomimetic Computing for Efficient Spoken Language Identification |
| title_short | Biomimetic Computing for Efficient Spoken Language Identification |
| title_sort | biomimetic computing for efficient spoken language identification |
| topic | spoken language identification; dung beetle optimization; long short-term memory; Bayesian optimization; deep learning; multi-class spoken language identification |
| url | https://www.mdpi.com/2313-7673/10/5/316 |
| work_keys_str_mv | AT gauravkumar biomimeticcomputingforefficientspokenlanguageidentification AT saurabhbhardwaj biomimeticcomputingforefficientspokenlanguageidentification |
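
The description above names a preprocessing chain (pre-emphasis, frame blocking, windowing) and several frame-level features, including energy and zero-crossing rate. The following is a minimal sketch of those steps only, not the authors' implementation. The pre-emphasis coefficient of 0.97, the Hamming window, the frame length and hop size, and all function names are assumptions chosen for illustration; the record does not state them. Pitch, DWT features, DBO feature selection, and the BO-tuned LSTM are left out.

```python
import math


def pre_emphasis(signal, alpha=0.97):
    """Apply y[n] = x[n] - alpha * x[n-1]; alpha=0.97 is an assumed value."""
    if not signal:
        return []
    return [signal[0]] + [
        signal[n] - alpha * signal[n - 1] for n in range(1, len(signal))
    ]


def frame_blocks(signal, frame_len, hop):
    """Split a signal into overlapping frames; a trailing partial frame is dropped."""
    return [
        signal[start:start + frame_len]
        for start in range(0, len(signal) - frame_len + 1, hop)
    ]


def hamming(frame):
    """Multiply a frame by a Hamming window (an assumed window choice)."""
    n = len(frame)
    if n < 2:
        return list(frame)
    return [
        x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
        for i, x in enumerate(frame)
    ]


def short_time_energy(frame):
    """Sum of squared samples in one frame."""
    return sum(x * x for x in frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def extract_features(signal, frame_len=400, hop=160, alpha=0.97):
    """Return one (energy, zcr) pair per frame.

    frame_len=400 and hop=160 correspond to 25 ms and 10 ms at an
    assumed 16 kHz sampling rate.
    """
    emphasized = pre_emphasis(signal, alpha)
    features = []
    for frame in frame_blocks(emphasized, frame_len, hop):
        windowed = hamming(frame)
        features.append(
            (short_time_energy(windowed), zero_crossing_rate(windowed))
        )
    return features
```

Calling `extract_features(samples)` on a list of float samples returns a list of per-frame `(energy, zcr)` tuples, which a later feature-selection or classification stage could consume.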