COMPARATIVE ANALYSIS OF NEURAL NETWORK MODELS FOR THE PROBLEM OF SPEAKER RECOGNITION
The subject of the article is the set of neural network models designed or adapted for voice analysis in the context of speaker identification and verification. The goal of this work is to perform a comparative analysis of relevant neural network models in order to de...
| Main Authors: | Vladyslav Kholiev, Olesia Barkovska |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Kharkiv National University of Radio Electronics, 2023-08-01 |
| Series: | Сучасний стан наукових досліджень та технологій в промисловості |
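| Subjects: | comparative analysis; neural network; intellectual models; model; machine learning; speaker identification; speaker recognition |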
| Online Access: | https://itssi-journal.com/index.php/ittsi/article/view/400 |
| _version_ | 1850111595139563520 |
|---|---|
| author | Vladyslav Kholiev; Olesia Barkovska |
| author_facet | Vladyslav Kholiev; Olesia Barkovska |
| author_sort | Vladyslav Kholiev |
| collection | DOAJ |
| description | The subject of the article is the set of neural network models designed or adapted for voice analysis in the context of speaker identification and verification. The goal of this work is to perform a comparative analysis of relevant neural network models in order to determine the model(s) that best meet the formulated criteria: model type, programming language of the model's implementation, parallelization potential, binary or multiclass classification, accuracy, and computational complexity. Some of these criteria, such as accuracy and computational complexity, were chosen because they are universally important regardless of the particular application. Others were chosen because of the architecture and challenges of the scientific communication system described in the work, which performs speaker identification and verification. The relevance of the paper lies in the prevalence of audio as a communication medium, which results in a wide range of practical applications of audio intelligence in various fields of human activity (business, law, military), as well as in the need to enable and encourage an efficient environment for inward-facing audio-based scientific communication among young scientists, so that they can accelerate their research and acquire scientific communication skills. To achieve the goal, the following tasks were solved: criteria for judging the models were formulated based on the needs and challenges of the proposed model; models designed for speaker identification and verification were reviewed according to the formulated criteria, with the results compiled into a comprehensive table; optimal models were determined in accordance with the formulated criteria. The following neural network based models were reviewed: SincNet, VGGVox, Jasper, TitaNet, SpeakerNet, ECAPA_TDNN. Conclusions: for future research and a practical solution to the problem of speaker authentication, it is reasonable to use a convolutional neural network implemented in the Python programming language, as Python offers a wide variety of development tools and libraries. |
| format | Article |
| id | doaj-art-29a976e00ec0463ca604af6031828b4a |
| institution | OA Journals |
| issn | 2522-9818 2524-2296 |
| language | English |
| publishDate | 2023-08-01 |
| publisher | Kharkiv National University of Radio Electronics |
| record_format | Article |
| series | Сучасний стан наукових досліджень та технологій в промисловості |
| spelling | doaj-art-29a976e00ec0463ca604af6031828b4a; 2025-08-20T02:37:36Z; eng; Kharkiv National University of Radio Electronics; Сучасний стан наукових досліджень та технологій в промисловості; 2522-9818; 2524-2296; 2023-08-01; 2(24); DOI 10.30837/ITSSI.2023.24.172; COMPARATIVE ANALYSIS OF NEURAL NETWORK MODELS FOR THE PROBLEM OF SPEAKER RECOGNITION; Vladyslav Kholiev (Kharkiv National University of Radio Electronics); Olesia Barkovska (Kharkiv National University of Radio Electronics); https://itssi-journal.com/index.php/ittsi/article/view/400; comparative analysis; neural network; intellectual models; model; machine learning; speaker identification; speaker recognition. |
| title | COMPARATIVE ANALYSIS OF NEURAL NETWORK MODELS FOR THE PROBLEM OF SPEAKER RECOGNITION |
| topic | comparative analysis; neural network; intellectual models; model; machine learning; speaker identification; speaker recognition. |
| url | https://itssi-journal.com/index.php/ittsi/article/view/400 |