The Improved Kurdish Dialect Classification Using Data Augmentation and ANOVA-Based Feature Selection
Classifying Kurdish dialects is difficult because the phonetic distinctions among them are subtle. In this research, we applied advanced methods to improve the accuracy of Kurdish dialect classification. We examined the dataset’s stability and variation through the use of t...
| Main Authors: | Karzan J. Ghafoor, Sarkhel H. Karim, Karwan M. Hama Rawf, Ayub O. Abdulrahman |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Koya University, 2025-03-01 |
| Series: | ARO-The Scientific Journal of Koya University |
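| Subjects: | 1D convolutional neural network; Data augmentation; Feature selection; Kurdish dialect identification; Sound feature |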
| Online Access: | https://aro.koyauniversity.org/index.php/aro/article/view/1897 |
| _version_ | 1850252085264646144 |
|---|---|
| author | Karzan J. Ghafoor, Sarkhel H. Karim, Karwan M. Hama Rawf, Ayub O. Abdulrahman |
| author_sort | Karzan J. Ghafoor |
| collection | DOAJ |
| description | Classifying Kurdish dialects is difficult because the phonetic distinctions among them are subtle. In this research, we applied advanced methods to improve the accuracy of Kurdish dialect classification. We examined the dataset’s stability and variation using time-stretching and noise-augmentation methods. An analysis of variance (ANOVA) filter approach is applied to make feature selection (FS) more efficient and to highlight the features most relevant to dialect classification. The ANOVA filter method ranks features by how strongly their means differ across dialect groups, which improved FS. To improve dialect classification, a 1D convolutional neural network model was given the dataset after ANOVA FS had been applied to it. The model performed very strongly, reaching an accuracy of 99.42%, which exceeds the 95.5% accuracy reported in former research. The findings demonstrate that combining time stretching with FS methods can improve the accuracy of Kurdish dialect classification. This project improves our understanding and implementation of machine learning in the field of linguistic diversity and dialectology. |
| format | Article |
| id | doaj-art-b6d24b8a6f574bc0944a156270463259 |
| institution | OA Journals |
| issn | 2410-9355 2307-549X |
| language | English |
| publishDate | 2025-03-01 |
| publisher | Koya University |
| record_format | Article |
| series | ARO-The Scientific Journal of Koya University |
| spelling | doaj-art-b6d24b8a6f574bc0944a156270463259; 2025-08-20T01:57:44Z; eng; Koya University; ARO-The Scientific Journal of Koya University; 2410-9355; 2307-549X; 2025-03-01; 13; 1; 10.14500/aro.11897; The Improved Kurdish Dialect Classification Using Data Augmentation and ANOVA-Based Feature Selection; Karzan J. Ghafoor (0), Sarkhel H. Karim (1), Karwan M. Hama Rawf (2), Ayub O. Abdulrahman (3); Computer Science Department, College of Science, University of Halabja, Halabja, 46018, Kurdistan Region - F.R. Iraq (listed for each of the four authors); https://aro.koyauniversity.org/index.php/aro/article/view/1897; 1D convolutional neural network; Data augmentation; Feature selection; Kurdish dialect identification; Sound feature |
| title | The Improved Kurdish Dialect Classification Using Data Augmentation and ANOVA-Based Feature Selection |
| topic | 1D convolutional neural network; Data augmentation; Feature selection; Kurdish dialect identification; Sound feature |
| url | https://aro.koyauniversity.org/index.php/aro/article/view/1897 |
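
The description states that the ANOVA filter ranks features based on the means of the different dialect groups. A minimal sketch of that general idea is given here, assuming a standard one-way ANOVA F statistic per feature; it is not the authors' implementation. The function names `anova_f_scores` and `select_top_k`, the toy feature rows, and the labels `"a"`, `"b"`, `"c"` are hypothetical and do not come from the article.

```python
from collections import defaultdict


def anova_f_scores(rows, labels):
    """Return a one-way ANOVA F statistic for each feature column.

    rows   : list of equal-length feature vectors (one per sample)
    labels : list of group labels (here: hypothetical dialect names)
    """
    groups = defaultdict(list)
    for row, label in zip(rows, labels):
        groups[label].append(row)

    n_samples, n_groups, n_features = len(rows), len(groups), len(rows[0])
    scores = []
    for j in range(n_features):
        grand_mean = sum(row[j] for row in rows) / n_samples
        between = 0.0  # spread of the group means around the grand mean
        within = 0.0   # spread of the samples around their own group mean
        for members in groups.values():
            values = [row[j] for row in members]
            mean = sum(values) / len(values)
            between += len(values) * (mean - grand_mean) ** 2
            within += sum((v - mean) ** 2 for v in values)
        ms_between = between / (n_groups - 1)
        ms_within = within / (n_samples - n_groups)
        if ms_within > 0:
            scores.append(ms_between / ms_within)
        else:
            # No within-group spread: treat as perfectly separating,
            # unless the group means are also identical.
            scores.append(float("inf") if ms_between > 0 else 0.0)
    return scores


def select_top_k(rows, labels, k):
    """Keep the k feature columns with the highest F statistic."""
    scores = anova_f_scores(rows, labels)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    keep = sorted(ranked[:k])
    return keep, [[row[j] for j in keep] for row in rows]


# Toy data (assumed): feature 0 separates the three groups, feature 1 does not.
rows = [[1.0, 5.0], [1.2, 3.0], [3.0, 4.0], [3.2, 6.0], [5.0, 5.0], [5.2, 4.0]]
labels = ["a", "a", "b", "b", "c", "c"]
scores = anova_f_scores(rows, labels)
keep, reduced = select_top_k(rows, labels, k=1)
```

In this sketch a feature scores highly when its group means lie far apart relative to the spread inside each group, so the reduced matrix keeps only the columns most likely to separate the classes before they are passed to a classifier.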