CKSD: Comprehensive Kurdish-Sorani database
Every individual communicates in a specific language, and each language has its own letters and features that distinguish it from other languages. Ideas, cultures, and sciences are exchanged through language-related tasks, including the retrieval, translation, and classification of text...
| Main Authors: | Jihad Anwar Qadir, Samer Kais Jameel, Wshyar Omar Khudhur, Kamaran H. Manguri |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Lublin University of Technology, 2025-03-01 |
| Series: | Informatyka, Automatyka, Pomiary w Gospodarce i Ochronie Środowiska |
| Subjects: | |
| Online Access: | https://ph.pollub.pl/index.php/iapgos/article/view/6521 |
| _version_ | 1849391633937727488 |
|---|---|
| author | Jihad Anwar Qadir Samer Kais Jameel Wshyar Omar Khudhur Kamaran H. Manguri |
| author_facet | Jihad Anwar Qadir Samer Kais Jameel Wshyar Omar Khudhur Kamaran H. Manguri |
| author_sort | Jihad Anwar Qadir |
| collection | DOAJ |
| description | Every individual communicates in a specific language, and each language has its own letters and features that distinguish it from other languages. Ideas, cultures, and sciences are exchanged through language-related tasks, including the retrieval, translation, and classification of texts from books, journals, research, and the internet. These tasks depend on the availability of databases. Unfortunately, Kurdish language databases may be rare or non-existent. The present study introduces a Comprehensive Kurdish-Sorani Database (CKSD), which contains datasets of dates, letters, and common words in the Kurdish language, as well as the documents from which these datasets were extracted. The elements of these collections were extracted from documents written in 27 different fonts, which makes the CKSD comprehensive and useful to researchers. The data were then used in this study to determine how well classifiers can categorize them. The study demonstrated the reliability of the data and its suitability for machine learning and other artificial intelligence applications. |
| format | Article |
| id | doaj-art-86fbb33a87de4be9b9102c152fc9037b |
| institution | Kabale University |
| issn | 2083-0157 2391-6761 |
| language | English |
| publishDate | 2025-03-01 |
| publisher | Lublin University of Technology |
| record_format | Article |
| series | Informatyka, Automatyka, Pomiary w Gospodarce i Ochronie Środowiska |
| spelling | doaj-art-86fbb33a87de4be9b9102c152fc9037b; 2025-08-20T03:41:00Z; eng; Lublin University of Technology; Informatyka, Automatyka, Pomiary w Gospodarce i Ochronie Środowiska; 2083-0157; 2391-6761; 2025-03-01; 15; 1; 10.35784/iapgos.6521; CKSD: Comprehensive Kurdish-Sorani database; Jihad Anwar Qadir (0), https://orcid.org/0000-0003-3958-814X; Samer Kais Jameel (1), https://orcid.org/0000-0003-2236-9303; Wshyar Omar Khudhur (2); Kamaran H. Manguri (3), https://orcid.org/0000-0001-8567-3367; University of Raparin, Software Engineering Department; University of Raparin, Department of Computer Science; Erbil Polytechnic University, Koya Technical Institute, Department of Information Technology; University of Raparin, Software Engineering Department; https://ph.pollub.pl/index.php/iapgos/article/view/6521; CKSD OCR; font recognition; character recognition; font style |
| spellingShingle | Jihad Anwar Qadir Samer Kais Jameel Wshyar Omar Khudhur Kamaran H. Manguri CKSD: Comprehensive Kurdish-Sorani database Informatyka, Automatyka, Pomiary w Gospodarce i Ochronie Środowiska CKSD OCR font recognition character recognition font style |
| title | CKSD: Comprehensive Kurdish-Sorani database |
| title_full | CKSD: Comprehensive Kurdish-Sorani database |
| title_fullStr | CKSD: Comprehensive Kurdish-Sorani database |
| title_full_unstemmed | CKSD: Comprehensive Kurdish-Sorani database |
| title_short | CKSD: Comprehensive Kurdish-Sorani database |
| title_sort | cksd comprehensive kurdish sorani database |
| topic | CKSD OCR font recognition character recognition font style |
| url | https://ph.pollub.pl/index.php/iapgos/article/view/6521 |
| work_keys_str_mv | AT jihadanwarqadir cksdcomprehensivekurdishsoranidatabase AT samerkaisjameel cksdcomprehensivekurdishsoranidatabase AT wshyaromarkhudhur cksdcomprehensivekurdishsoranidatabase AT kamaranhmanguri cksdcomprehensivekurdishsoranidatabase |