CKSD: Comprehensive Kurdish-Sorani database

Bibliographic Details
Main Authors: Jihad Anwar Qadir, Samer Kais Jameel, Wshyar Omar Khudhur, Kamaran H. Manguri
Format: Article
Language: English
Published: Lublin University of Technology 2025-03-01
Series: Informatyka, Automatyka, Pomiary w Gospodarce i Ochronie Środowiska
Subjects: CKSD, OCR, font recognition, character recognition, font style
Online Access: https://ph.pollub.pl/index.php/iapgos/article/view/6521
author Jihad Anwar Qadir
Samer Kais Jameel
Wshyar Omar Khudhur
Kamaran H. Manguri
collection DOAJ
description Every individual communicates in a specific language, and each language has its own letters and features that distinguish it from other languages. Ideas, cultures, and sciences are exchanged through language-processing tasks such as retrieval, translation, and classification of texts from journals, books, research, and the internet. These tasks depend on the availability of databases. Unfortunately, Kurdish-language databases may be rare or non-existent. The present study introduces a Comprehensive Kurdish-Sorani Database (CKSD), which contains datasets of dates, letters, and common words in the Kurdish language, as well as the documents used to extract these datasets. The elements of these collections were extracted from documents written in 27 different fonts, which makes the CKSD comprehensive and useful to researchers. The data were used in this study to determine how well classifiers can categorize them. The study demonstrated the reliability of the data and its suitability for machine learning and other artificial intelligence applications.
format Article
id doaj-art-86fbb33a87de4be9b9102c152fc9037b
institution Kabale University
issn 2083-0157
2391-6761
language English
publishDate 2025-03-01
publisher Lublin University of Technology
record_format Article
series Informatyka, Automatyka, Pomiary w Gospodarce i Ochronie Środowiska
spelling doaj-art-86fbb33a87de4be9b9102c152fc9037b
2025-08-20T03:41:00Z
eng
Lublin University of Technology
Informatyka, Automatyka, Pomiary w Gospodarce i Ochronie Środowiska
ISSN 2083-0157, 2391-6761
2025-03-01
Volume 15, Issue 1
DOI 10.35784/iapgos.6521
CKSD: Comprehensive Kurdish-Sorani database
Jihad Anwar Qadir (https://orcid.org/0000-0003-3958-814X), University of Raparin, Software Engineering Department
Samer Kais Jameel (https://orcid.org/0000-0003-2236-9303), University of Raparin, Department of Computer Science
Wshyar Omar Khudhur, Erbil Polytechnic University, Koya Technical Institute, Department of Information Technology
Kamaran H. Manguri (https://orcid.org/0000-0001-8567-3367), University of Raparin, Software Engineering Department
https://ph.pollub.pl/index.php/iapgos/article/view/6521
CKSD
OCR
font recognition
character recognition
font style
title CKSD: Comprehensive Kurdish-Sorani database
topic CKSD
OCR
font recognition
character recognition
font style
url https://ph.pollub.pl/index.php/iapgos/article/view/6521