Classification of Toraja, Batak and Ambon Languages using Decision Tree and Gradient Boost methods
| Main Authors: | , |
|---|---|
| Format: | Article |
| Language: | Indonesian |
| Published: | Islamic University of Indragiri, 2025-05-01 |
| Series: | Sistemasi: Jurnal Sistem Informasi |
| Subjects: | |
| Online Access: | https://sistemasi.ftik.unisi.ac.id/index.php/stmsi/article/view/5100 |
| Summary: | With its rich diversity of ethnicities, cultures, races, and religions, Indonesia is one of the countries with the highest number of regional languages in the world. This linguistic diversity often leads to communication challenges, particularly when conveying information or engaging in textual conversations. This study aims to identify and classify the Toraja, Batak, and Ambon languages using machine learning-based computational methods. The techniques employed include Decision Tree and Gradient Boost algorithms to evaluate the accuracy of each model. The results demonstrate that both Decision Tree and Gradient Boost are effective in language identification, achieving accuracy rates above 77%. However, based on the confusion matrix analysis, the Gradient Boost method proved to be more effective, with an accuracy rate of 81.06%, compared to 78.39% achieved by the Decision Tree. These findings suggest that Gradient Boost offers better performance for classifying these regional languages. |
| ISSN: | 2302-8149 2540-9719 |
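
The summary compares the two models by accuracy figures taken from a confusion matrix. As a minimal sketch of that calculation, overall accuracy is the sum of the diagonal (correct predictions) divided by the total number of samples. The counts below are invented for illustration and are not the study's data, and the function name `accuracy_from_confusion` is hypothetical.

```python
def accuracy_from_confusion(matrix):
    """Return overall accuracy: diagonal (correct) counts / all counts."""
    total = sum(sum(row) for row in matrix)
    if total == 0:
        raise ValueError("confusion matrix is empty")
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


# Hypothetical 3x3 confusion matrix (NOT the study's results).
# Rows are true labels, columns are predicted labels,
# in the assumed order: Toraja, Batak, Ambon.
example = [
    [40, 5, 5],
    [4, 42, 4],
    [6, 4, 40],
]

print(f"accuracy = {accuracy_from_confusion(example):.2%}")
```

The same function applies to any square matrix of counts, so each model's confusion matrix could be passed in turn to compare them on equal terms.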