News Classification using Natural Language Processing with TF-IDF and Multinomial Naïve Bayes

Bibliographic Details
Main Authors: Nadira Alifia Ionendri, Feri Candra, Afdi Rizal
Format: Article
Language: Indonesian
Published: Indonesian Society of Applied Science (ISAS) 2025-06-01
Series: Journal of Applied Computer Science and Technology
Subjects:
Online Access: https://journal.isas.or.id/index.php/JACOST/article/view/1099
Description
Summary: Online news contains valuable insights into public phenomena that can support statistical analysis by institutions such as BPS Riau. However, news is currently classified manually, which is time-consuming and prone to human error. This study proposes an automated news classification system that uses Natural Language Processing (NLP) techniques, with Term Frequency–Inverse Document Frequency (TF-IDF) for feature extraction and the Multinomial Naïve Bayes algorithm for classification. The dataset was collected via web scraping and manually labeled with five statistical categories: poverty, unemployment, democracy, inflation, and economic growth. The system achieved a validation accuracy of 83% and a test accuracy of 90%, with an average precision of 0.85, recall of 0.93, and F1-score of 0.87. These results indicate that the proposed system can significantly reduce the manual workload of news classification and can be implemented in practice by BPS Riau to support accurate and timely statistical reporting.
ISSN: 2723-1453
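
The summary names TF-IDF feature extraction followed by Multinomial Naïve Bayes classification. The following is a minimal sketch of that general technique in plain Python, not the authors' implementation. Everything concrete in it is an assumption made for illustration: the English toy headlines are invented (the study used scraped, manually labeled news), the tokenizer is a simple whitespace split, the IDF uses a common smoothed form, ln((1 + N) / (1 + df)) + 1, and the classifier uses Laplace smoothing with alpha = 1. Only the five category names come from the summary.

```python
import math
from collections import Counter, defaultdict

# Invented toy headlines (assumption); labels are the five categories in the summary.
TRAIN = [
    ("poverty rate rises in rural villages", "poverty"),
    ("poor households receive food aid", "poverty"),
    ("unemployment climbs as factories cut jobs", "unemployment"),
    ("job seekers crowd the employment fair", "unemployment"),
    ("voters turn out for regional election", "democracy"),
    ("election commission reviews voter lists", "democracy"),
    ("inflation pushes food prices higher", "inflation"),
    ("consumer prices rise as inflation accelerates", "inflation"),
    ("regional economy records strong growth", "economic growth"),
    ("gdp growth beats forecast this quarter", "economic growth"),
]

def tokenize(text):
    """Lowercase whitespace tokenizer (a deliberate simplification)."""
    return text.lower().split()

def fit_idf(docs):
    """Smoothed inverse document frequency for every training term."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(tokenize(doc)))
    return {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}

def tfidf(doc, idf):
    """Raw term count times IDF; terms unseen in training are dropped."""
    counts = Counter(tokenize(doc))
    return {t: c * idf[t] for t, c in counts.items() if t in idf}

def train_nb(docs, labels, idf, alpha=1.0):
    """Multinomial Naive Bayes fitted on TF-IDF weights instead of raw counts."""
    weight = defaultdict(lambda: defaultdict(float))  # label -> term -> summed weight
    for doc, label in zip(docs, labels):
        for term, w in tfidf(doc, idf).items():
            weight[label][term] += w
    model = {}
    for label, n_docs in Counter(labels).items():
        denom = sum(weight[label].values()) + alpha * len(idf)
        log_lik = {t: math.log((weight[label].get(t, 0.0) + alpha) / denom)
                   for t in idf}
        model[label] = (math.log(n_docs / len(docs)), log_lik)
    return model

def predict(doc, idf, model):
    """Return the label with the highest log prior plus weighted log likelihood."""
    feats = tfidf(doc, idf)
    def score(label):
        log_prior, log_lik = model[label]
        return log_prior + sum(w * log_lik[t] for t, w in feats.items())
    return max(model, key=score)

docs = [text for text, _ in TRAIN]
labels = [label for _, label in TRAIN]
idf = fit_idf(docs)
model = train_nb(docs, labels, idf)

print(predict("inflation drives up prices", idf, model))
```

In practice a library would normally replace the hand-written parts; for example, scikit-learn provides `TfidfVectorizer` and `MultinomialNB`, which can be chained in a pipeline. The hand-written version is used here only to make each step of the technique visible.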