Modality-based Modeling with Data Balancing and Dimensionality Reduction for Early Stunting Detection

Bibliographic Details
Main Authors: Yohanes Setiawan, Mohammad Hamim Zajuli Al Faroby, Mochamad Nizar Palefi Ma’ady, I Made Wisnu Adi Sanjaya, Cisa Valentino Cahya Ramadhani
Format: Article
Language: English
Published: Department of Informatics, UIN Sunan Gunung Djati Bandung 2025-04-01
Series: JOIN: Jurnal Online Informatika
Online Access: https://join.if.uinsgd.ac.id/index.php/join/article/view/1495
Description
Summary: In Indonesia, the stunting rate has reached 36%, significantly higher than the World Health Organization's (WHO) standard of 20%. This high prevalence underscores the urgent need for effective early detection methods. Traditional data mining approaches to stunting detection have relied mainly on unimodal data, either tabular or image data alone, which limits the comprehensiveness and accuracy of the detection models. Modality-based modeling, which integrates image and tabular data, can provide a more holistic view and improve detection accuracy. This research analyzes modality-based modeling for the early detection of stunting under two settings: unimodal modeling, which uses tabular or image data alone, and multimodal modeling, which combines both before classification. The main contributions are a comprehensive framework for modality-based analysis, the application of advanced data preprocessing techniques, and a comparison of several machine learning algorithms to identify the best model for stunting detection. The dataset, comprising images and tabular data, is sourced from Posyandu in Sidoarjo, Indonesia. Image data is preprocessed with background segmentation and feature extraction using the Gray Level Co-occurrence Matrix (GLCM), while tabular data is processed through categorical encoding. The Synthetic Minority Oversampling Technique (SMOTE) addresses class imbalance, and Principal Component Analysis (PCA) is used for dimensionality reduction. The best F1 scores achieved are 0.96, 0.91, and 0.90 for the tabular-only, image-only, and image-tabular modalities, respectively, demonstrating the effectiveness of the data balancing and dimensionality reduction techniques.
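As a rough illustration of two preprocessing steps named in the summary, the sketch below computes a normalized GLCM with one texture feature (contrast) and generates SMOTE-style synthetic minority samples by interpolating between neighbours. It is not the authors' implementation: the function names, the single pixel offset, the number of gray levels, and the neighbour count `k` are all assumptions made for this example, and it uses only the Python standard library rather than whatever toolkit the study used.

```python
import random


def glcm(image, levels, dx=1, dy=0):
    """Normalized gray-level co-occurrence matrix for one pixel offset.

    `image` is a list of rows of integer gray levels in [0, levels).
    The offset (dx, dy) is an assumption; the record does not state it.
    """
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
    total = sum(sum(row) for row in counts)
    if total == 0:
        raise ValueError("image too small for the chosen offset")
    return [[v / total for v in row] for row in counts]


def glcm_contrast(p):
    """Contrast feature: sum of p[i][j] * (i - j)^2 over all level pairs."""
    n = len(p)
    return sum(p[i][j] * (i - j) ** 2 for i in range(n) for j in range(n))


def smote_like(minority, n_new, k=2, seed=0):
    """SMOTE-style oversampling: interpolate between a minority sample
    and one of its k nearest minority neighbours (hypothetical helper)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance.
        neighbours = sorted(
            (m for m in minority if m is not x),
            key=lambda m: sum((a - b) ** 2 for a, b in zip(x, m)),
        )[:k]
        nb = rng.choice(neighbours)
        t = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([a + t * (b - a) for a, b in zip(x, nb)])
    return synthetic
```

In a fuller pipeline, features such as the contrast value would be collected per image, the minority class would be oversampled on the training split only, and PCA would then reduce the feature dimension before classification.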
ISSN: 2528-1682, 2527-9165