Handling Semantic Relationships for Classification of Sparse Text: A Review

Bibliographic Details
Main Authors: Safuan, Ku Ruhana Ku-Mahamud
Format: Article
Language: English
Published: MDPI AG 2025-02-01
Series: Engineering Proceedings
Online Access: https://www.mdpi.com/2673-4591/84/1/61
Description
Summary: Classifying sparse text, which is common in short or specialized content, is a challenge for natural language processing. Noisy, short, or contextually limited inputs produce high-dimensional representations with few relevant features. This paper reviews approaches for handling semantic relationships in sparse text classification. Approaches such as FastText and Latent Dirichlet Allocation are discussed for addressing feature sparsity while maintaining semantic integrity. Embedding techniques such as Word2Vec and BERT are crucial for capturing contextual meaning and improving accuracy. Recent advances include hybrid models that combine deep learning with traditional methods for better performance. These approaches have been applied across various datasets, including social media and scientific publications. Finally, the paper reviews progress in using semantic relationships for sparse text classification and identifies open challenges and future research directions for better integrating semantic understanding.
ISSN: 2673-4591