TTG-Text: A Graph-Based Text Representation Framework Enhanced by Typical Testors for Improved Classification

Bibliographic Details
Main Authors: Carlos Sánchez-Antonio, José E. Valdez-Rodríguez, Hiram Calvo
Format: Article
Language: English
Published: MDPI AG 2024-11-01
Series: Mathematics
Online Access: https://www.mdpi.com/2227-7390/12/22/3576
collection DOAJ
description Recent advancements in graph-based text representation, particularly with embedding models and transformers such as BERT, have shown significant potential for enhancing natural language processing (NLP) tasks. However, challenges related to data sparsity and limited interpretability remain, especially when working with small or imbalanced datasets. This paper introduces TTG-Text, a novel framework that strengthens graph-based text representation by integrating typical testors—a symbolic feature selection technique that refines feature importance while reducing dimensionality. Unlike traditional TF-IDF weighting, TTG-Text leverages typical testors to enhance feature relevance within text graphs, resulting in improved model interpretability and performance, particularly for smaller datasets. Our evaluation on a text classification task using a graph convolutional network (GCN) demonstrates that TTG-Text achieves a 95% accuracy rate, surpassing conventional methods and BERT with fewer required training epochs. By combining symbolic algorithms with graph-based models, this hybrid approach offers a more interpretable, efficient, and high-performing solution for complex NLP tasks.
id doaj-art-836ea1c5f16b413daa327d0f5cfc5d37
institution DOAJ
issn 2227-7390
spelling doaj-art-836ea1c5f16b413daa327d0f5cfc5d37 (record timestamp 2025-08-20T02:48:05Z)
DOI: 10.3390/math12223576
Citation: Mathematics (MDPI AG), ISSN 2227-7390, vol. 12, no. 22, article 3576, 2024-11-01
Author affiliation (all three authors): Cognitive Sciences Laboratory, Center for Computing Research, Instituto Politécnico Nacional, Mexico City 07738, Mexico
topic graph-based text representation
typical testors
text classification
TF-IDF
graph convolutional networks (GCNs)
natural language processing (NLP)
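
Illustrative note on the "typical testors" topic listed in this record: the description calls typical testors a symbolic feature selection technique that reduces dimensionality. As a minimal sketch, and not the authors' implementation, the code below uses the standard definition: a testor is a feature subset on which no two objects from different classes look identical, and a typical testor is a testor with no proper subset that is also a testor. The function names, the toy term-presence matrix, and the brute-force search are assumptions made for illustration only. The search is exponential in the number of features, and how TTG-Text turns typical testors into graph weights is not reproduced here.

```python
from itertools import combinations


def is_testor(rows, labels, features):
    """Return True if no two rows from different classes agree on every selected feature."""
    for (r1, y1), (r2, y2) in combinations(zip(rows, labels), 2):
        if y1 != y2 and all(r1[f] == r2[f] for f in features):
            return False
    return True


def typical_testors(rows, labels):
    """Enumerate irreducible testors by brute force, smallest subsets first (hypothetical helper)."""
    n_features = len(rows[0])
    found = []
    for size in range(1, n_features + 1):
        for subset in combinations(range(n_features), size):
            # A superset of an already-found typical testor is reducible, so skip it.
            if any(set(t) <= set(subset) for t in found):
                continue
            if is_testor(rows, labels, subset):
                found.append(subset)
    return found


# Toy data (assumed): each row is a document, each column marks presence of a term.
rows = [
    (1, 0, 1, 0),  # class A
    (1, 1, 0, 0),  # class A
    (0, 1, 1, 0),  # class B
    (0, 0, 1, 1),  # class B
]
labels = ["A", "A", "B", "B"]

result = typical_testors(rows, labels)
print(result)  # [(0,), (1, 2, 3)]
```

In this toy matrix, term 0 alone separates the two classes, so it forms a typical testor of size one; terms 1, 2, and 3 separate the classes only in combination.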