WASPAS-Based Natural Language Processing Method for Handling Content Words Extraction and Ranking Issues: An Example of SDGs Corpus

Bibliographic Details
Main Authors:	Liang-Ching Chen, Kuei-Hu Chang, Jeng-Fung Hung
Format:	Article
Language:	English
Published:	MDPI AG 2025-03-01
Series:	Information
Subjects:	content word extraction; natural language processing (NLP); artificial intelligence (AI); corpus; the term frequency-inverse document frequency (TF-IDF) method; the weighted aggregated sum product assessment (WASPAS) method
Online Access:	https://www.mdpi.com/2078-2489/16/3/198

Description
Summary:	This paper addresses the challenges in extracting content words within the domains of natural language processing (NLP) and artificial intelligence (AI), using sustainable development goals (SDGs) corpora as verification examples. Traditional corpus-based methods and the term frequency-inverse document frequency (TF-IDF) method face limitations, including the inability to automatically eliminate function words, effectively extract quantitative data for the relevant parameters, simultaneously consider frequency and range parameters to evaluate the terms’ overall importance, and sort content words at the corpus level. To overcome these limitations, this paper proposes a novel method based on a weighted aggregated sum product assessment (WASPAS) technique. This NLP method integrates the function word elimination method, an NLP machine, and the WASPAS technique to improve the extraction and ranking of content words. The proposed method efficiently extracts quantitative data, simultaneously considers frequency and range parameters to evaluate terms’ substantial importance, and ranks content words at the corpus level, providing a comprehensive overview of term significance. This study employed a target corpus from the Web of Science (WOS), comprising 35 highly cited SDG-related research articles. The results demonstrate that the proposed method outperforms traditional methods in extracting and ranking content words.
ISSN:	2078-2489
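
The summary describes combining frequency and range into a single corpus-level ranking with WASPAS. The following is a minimal sketch of that idea, not the authors' implementation. It applies the standard WASPAS score, Q = λ·(weighted sum) + (1 − λ)·(weighted product), to two benefit criteria normalized by their column maxima. The function name `waspas_rank`, the tiny `FUNCTION_WORDS` list, the regex tokenizer, the equal weights, and λ = 0.5 are all assumptions made for illustration; "range" is taken here to mean the number of documents in which a term appears.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny function-word list (an assumption;
# the list used in the paper is not given in this record).
FUNCTION_WORDS = {"the", "of", "and", "to", "in", "a", "is", "for", "on", "are"}


def waspas_rank(documents, weights=(0.5, 0.5), lam=0.5):
    """Rank content words by a WASPAS score over frequency and range.

    documents: list of strings, one per document in the corpus.
    weights:   (frequency weight, range weight); assumed equal by default.
    lam:       WASPAS mixing coefficient between the sum and product models.
    """
    freq = Counter()  # total occurrences of each term in the corpus
    rng = Counter()   # number of documents that contain each term

    for doc in documents:
        # Simple lowercase alphabetic tokenizer (assumption), then drop function words.
        tokens = [t for t in re.findall(r"[a-z]+", doc.lower())
                  if t not in FUNCTION_WORDS]
        freq.update(tokens)
        rng.update(set(tokens))

    if not freq:
        return []

    max_f = max(freq.values())
    max_r = max(rng.values())
    w_f, w_r = weights

    scores = {}
    for term in freq:
        # Linear normalization for benefit criteria: value / column maximum.
        f = freq[term] / max_f
        r = rng[term] / max_r
        wsm = w_f * f + w_r * r            # weighted sum model
        wpm = (f ** w_f) * (r ** w_r)      # weighted product model
        scores[term] = lam * wsm + (1 - lam) * wpm

    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    corpus = [
        "Sustainable development of the cities",
        "Development goals for sustainable energy",
        "The energy policy and development",
    ]
    for term, score in waspas_rank(corpus):
        print(f"{term}\t{score:.4f}")
```

Because both criteria enter the sum and the product, a term that is frequent but confined to a single document scores lower than one that is equally frequent and spread across the corpus, which is the behavior the summary attributes to considering frequency and range together.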


Similar Items