WASPAS-Based Natural Language Processing Method for Handling Content Words Extraction and Ranking Issues: An Example of SDGs Corpus

This paper addresses the challenges in extracting content words within the domains of natural language processing (NLP) and artificial intelligence (AI), using sustainable development goals (SDGs) corpora as verification examples. Traditional corpus-based methods and the term frequency-inverse docum...

Bibliographic Details
Main Authors: Liang-Ching Chen, Kuei-Hu Chang, Jeng-Fung Hung
Format: Article
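Language: English
Published: MDPI AG 2025-03-01
Series: Information
Subjects: content word extraction; natural language processing (NLP); artificial intelligence (AI); corpus; the term frequency-inverse document frequency (TF-IDF) method; the weighted aggregated sum product assessment (WASPAS) method
Online Access: https://www.mdpi.com/2078-2489/16/3/198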
author Liang-Ching Chen
Kuei-Hu Chang
Jeng-Fung Hung
collection DOAJ
description This paper addresses the challenges in extracting content words within the domains of natural language processing (NLP) and artificial intelligence (AI), using sustainable development goals (SDGs) corpora as verification examples. Traditional corpus-based methods and the term frequency-inverse document frequency (TF-IDF) method face limitations, including the inability to automatically eliminate function words, effectively extract the relevant parameters’ quantitative data, simultaneously consider frequency and range parameters to evaluate the terms’ overall importance, and sort content words at the corpus level. To overcome these limitations, this paper proposes a novel method based on a weighted aggregated sum product assessment (WASPAS) technique. This NLP method integrates the function word elimination method, an NLP machine, and the WASPAS technique to improve the extraction and ranking of content words. The proposed method efficiently extracts quantitative data, simultaneously considers frequency and range parameters to evaluate terms’ substantial importance, and ranks content words at the corpus level, providing a comprehensive overview of term significance. This study employed a target corpus from the Web of Science (WOS), comprising 35 highly cited SDG-related research articles. Compared to competing methods, the results demonstrate that the proposed method outperforms traditional methods in extracting and ranking content words.
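The description above says the method scores each term on frequency and range together and then ranks terms with WASPAS. A minimal sketch of the standard WASPAS aggregation (a weighted sum model blended with a weighted product model) is given below for orientation. It is not the authors' implementation: the function name `waspas_rank`, the equal criterion weights, the blending coefficient `lam = 0.5`, and the toy counts are all illustrative assumptions that do not come from this record.

```python
from math import prod


def waspas_rank(terms, weights=(0.5, 0.5), lam=0.5):
    """Rank terms by a WASPAS score over (frequency, range) criteria.

    terms: dict mapping term -> (frequency, range); both are treated as
    benefit criteria (larger is better).
    weights, lam: illustrative assumptions, not values from the paper.
    """
    # Linear normalization for benefit criteria: divide by the column maximum.
    col_max = [max(v[j] for v in terms.values()) for j in range(len(weights))]
    scores = {}
    for term, values in terms.items():
        norm = [v / m for v, m in zip(values, col_max)]
        wsm = sum(w * x for w, x in zip(weights, norm))    # weighted sum model
        wpm = prod(x ** w for w, x in zip(weights, norm))  # weighted product model
        scores[term] = lam * wsm + (1 - lam) * wpm
    # Highest score first.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical data: (occurrences in the corpus, number of documents containing the term).
toy = {"sustainable": (120, 35), "poverty": (150, 10), "indicator": (40, 20)}
ranking = waspas_rank(toy)
```

In this toy input, "poverty" has the highest raw frequency but appears in few documents, so a score that also weighs range places "sustainable" above it; that is the kind of frequency-versus-range trade-off the description refers to.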
format Article
id doaj-art-707b5b68f0d94c0d88becc38e994d3e9
institution OA Journals
issn 2078-2489
language English
publishDate 2025-03-01
publisher MDPI AG
record_format Article
series Information
spelling doaj-art-707b5b68f0d94c0d88becc38e994d3e9
2025-08-20T02:11:23Z
eng
MDPI AG
Information
2078-2489
2025-03-01
Volume 16, Issue 3, Article 198
10.3390/info16030198
WASPAS-Based Natural Language Processing Method for Handling Content Words Extraction and Ranking Issues: An Example of SDGs Corpus
Liang-Ching Chen (Department of Foreign Languages, R.O.C. Military Academy, Kaohsiung 830, Taiwan)
Kuei-Hu Chang (Department of Management Sciences, R.O.C. Military Academy, Kaohsiung 830, Taiwan)
Jeng-Fung Hung (Graduate Institute of Science Education and Environmental Education, National Kaohsiung Normal University, Kaohsiung 824, Taiwan)
https://www.mdpi.com/2078-2489/16/3/198
content word extraction
natural language processing (NLP)
artificial intelligence (AI)
corpus
the term frequency-inverse document frequency (TF-IDF) method
the weighted aggregated sum product assessment (WASPAS) method
title WASPAS-Based Natural Language Processing Method for Handling Content Words Extraction and Ranking Issues: An Example of SDGs Corpus
topic content word extraction
natural language processing (NLP)
artificial intelligence (AI)
corpus
the term frequency-inverse document frequency (TF-IDF) method
the weighted aggregated sum product assessment (WASPAS) method
url https://www.mdpi.com/2078-2489/16/3/198