An Approach to Trustworthy Article Ranking by NLP and Multi-Layered Analysis and Optimization

Bibliographic Details
Main Authors: Chenhao Li, Jiyin Zhang, Weilin Chen, Xiaogang Ma
Format: Article
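Language: English
Published: MDPI AG 2025-07-01
Series: Algorithms
Subjects: trustworthiness ranking, similarity computation, recommendation system, multi-layered factor analysis, open data
Online Access: https://www.mdpi.com/1999-4893/18/7/408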
collection DOAJ
description The rapid growth of scientific publications, coupled with rising retraction rates, has intensified the challenge of identifying trustworthy academic articles. To address this issue, we propose a three-layer ranking system that integrates natural language processing and machine learning techniques for relevance and trust assessment. First, we apply BERT-based embeddings to semantically match user queries with article content. Second, a Random Forest classifier is used to eliminate potentially problematic articles, leveraging features such as citation count, Altmetric score, and journal impact factor. Third, a custom ranking function combines relevance and trust indicators to score and sort the remaining articles. Evaluation using 16,052 articles from Retraction Watch and Web of Science datasets shows that our classifier achieves 90% accuracy and 97% recall for retracted articles. Citations emerged as the most influential trust signal (53.26%), followed by Altmetric and impact factors. This multi-layered approach offers a transparent and efficient alternative to conventional ranking algorithms, which can help researchers discover not only relevant but also reliable literature. Our system is adaptable to various domains and represents a promising tool for improving literature search and evaluation in the open science environment.
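The three layers named in the description can be illustrated with a minimal sketch. It is not the authors' implementation: the toy vectors stand in for BERT embeddings, the `flagged` field stands in for the Random Forest classifier's verdict, and the `Article` class, the `rank` function, the blend factor `alpha`, and the trust weights are all hypothetical. The record states only that citations were the most influential trust signal, so the default weights merely give citations the largest share.

```python
import math
from dataclasses import dataclass


@dataclass
class Article:
    title: str
    embedding: list       # toy stand-in for a BERT embedding of the article text
    citations: int
    altmetric: float
    impact_factor: float
    flagged: bool         # stand-in for the Random Forest classifier's verdict


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank(query_embedding, articles, alpha=0.5, weights=(0.5, 0.25, 0.25)):
    """Return (score, title) pairs, best first.

    alpha and weights are hypothetical values chosen for illustration;
    the source record does not give the actual ranking function.
    """
    # Layer 2: drop articles the classifier considers problematic.
    kept = [a for a in articles if not a.flagged]
    if not kept:
        return []

    # Scale each trust indicator to [0, 1] by the maximum among kept articles.
    max_cit = max(a.citations for a in kept) or 1
    max_alt = max(a.altmetric for a in kept) or 1.0
    max_if = max(a.impact_factor for a in kept) or 1.0
    w_cit, w_alt, w_if = weights

    scored = []
    for a in kept:
        # Layer 1: semantic relevance of the article to the query.
        relevance = cosine(query_embedding, a.embedding)
        # Layer 3: blend relevance with a weighted trust score.
        trust = (w_cit * a.citations / max_cit
                 + w_alt * a.altmetric / max_alt
                 + w_if * a.impact_factor / max_if)
        scored.append((alpha * relevance + (1 - alpha) * trust, a.title))
    return sorted(scored, reverse=True)
```

In this sketch, setting `alpha` closer to 1 favors relevance to the query, while values closer to 0 favor the trust indicators. A flagged article is excluded regardless of how many citations it has.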
issn 1999-4893
citation Algorithms, vol. 18, no. 7, article 408, 2025-07-01
doi 10.3390/a18070408
affiliation Department of Computer Science, University of Idaho, Moscow, ID 83844, USA (all four authors)
topic trustworthiness ranking
similarity computation
recommendation system
multi-layered factor analysis
open data