A Comparative Analysis of Sentence Transformer Models for Automated Journal Recommendation Using PubMed Metadata

Bibliographic Details
Main Authors:	Maria Teresa Colangelo, Marco Meleti, Stefano Guizzardi, Elena Calciolari, Carlo Galli
Format:	Article
Language:	English
Published:	MDPI AG 2025-03-01
Series:	Big Data and Cognitive Computing
Subjects:	automated journal recommendation; KeyBERT; PubMed search; Sentence Transformers; semantic similarity
Online Access:	https://www.mdpi.com/2504-2289/9/3/67

Description
Summary:	We present an automated journal recommendation pipeline and use it to evaluate five Sentence Transformer models for recommending journals aligned with a manuscript’s thematic scope: all-mpnet-base-v2 (mpnet), all-MiniLM-L6-v2 (minilm-l6), all-MiniLM-L12-v2 (minilm-l12), multi-qa-distilbert-cos-v1 (multi-qa-distilbert), and all-distilroberta-v1 (roberta). The pipeline extracts domain-relevant keywords from a manuscript via KeyBERT, retrieves potentially related articles from PubMed, and encodes both the test manuscript and the retrieved articles into high-dimensional embeddings. It then ranks journals by thematic overlap, measured as cosine similarity between embeddings. Evaluations on 50 test articles highlight mpnet’s strong performance (mean similarity score 0.71 ± 0.04), albeit with higher computational demands. Minilm-l12 and minilm-l6 offer comparable precision at lower cost, while multi-qa-distilbert and roberta yield broader recommendations better suited to interdisciplinary research. These findings underscore key trade-offs among embedding models and demonstrate how they can provide interpretable, data-driven insights to guide journal selection across varied research contexts.
ISSN:	2504-2289
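
The ranking step described in the summary can be sketched as follows. This is a minimal illustration, not the authors' implementation: the three-dimensional vectors stand in for real Sentence Transformer embeddings, the journal names are placeholders, and averaging the per-article similarities within each journal is an assumption, since the summary does not say how article scores are aggregated. The function names `cosine_similarity` and `rank_journals` are hypothetical.

```python
import math
from collections import defaultdict


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as having no similarity
    return dot / (norm_a * norm_b)


def rank_journals(manuscript_vec, articles):
    """Rank journals by similarity to the manuscript.

    `articles` is a list of (journal_name, embedding) pairs, one per
    retrieved article. Scores are averaged per journal (an assumption
    made for this sketch) and returned highest first.
    """
    per_journal = defaultdict(list)
    for journal, vec in articles:
        per_journal[journal].append(cosine_similarity(manuscript_vec, vec))
    means = {j: sum(s) / len(s) for j, s in per_journal.items()}
    return sorted(means.items(), key=lambda kv: kv[1], reverse=True)


# Toy embeddings; in practice these would come from an embedding model.
manuscript = [1.0, 0.0, 0.0]
retrieved = [
    ("Journal A", [1.0, 0.0, 0.0]),
    ("Journal A", [1.0, 1.0, 0.0]),
    ("Journal B", [0.0, 1.0, 0.0]),
]

ranking = rank_journals(manuscript, retrieved)
for journal, score in ranking:
    print(f"{journal}: {score:.3f}")
```

With real data, each embedding would be the model's encoding of a manuscript or of a retrieved PubMed record, and swapping the model changes only the vectors, not the ranking logic.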

Similar Items