Swamped with Too Many Articles? GraphRAG Makes Getting Started Easy


Bibliographic Details
Main Authors: Joëd Ngangmeni, Danda B. Rawat
Format: Article
Language: English
Published: MDPI AG, 2025-03-01
Series: AI
Subjects: GraphRAG; LightRAG; retrieval-augmented generation; graph; large language model; LLM
Online Access: https://www.mdpi.com/2673-2688/6/3/47
_version_ 1849340756995604480
author Joëd Ngangmeni
Danda B. Rawat
author_facet Joëd Ngangmeni
Danda B. Rawat
author_sort Joëd Ngangmeni
collection DOAJ
description Background: Both early researchers, such as new graduate students, and experienced researchers face the challenge of sifting through vast amounts of literature to find their needle in a haystack. This process can be time-consuming, tedious, or frustratingly unproductive. Methods: Using only abstracts and titles of research articles, we compare three retrieval methods—Bibliographic Indexing/Databasing (BI/D), Retrieval-Augmented Generation (RAG), and Graph Retrieval-Augmented Generation (GraphRAG)—which reportedly offer promising solutions to these common challenges. We assess their performance using two sets of Large Language Model (LLM)-generated queries: one set of queries with context and the other set without context. Our study evaluates six sub-models—four from Light Retrieval-Augmented Generation (LightRAG) and two from Microsoft’s Graph Retrieval-Augmented Generation (MGRAG). We examine these sub-models across four key criteria—comprehensiveness, diversity, empowerment, and directness—as well as the overall combination of these factors. Results: After three separate experiments, we observe that MGRAG has a slight advantage over LightRAG, naïve RAG, and BI/D for answering queries that require a semantic understanding of our data pool. The results (displayed in grouped bar charts) provide clear and accessible comparisons to help researchers quickly make informed decisions on which method best suits their needs. Conclusions: Supplementing BI/D with RAG or GraphRAG pipelines would positively impact the way both beginners and experienced researchers find and parse through volumes of potentially relevant information.
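The naïve RAG baseline the abstract compares against can be sketched as a similarity-ranked retrieval over article titles and abstracts. The toy bag-of-words similarity below is only a stand-in for the dense embedding model a real RAG pipeline would use; the corpus strings and function names are illustrative, not taken from the article.

```python
import math
from collections import Counter

def embed(text):
    # Toy "embedding": token counts stand in for a learned dense
    # embedding model, which a real RAG pipeline would use instead.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, corpus, k=2):
    # The "retrieval" step of naive RAG: rank documents (here,
    # title/abstract strings) by similarity to the query, return top-k.
    q = embed(query)
    ranked = sorted(corpus, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

corpus = [
    "Graph retrieval augmented generation for literature search",
    "Convolutional networks for image classification",
    "Retrieval augmented generation with large language models",
]
print(retrieve("retrieval augmented generation", corpus, k=2))
```

In a full pipeline the top-k documents would then be passed to an LLM as context for answer generation; GraphRAG variants additionally build a knowledge graph over the corpus so that queries needing semantic, cross-document understanding can traverse entity relationships rather than relying on per-document similarity alone.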
format Article
id doaj-art-a0ff6d3468ae470693af294c5be0f580
institution Kabale University
issn 2673-2688
language English
publishDate 2025-03-01
publisher MDPI AG
record_format Article
series AI
spelling doaj-art-a0ff6d3468ae470693af294c5be0f580 2025-08-20T03:43:50Z
Joëd Ngangmeni; Danda B. Rawat. "Swamped with Too Many Articles? GraphRAG Makes Getting Started Easy." AI, vol. 6, no. 3, art. 47, MDPI AG, 2025-03-01. ISSN 2673-2688. doi:10.3390/ai6030047
Author affiliation (both authors): Department of Electrical Engineering and Computer Science, Howard University, 2400 6th St NW, Washington, DC 20059, USA
spellingShingle Joëd Ngangmeni
Danda B. Rawat
Swamped with Too Many Articles? GraphRAG Makes Getting Started Easy
AI
GraphRAG
LightRAG
retrieval-augmented generation
graph
large language model
LLM
title Swamped with Too Many Articles? GraphRAG Makes Getting Started Easy
title_full Swamped with Too Many Articles? GraphRAG Makes Getting Started Easy
title_fullStr Swamped with Too Many Articles? GraphRAG Makes Getting Started Easy
title_full_unstemmed Swamped with Too Many Articles? GraphRAG Makes Getting Started Easy
title_short Swamped with Too Many Articles? GraphRAG Makes Getting Started Easy
title_sort swamped with too many articles graphrag makes getting started easy
topic GraphRAG
LightRAG
retrieval-augmented generation
graph
large language model
LLM
url https://www.mdpi.com/2673-2688/6/3/47
work_keys_str_mv AT joedngangmeni swampedwithtoomanyarticlesgraphragmakesgettingstartedeasy
AT dandabrawat swampedwithtoomanyarticlesgraphragmakesgettingstartedeasy