Evaluating search engines and large language models for answering health questions

Bibliographic Details
Main Authors: Marcos Fernández-Pichel, Juan C. Pichel, David E. Losada
Format: Article
Language: English
Published: Nature Portfolio 2025-03-01
Series: npj Digital Medicine
Online Access: https://doi.org/10.1038/s41746-025-01546-w
Collection: DOAJ
Description: Search engines (SEs) have traditionally been primary tools for information seeking, but the new large language models (LLMs) are emerging as powerful alternatives, particularly for question-answering tasks. This study compares the performance of four popular SEs, seven LLMs, and retrieval-augmented (RAG) variants in answering 150 health-related questions from the TREC Health Misinformation (HM) Track. Results reveal SEs correctly answer 50–70% of questions, often hindered by many retrieval results not responding to the health question. LLMs deliver higher accuracy, correctly answering about 80% of questions, though their performance is sensitive to input prompts. RAG methods significantly enhance smaller LLMs’ effectiveness, improving accuracy by up to 30% by integrating retrieval evidence.
Record ID: doaj-art-65b18ff1bea74f37bd945bd920a4a120
Institution: DOAJ
ISSN: 2398-6352
Citation: npj Digital Medicine, volume 8, issue 1, pages 1-15 (2025-03-01)
Author affiliation (all three authors): Centro Singular de Investigación en Tecnoloxías Intelixentes (CiTIUS), Universidade de Santiago de Compostela, Santiago de Compostela
Record last indexed: 2025-08-20T03:01:38Z