ChatGPT Versus Modest Large Language Models: An Extensive Study on Benefits and Drawbacks for Conversational Search

Large Language Models (LLMs) are effective in modeling the syntactic and semantic content of text, making them a strong choice for conversational query rewriting. While previous approaches proposed NLP-based custom models, requiring significant engineering effort, our approach is straightforward and...

Bibliographic Details
Main Authors: Guido Rocchietti, Cosimo Rulli, Franco Maria Nardini, Cristina Ioana Muntean, Raffaele Perego, Ophir Frieder
Format: Article
Language: English
Published: IEEE 2025-01-01
Series: IEEE Access
Subjects: Conversational search; query rewriting; large language models; instruction-tuned LLMs; fine-tuning
Online Access: https://ieeexplore.ieee.org/document/10839752/
_version_ 1832584010178494464
author Guido Rocchietti
Cosimo Rulli
Franco Maria Nardini
Cristina Ioana Muntean
Raffaele Perego
Ophir Frieder
author_sort Guido Rocchietti
collection DOAJ
description Large Language Models (LLMs) are effective in modeling the syntactic and semantic content of text, making them a strong choice for conversational query rewriting. While previous approaches proposed NLP-based custom models, requiring significant engineering effort, our approach is straightforward and conceptually simpler. Not only do we improve effectiveness over the current state of the art, but we also account for cost and efficiency. We explore the use of pre-trained LLMs fine-tuned to generate quality user query rewrites, aiming to reduce computational costs while maintaining or improving retrieval effectiveness. As a first contribution, we study various prompting approaches, including zero-, one-, and few-shot methods, with ChatGPT (e.g., gpt-3.5-turbo). We observe an increase in the quality of rewrites, leading to improved retrieval. We then fine-tune smaller open LLMs on the query rewriting task. Our results demonstrate that our fine-tuned models, including the smallest with 780 million parameters, achieve better performance during the retrieval phase than gpt-3.5-turbo. To fine-tune the selected models, we used the QReCC dataset, which is specifically designed for query rewriting tasks. For evaluation, we used the TREC CAsT datasets to assess the retrieval effectiveness of the rewrites of both gpt-3.5-turbo and our fine-tuned models. Our findings show that fine-tuning LLMs on conversational query rewriting datasets can be more effective than relying on generic instruction-tuned models or traditional query reformulation techniques.
format Article
id doaj-art-ffbc47f4af534a7ea6696b56680bf11f
institution Kabale University
issn 2169-3536
language English
publishDate 2025-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj-art-ffbc47f4af534a7ea6696b56680bf11f; 2025-01-28T00:01:32Z; eng; IEEE; IEEE Access; 2169-3536; 2025-01-01; 13; 15253; 15271; 10.1109/ACCESS.2025.3529741; 10839752
ChatGPT Versus Modest Large Language Models: An Extensive Study on Benefits and Drawbacks for Conversational Search
Guido Rocchietti (0) https://orcid.org/0009-0004-9704-0662
Cosimo Rulli (1)
Franco Maria Nardini (2) https://orcid.org/0000-0003-3183-334X
Cristina Ioana Muntean (3) https://orcid.org/0000-0001-5265-1831
Raffaele Perego (4) https://orcid.org/0000-0001-7189-4724
Ophir Frieder (5) https://orcid.org/0000-0001-5076-8171
ISTI-CNR, Pisa, Italy; ISTI-CNR, Pisa, Italy; ISTI-CNR, Pisa, Italy; ISTI-CNR, Pisa, Italy; ISTI-CNR, Pisa, Italy; Georgetown University, Washington, DC, USA
https://ieeexplore.ieee.org/document/10839752/
Conversational search; query rewriting; large language models; instruction-tuned LLMs; fine-tuning
title ChatGPT Versus Modest Large Language Models: An Extensive Study on Benefits and Drawbacks for Conversational Search
title_sort chatgpt versus modest large language models an extensive study on benefits and drawbacks for conversational search
topic Conversational search
query rewriting
large language models
instruction-tuned LLMs
fine-tuning
url https://ieeexplore.ieee.org/document/10839752/