ChatGPT Versus Modest Large Language Models: An Extensive Study on Benefits and Drawbacks for Conversational Search
Large Language Models (LLMs) are effective at modeling the syntactic and semantic content of text, making them a strong choice for conversational query rewriting. While previous approaches proposed custom NLP models requiring significant engineering effort, our approach is...
Saved in:
Main Authors: | Guido Rocchietti, Cosimo Rulli, Franco Maria Nardini, Cristina Ioana Muntean, Raffaele Perego, Ophir Frieder |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE, 2025-01-01 |
Series: | IEEE Access |
Subjects: | Conversational search; query rewriting; large language models; instruction-tuned LLMs; fine-tuning |
Online Access: | https://ieeexplore.ieee.org/document/10839752/ |
_version_ | 1832584010178494464 |
---|---|
author | Guido Rocchietti Cosimo Rulli Franco Maria Nardini Cristina Ioana Muntean Raffaele Perego Ophir Frieder |
author_facet | Guido Rocchietti Cosimo Rulli Franco Maria Nardini Cristina Ioana Muntean Raffaele Perego Ophir Frieder |
author_sort | Guido Rocchietti |
collection | DOAJ |
description | Large Language Models (LLMs) are effective at modeling the syntactic and semantic content of text, making them a strong choice for conversational query rewriting. While previous approaches proposed custom NLP models requiring significant engineering effort, our approach is conceptually simpler. Not only do we improve effectiveness over the current state of the art, but we also address cost and efficiency. We explore pre-trained LLMs fine-tuned to generate high-quality rewrites of user queries, aiming to reduce computational costs while maintaining or improving retrieval effectiveness. As a first contribution, we study various prompting approaches (zero-, one-, and few-shot) with ChatGPT (i.e., <monospace>gpt-3.5-turbo</monospace>) and observe that higher-quality rewrites lead to improved retrieval. We then fine-tune smaller open LLMs on the query rewriting task. Our results demonstrate that our fine-tuned models, including the smallest with 780 million parameters, achieve better retrieval performance than <monospace>gpt-3.5-turbo</monospace>. To fine-tune the selected models, we use the QReCC dataset, which is specifically designed for query rewriting. For evaluation, we use the TREC CAsT datasets to assess the retrieval effectiveness of the rewrites produced by both <monospace>gpt-3.5-turbo</monospace> and our fine-tuned models. Our findings show that fine-tuning LLMs on conversational query rewriting datasets can be more effective than relying on generic instruction-tuned models or traditional query reformulation techniques. |
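To make the prompting study in the abstract concrete, the following is a minimal, hypothetical sketch of one-shot conversational query rewriting with <monospace>gpt-3.5-turbo</monospace> via the OpenAI Python SDK (v1.x). It is not the authors' code: the instruction wording, the in-context example, and the helper names are illustrative assumptions.

```python
# A minimal sketch (not the article's implementation) of one-shot conversational
# query rewriting with gpt-3.5-turbo using the OpenAI Python SDK (v1.x).
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

SYSTEM = (
    "Rewrite the user's last utterance as a fully self-contained search query, "
    "resolving pronouns and ellipsis from the conversation history. "
    "Return only the rewritten query."
)

# One in-context example (one-shot); zero-shot drops it, few-shot adds more.
# The example turns below are invented for illustration.
EXAMPLE_HISTORY = [
    "What is throat cancer?",
    "Throat cancer is a type of cancer that develops in the throat...",
]
EXAMPLE_UTTERANCE = "Is it treatable?"
EXAMPLE_REWRITE = "Is throat cancer treatable?"


def format_turns(turns: list[str], current: str) -> str:
    """Serialize the conversation history plus the current utterance."""
    lines = [f"Turn {i + 1}: {t}" for i, t in enumerate(turns)]
    lines.append(f"Current utterance: {current}")
    return "\n".join(lines)


def rewrite(history: list[str], utterance: str) -> str:
    """Return a context-independent rewrite of `utterance`."""
    messages = [
        {"role": "system", "content": SYSTEM},
        {"role": "user", "content": format_turns(EXAMPLE_HISTORY, EXAMPLE_UTTERANCE)},
        {"role": "assistant", "content": EXAMPLE_REWRITE},
        {"role": "user", "content": format_turns(history, utterance)},
    ]
    resp = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=messages,
        temperature=0,  # deterministic rewrites are preferable for retrieval
    )
    return resp.choices[0].message.content.strip()


if __name__ == "__main__":
    print(rewrite(
        ["Tell me about the Bronze Age collapse.",
         "The Bronze Age collapse was a period of societal breakdown..."],
        "What caused it?",
    ))
```

In the setting the article describes, the fine-tuned open models would replace this API call with a locally hosted model trained on QReCC, and the rewritten query would then be passed to the retrieval stage evaluated on TREC CAsT.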
format | Article |
id | doaj-art-ffbc47f4af534a7ea6696b56680bf11f |
institution | Kabale University |
issn | 2169-3536 |
language | English |
publishDate | 2025-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj-art-ffbc47f4af534a7ea6696b56680bf11f2025-01-28T00:01:32ZengIEEEIEEE Access2169-35362025-01-0113152531527110.1109/ACCESS.2025.352974110839752ChatGPT Versus Modest Large Language Models: An Extensive Study on Benefits and Drawbacks for Conversational SearchGuido Rocchietti0https://orcid.org/0009-0004-9704-0662Cosimo Rulli1Franco Maria Nardini2https://orcid.org/0000-0003-3183-334XCristina Ioana Muntean3https://orcid.org/0000-0001-5265-1831Raffaele Perego4https://orcid.org/0000-0001-7189-4724Ophir Frieder5https://orcid.org/0000-0001-5076-8171ISTI-CNR, Pisa, ItalyISTI-CNR, Pisa, ItalyISTI-CNR, Pisa, ItalyISTI-CNR, Pisa, ItalyISTI-CNR, Pisa, ItalyGeorgetown University, Washington, DC, USALarge Language Models (LLMs) are effective in modeling text syntactic and semantic content, making them a strong choice to perform conversational query rewriting. While previous approaches proposed NLP-based custom models, requiring significant engineering effort, our approach is straightforward and conceptually simpler. Not only do we improve effectiveness over the current state-of-the-art, but we also curate the cost and efficiency aspects. We explore the use of pre-trained LLMs fine-tuned to generate quality user query rewrites, aiming to reduce computational costs while maintaining or improving retrieval effectiveness. As a first contribution, we study various prompting approaches — including zero, one, and few-shot methods — with ChatGPT (e.g., <monospace>gpt-3.5-turbo</monospace>). We observe an increase in the quality of rewrites leading to improved retrieval. We then fine-tuned smaller open LLMs on the query rewriting task. Our results demonstrate that our fine-tuned models, including the smallest with 780 million parameters, achieve better performance during the retrieval phase than <monospace>gpt-3.5-turbo</monospace>. To fine-tune the selected models, we used the QReCC dataset, which is specifically designed for query rewriting tasks. For evaluation, we used the TREC CAsT datasets to assess the retrieval effectiveness of the rewrites of both <monospace>gpt-3.5-turbo</monospace> and our fine-tuned models. Our findings show that fine-tuning LLMs on conversational query rewriting datasets can be more effective than relying on generic instruction-tuned models or traditional query reformulation techniques.https://ieeexplore.ieee.org/document/10839752/Conversational searchquery rewritinglarge language modelsinstruction-tuned LLMsfine-tuning |
spellingShingle | Guido Rocchietti Cosimo Rulli Franco Maria Nardini Cristina Ioana Muntean Raffaele Perego Ophir Frieder ChatGPT Versus Modest Large Language Models: An Extensive Study on Benefits and Drawbacks for Conversational Search IEEE Access Conversational search query rewriting large language models instruction-tuned LLMs fine-tuning |
title | ChatGPT Versus Modest Large Language Models: An Extensive Study on Benefits and Drawbacks for Conversational Search |
title_full | ChatGPT Versus Modest Large Language Models: An Extensive Study on Benefits and Drawbacks for Conversational Search |
title_fullStr | ChatGPT Versus Modest Large Language Models: An Extensive Study on Benefits and Drawbacks for Conversational Search |
title_full_unstemmed | ChatGPT Versus Modest Large Language Models: An Extensive Study on Benefits and Drawbacks for Conversational Search |
title_short | ChatGPT Versus Modest Large Language Models: An Extensive Study on Benefits and Drawbacks for Conversational Search |
title_sort | chatgpt versus modest large language models an extensive study on benefits and drawbacks for conversational search |
topic | Conversational search query rewriting large language models instruction-tuned LLMs fine-tuning |
url | https://ieeexplore.ieee.org/document/10839752/ |
work_keys_str_mv | AT guidorocchietti chatgptversusmodestlargelanguagemodelsanextensivestudyonbenefitsanddrawbacksforconversationalsearch AT cosimorulli chatgptversusmodestlargelanguagemodelsanextensivestudyonbenefitsanddrawbacksforconversationalsearch AT francomarianardini chatgptversusmodestlargelanguagemodelsanextensivestudyonbenefitsanddrawbacksforconversationalsearch AT cristinaioanamuntean chatgptversusmodestlargelanguagemodelsanextensivestudyonbenefitsanddrawbacksforconversationalsearch AT raffaeleperego chatgptversusmodestlargelanguagemodelsanextensivestudyonbenefitsanddrawbacksforconversationalsearch AT ophirfrieder chatgptversusmodestlargelanguagemodelsanextensivestudyonbenefitsanddrawbacksforconversationalsearch |