Searching for Meaning Rather Than Keywords and Returning Answers Rather Than Links


Bibliographic Details
Main Author: Kent Fitch
Format: Article
Language: English
Published: Code4Lib 2023-08-01
Series: Code4Lib Journal
Online Access: https://journal.code4lib.org/articles/17443
collection DOAJ
description Large language models (LLMs) have transformed the largest web search engines: for over ten years, public expectations of being able to search on meaning rather than just keywords have become increasingly realised. Expectations are now moving further: from a search query generating a list of "ten blue links" to producing an answer to a question, complete with citations. This article describes a proof-of-concept that applies the latest search technology to library collections by implementing a semantic search across a collection of 45,000 newspaper articles from the National Library of Australia's Trove repository, and using OpenAI's ChatGPT4 API to generate answers to questions on that collection that include source article citations. It also describes some techniques used to scale semantic search to a collection of 220 million articles.
format Article
id doaj-art-96ece397d5154bbda3dae56f1e1e6ea3
institution DOAJ
issn 1940-5758
language English
publishDate 2023-08-01
publisher Code4Lib
record_format Article
series Code4Lib Journal
title Searching for Meaning Rather Than Keywords and Returning Answers Rather Than Links
url https://journal.code4lib.org/articles/17443