A Hybrid Large Language Model for Context-Aware Document Ranking in Telecommunication Data

Large language models (LLMs) have attracted considerable attention for their exceptional comprehension and reasoning capabilities. The development of LLM methods is opening up countless prospects for automating tasks in the telecommunication industry. Following pre-training and fine-tuning, LLMs can carry out a variety of downstream tasks in response to human instructions. This paper presents a hybrid document retrieval and ranking approach that integrates statistical, probabilistic, and neural network-based retrieval models to enhance information retrieval performance in the telecommunication domain. Traditional methods such as Term Frequency-Inverse Document Frequency (TF-IDF) and Best Match 25 (BM25) provide effective lexical matching, while deep learning-based models such as Sentence-BERT (SBERT) and Word2Vec improve semantic understanding by capturing contextual relationships between query and document representations. The proposed framework introduces a novel multi-stage ranking mechanism that strategically integrates term-frequency-based scoring with semantic similarity modelling using Sentence-BERT and Word2Vec. Unlike existing models, the method dynamically adjusts the weights of the lexical and semantic components based on query features, enabling real-time adaptation for telecom-specific question-answering (QA) tasks. Performance is evaluated using BLEU, ROUGE metrics, cosine similarity, and Word2Vec similarity, demonstrating that the hybrid model outperforms conventional retrieval baselines on both precision- and recall-oriented tasks. The proposed model effectively aligns query intent with retrieved documents, increasing the efficiency of domain-specific search. Future work includes dynamic embedding techniques for domain adaptation and attention-based ranking optimizations for long-form information retrieval. This research enhances information retrieval by combining machine learning-based ranking with traditional methods, improving knowledge discovery and decision-making in telecommunications and technical document processing.

Bibliographic Details
Main Authors: Abhay Bindle, Preeti Singla, Sachin Sharma, Abdukodir Khakimov, Reem Ibrahim Alkanhel, Ammar Muthanna
Format: Article
Language:English
Published: IEEE 2025-01-01
Series:IEEE Access
Subjects: BM25; document ranking; information retrieval; large language models; semantic similarity; telecommunication
Online Access: https://ieeexplore.ieee.org/document/11071302/
DOI: 10.1109/ACCESS.2025.3585637
collection DOAJ
description Large language models (LLMs) have attracted considerable attention for their exceptional comprehension and reasoning capabilities. The development of LLM methods is opening up countless prospects for automating tasks in the telecommunication industry. Following pre-training and fine-tuning, LLMs can carry out a variety of downstream tasks in response to human instructions. This paper presents a hybrid document retrieval and ranking approach that integrates statistical, probabilistic, and neural network-based retrieval models to enhance information retrieval performance in the telecommunication domain. Traditional methods such as Term Frequency-Inverse Document Frequency (TF-IDF) and Best Match 25 (BM25) provide effective lexical matching, while deep learning-based models such as Sentence-BERT (SBERT) and Word2Vec improve semantic understanding by capturing contextual relationships between query and document representations. The proposed framework introduces a novel multi-stage ranking mechanism that strategically integrates term-frequency-based scoring with semantic similarity modelling using Sentence-BERT and Word2Vec. Unlike existing models, the method dynamically adjusts the weights of the lexical and semantic components based on query features, enabling real-time adaptation for telecom-specific question-answering (QA) tasks. Performance is evaluated using BLEU, ROUGE metrics, cosine similarity, and Word2Vec similarity, demonstrating that the hybrid model outperforms conventional retrieval baselines on both precision- and recall-oriented tasks. The proposed model effectively aligns query intent with retrieved documents, increasing the efficiency of domain-specific search. Future work includes dynamic embedding techniques for domain adaptation and attention-based ranking optimizations for long-form information retrieval. This research enhances information retrieval by combining machine learning-based ranking with traditional methods, improving knowledge discovery and decision-making in telecommunications and technical document processing.
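The hybrid weighting idea in the abstract — a lexical score and a semantic score mixed by a query-dependent coefficient — can be sketched as follows. This is a minimal, stdlib-only illustration, not the authors' implementation: `bm25_score` is a standard Okapi BM25 formula, `semantic_score` is a token-overlap cosine standing in for SBERT/Word2Vec embedding similarity, and the query-length heuristic for the mixing weight `alpha` is an assumed placeholder for the paper's query-feature-based adjustment.

```python
import math
from collections import Counter

# Tiny illustrative corpus; document texts are placeholders.
DOCS = [
    "BM25 ranks documents by term frequency and inverse document frequency",
    "Sentence embeddings capture semantic similarity between query and document",
    "Telecommunication networks require low latency and high reliability",
]

def bm25_score(query, doc, corpus, k1=1.5, b=0.75):
    """Okapi BM25 score of one document for a whitespace-tokenised query."""
    doc_terms = doc.lower().split()
    tf = Counter(doc_terms)
    avgdl = sum(len(d.split()) for d in corpus) / len(corpus)
    score = 0.0
    for term in query.lower().split():
        df = sum(1 for d in corpus if term in d.lower().split())
        if df == 0:
            continue
        idf = math.log((len(corpus) - df + 0.5) / (df + 0.5) + 1.0)
        f = tf[term]
        score += idf * f * (k1 + 1) / (f + k1 * (1 - b + b * len(doc_terms) / avgdl))
    return score

def semantic_score(query, doc):
    """Stand-in for SBERT/Word2Vec cosine similarity: token-count cosine."""
    q, d = Counter(query.lower().split()), Counter(doc.lower().split())
    dot = sum(q[t] * d[t] for t in q)
    norm = (math.sqrt(sum(v * v for v in q.values()))
            * math.sqrt(sum(v * v for v in d.values())))
    return dot / norm if norm else 0.0

def hybrid_rank(query, corpus):
    # Assumed heuristic: short keyword queries lean on the lexical score,
    # longer natural-language queries lean on the semantic score.
    alpha = 0.7 if len(query.split()) <= 3 else 0.3
    scored = [(alpha * bm25_score(query, d, corpus)
               + (1 - alpha) * semantic_score(query, d), d) for d in corpus]
    return [d for _, d in sorted(scored, reverse=True)]

# The document sharing query terms is ranked first.
print(hybrid_rank("semantic similarity query", DOCS)[0])
```

In a real system the `semantic_score` stand-in would be replaced by cosine similarity over precomputed SBERT or Word2Vec embeddings, and `alpha` would be learned or tuned from query features rather than hard-coded.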
format Article
id doaj-art-91ff1f68e6724f54acd9decb10d3869f
institution Kabale University
issn 2169-3536
language English
publishDate 2025-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj-art-91ff1f68e6724f54acd9decb10d3869f
Record updated: 2025-08-20T03:51:29Z | Language: eng | Publisher: IEEE | Journal: IEEE Access | ISSN: 2169-3536 | Published: 2025-01-01 | Vol. 13, pp. 120345-120359 | DOI: 10.1109/ACCESS.2025.3585637 | Article no. 11071302
Title: A Hybrid Large Language Model for Context-Aware Document Ranking in Telecommunication Data
Authors:
- Abhay Bindle (https://orcid.org/0000-0003-0994-4246), ECE Department, MMDU, Mullana, India
- Preeti Singla, CSE Department, MMDU, Mullana, India
- Sachin Sharma (https://orcid.org/0009-0001-8177-9225), State Bank of India, Panchkula, India
- Abdukodir Khakimov (https://orcid.org/0000-0003-2362-3270), Institute of Computer Science and Telecommunications, RUDN University, Moscow, Russia
- Reem Ibrahim Alkanhel (https://orcid.org/0000-0001-6395-4723), Department of Information Technology, College of Computer and Information Sciences, Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia
- Ammar Muthanna (https://orcid.org/0000-0003-0213-8145), Institute of Computer Science and Telecommunications, RUDN University, Moscow, Russia
Online access: https://ieeexplore.ieee.org/document/11071302/
Subjects: BM25; document ranking; information retrieval; large language models; semantic similarity; telecommunication
title A Hybrid Large Language Model for Context-Aware Document Ranking in Telecommunication Data
topic BM25
document ranking
information retrieval
large language models
semantic similarity
telecommunication
url https://ieeexplore.ieee.org/document/11071302/