A Hybrid Large Language Model for Context-Aware Document Ranking in Telecommunication Data

Large language models (LLMs) have attracted considerable attention for their exceptional comprehension and reasoning capabilities. The development of LLM methods is opening up countless prospects for automating tasks in the telecommunication industry. Following pre-training and fine-tuning, LLMs can carry out a variety of downstream tasks in response to human instructions. This paper presents a hybrid document retrieval and ranking approach that integrates statistical, probabilistic, and neural network-based retrieval models to enhance information retrieval performance in the telecommunication domain. Traditional methods such as Term Frequency-Inverse Document Frequency (TF-IDF) and Best Match 25 (BM25) provide effective lexical matching, while deep learning-based models such as Sentence-BERT (SBERT) and Word2Vec improve semantic understanding by capturing contextual relationships between query and document representations. The proposed framework introduces a novel multi-stage ranking mechanism that strategically integrates term-frequency-based scoring with semantic similarity modelling using Sentence-BERT and Word2Vec. Unlike existing models, the method dynamically adjusts the weights of the lexical and semantic components based on query features, enabling real-time adaptation for telecom-specific question-answering (QA) tasks. Performance is evaluated using BLEU, ROUGE metrics, cosine similarity, and Word2Vec similarity, demonstrating that the hybrid model outperforms conventional retrieval baselines on both precision- and recall-oriented tasks. The proposed model effectively aligns query intent with retrieved documents, increasing the efficiency of domain-specific search. Future work includes dynamic embedding techniques for domain adaptation and attention-based ranking optimizations for long-form information retrieval. This research enhances information retrieval by combining machine learning-based ranking with traditional methods, improving knowledge discovery and decision-making in telecommunications and technical document processing.

Bibliographic Details
Main Authors: Abhay Bindle, Preeti Singla, Sachin Sharma, Abdukodir Khakimov, Reem Ibrahim Alkanhel, Ammar Muthanna
Format: Article
Language:English
Published: IEEE 2025-01-01
Series:IEEE Access
Subjects: BM25; document ranking; information retrieval; large language models; semantic similarity; telecommunication
Online Access: https://ieeexplore.ieee.org/document/11071302/
DOI: 10.1109/ACCESS.2025.3585637
collection DOAJ
description Large language models (LLMs) have attracted considerable attention for their exceptional comprehension and reasoning capabilities. The development of LLM methods is opening up countless prospects for automating tasks in the telecommunication industry. Following pre-training and fine-tuning, LLMs can carry out a variety of downstream tasks in response to human instructions. This paper presents a hybrid document retrieval and ranking approach that integrates statistical, probabilistic, and neural network-based retrieval models to enhance information retrieval performance in the telecommunication domain. Traditional methods such as Term Frequency-Inverse Document Frequency (TF-IDF) and Best Match 25 (BM25) provide effective lexical matching, while deep learning-based models such as Sentence-BERT (SBERT) and Word2Vec improve semantic understanding by capturing contextual relationships between query and document representations. The proposed framework introduces a novel multi-stage ranking mechanism that strategically integrates term-frequency-based scoring with semantic similarity modelling using Sentence-BERT and Word2Vec. Unlike existing models, the method dynamically adjusts the weights of the lexical and semantic components based on query features, enabling real-time adaptation for telecom-specific question-answering (QA) tasks. Performance is evaluated using BLEU, ROUGE metrics, cosine similarity, and Word2Vec similarity, demonstrating that the hybrid model outperforms conventional retrieval baselines on both precision- and recall-oriented tasks. The proposed model effectively aligns query intent with retrieved documents, increasing the efficiency of domain-specific search. Future work includes dynamic embedding techniques for domain adaptation and attention-based ranking optimizations for long-form information retrieval. This research enhances information retrieval by combining machine learning-based ranking with traditional methods, improving knowledge discovery and decision-making in telecommunications and technical document processing.
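The hybrid weighting idea in the abstract — a lexical score and a semantic score mixed by a query-dependent coefficient — can be sketched as follows. This is a minimal, stdlib-only illustration, not the authors' implementation: `bm25_score` is a standard Okapi BM25 formula, `semantic_score` is a token-overlap cosine standing in for SBERT/Word2Vec embedding similarity, and the query-length heuristic for the mixing weight `alpha` is an assumed placeholder for the paper's query-feature-based adjustment.

```python
import math
from collections import Counter

# Tiny illustrative corpus; document texts are placeholders.
DOCS = [
    "BM25 ranks documents by term frequency and inverse document frequency",
    "Sentence embeddings capture semantic similarity between query and document",
    "Telecommunication networks require low latency and high reliability",
]

def bm25_score(query, doc, corpus, k1=1.5, b=0.75):
    """Okapi BM25 score of one document for a whitespace-tokenised query."""
    doc_terms = doc.lower().split()
    tf = Counter(doc_terms)
    avgdl = sum(len(d.split()) for d in corpus) / len(corpus)
    score = 0.0
    for term in query.lower().split():
        df = sum(1 for d in corpus if term in d.lower().split())
        if df == 0:
            continue
        idf = math.log((len(corpus) - df + 0.5) / (df + 0.5) + 1.0)
        f = tf[term]
        score += idf * f * (k1 + 1) / (f + k1 * (1 - b + b * len(doc_terms) / avgdl))
    return score

def semantic_score(query, doc):
    """Stand-in for SBERT/Word2Vec cosine similarity: token-count cosine."""
    q, d = Counter(query.lower().split()), Counter(doc.lower().split())
    dot = sum(q[t] * d[t] for t in q)
    norm = (math.sqrt(sum(v * v for v in q.values()))
            * math.sqrt(sum(v * v for v in d.values())))
    return dot / norm if norm else 0.0

def hybrid_rank(query, corpus):
    # Assumed heuristic: short keyword queries lean on the lexical score,
    # longer natural-language queries lean on the semantic score.
    alpha = 0.7 if len(query.split()) <= 3 else 0.3
    scored = [(alpha * bm25_score(query, d, corpus)
               + (1 - alpha) * semantic_score(query, d), d) for d in corpus]
    return [d for _, d in sorted(scored, reverse=True)]

# The document sharing query terms is ranked first.
print(hybrid_rank("semantic similarity query", DOCS)[0])
```

In a real system the `semantic_score` stand-in would be replaced by cosine similarity over precomputed SBERT or Word2Vec embeddings, and `alpha` would be learned or tuned from query features rather than hard-coded.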
format Article
id doaj-art-91ff1f68e6724f54acd9decb10d3869f
institution Kabale University
issn 2169-3536
language English
publishDate 2025-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj-art-91ff1f68e6724f54acd9decb10d3869f
Record updated: 2025-08-20T03:51:29Z | Language: eng | Publisher: IEEE | Journal: IEEE Access | ISSN: 2169-3536 | Published: 2025-01-01 | Vol. 13, pp. 120345-120359 | DOI: 10.1109/ACCESS.2025.3585637 | Article no. 11071302
Title: A Hybrid Large Language Model for Context-Aware Document Ranking in Telecommunication Data
Authors:
- Abhay Bindle (https://orcid.org/0000-0003-0994-4246), ECE Department, MMDU, Mullana, India
- Preeti Singla, CSE Department, MMDU, Mullana, India
- Sachin Sharma (https://orcid.org/0009-0001-8177-9225), State Bank of India, Panchkula, India
- Abdukodir Khakimov (https://orcid.org/0000-0003-2362-3270), Institute of Computer Science and Telecommunications, RUDN University, Moscow, Russia
- Reem Ibrahim Alkanhel (https://orcid.org/0000-0001-6395-4723), Department of Information Technology, College of Computer and Information Sciences, Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia
- Ammar Muthanna (https://orcid.org/0000-0003-0213-8145), Institute of Computer Science and Telecommunications, RUDN University, Moscow, Russia
Online access: https://ieeexplore.ieee.org/document/11071302/
Subjects: BM25; document ranking; information retrieval; large language models; semantic similarity; telecommunication
title A Hybrid Large Language Model for Context-Aware Document Ranking in Telecommunication Data
topic BM25
document ranking
information retrieval
large language models
semantic similarity
telecommunication
url https://ieeexplore.ieee.org/document/11071302/