Fine-grained entity disambiguation through numeric pattern awareness in transformer models
Abstract: Knowledge base question answering systems rely on entity linking to connect textual mentions in natural language with corresponding entities in a structured knowledge base. While conventional methods perform well in static or generic scenarios, they often struggle in dynamic contexts that require fine-grained disambiguation between specific instances. This limitation is particularly evident in domains such as military situation reports, where accurate identification of individual instances is essential for timely and precise information retrieval. This work introduces a novel framework for instance-level entity linking that enables transformer-based language models to distinguish between similarly described but distinct instances of the same entity. The proposed framework integrates three core innovations: (1) a context-aware masking strategy that reduces over-reliance on surface-level entity mentions by encouraging the model to leverage surrounding contextual cues, (2) a multi-task learning component that improves the model’s understanding of relative spatial and temporal information, and (3) a position-aware representation technique that restructures numerical data into token sequences that preserve structural integrity, addressing inherent limitations of subword tokenization in current language models. Experimental evaluations on simulated Korean military reports demonstrate substantial performance gains, particularly in challenging cases involving ambiguous or overlapping entity descriptions. The enhanced model shows a markedly improved ability to differentiate between instances based on subtle temporal and geospatial cues. These advancements extend beyond entity linking, offering broader insights into how transformer-based models can more effectively interpret numerical information. This research establishes a new direction for precise entity disambiguation in dynamic environments and contributes practical methods for enhancing language model performance across critical domains, including medical diagnostics, financial analysis, scientific research, and autonomous systems. In particular, the techniques can potentially help smaller large language models close the numerical-reasoning gap with larger models without requiring extensive computational resources.
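The context-aware masking strategy described in the abstract can be pictured with a minimal sketch (not the authors' published code): during training, an entity's surface mention is replaced by mask tokens with some probability, so the encoder must disambiguate from surrounding temporal and spatial cues rather than from the name itself. The span format, masking probability, and mask-token string below are illustrative assumptions.

```python
import random

def mask_mention(tokens: list[str], mention_span: tuple[int, int],
                 p_mask: float = 0.5, mask_token: str = "[MASK]") -> list[str]:
    # Illustrative assumption: mention_span is a half-open [start, end) range
    # of token indices covering the entity's surface form.
    start, end = mention_span
    if random.random() < p_mask:
        # Hide the mention so the model must lean on context (times, grids, verbs).
        return tokens[:start] + [mask_token] * (end - start) + tokens[end:]
    return tokens

tokens = "At 06:30 Tank-A12 crossed grid 3847".split()
print(mask_mention(tokens, mention_span=(2, 3), p_mask=1.0))
# ['At', '06:30', '[MASK]', 'crossed', 'grid', '3847']
```

Keeping `p_mask` below 1.0 would let the model still see raw mentions part of the time, which is one way to reduce over-reliance without discarding the mention signal entirely.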
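The multi-task component could plausibly be realized as a shared encoder feeding a main entity-linking head plus an auxiliary head that classifies relative spatial or temporal relations between mention pairs. The PyTorch sketch below is a speculative illustration; the head shapes, three-way relation label set, and loss weighting are assumptions, not the paper's specification.

```python
import torch
import torch.nn as nn

class MultiTaskLinker(nn.Module):
    """Two heads over one encoder output: instance linking plus relation prediction."""
    def __init__(self, hidden: int = 768, n_entities: int = 500, n_relations: int = 3):
        super().__init__()
        self.linking_head = nn.Linear(hidden, n_entities)        # main task: which instance?
        self.relation_head = nn.Linear(hidden * 2, n_relations)  # aux: e.g. earlier/same/later

    def forward(self, mention_vec, pair_vec=None):
        link_logits = self.linking_head(mention_vec)
        rel_logits = self.relation_head(pair_vec) if pair_vec is not None else None
        return link_logits, rel_logits

# Joint training would combine the two objectives, e.g.
#   loss = ce(link_logits, entity_labels) + lam * ce(rel_logits, relation_labels)
# where lam weights the auxiliary spatial/temporal signal.
```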
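The position-aware numeric representation addresses the fact that subword tokenizers split numbers arbitrarily (e.g., "3847" might become "38" + "47"), discarding place value. A common digit-plus-place-value encoding, shown below as a hedged approximation of the idea, makes each digit's power of ten explicit; the `<num>` delimiters and `@` marker syntax are assumptions, not the paper's notation.

```python
import re

def encode_number(match: re.Match) -> str:
    # '3847' -> '<num> 3@3 8@2 4@1 7@0 </num>': each digit is paired with
    # its power-of-ten position so numeric structure survives tokenization.
    digits = match.group(0)
    n = len(digits)
    body = " ".join(f"{d}@{n - i - 1}" for i, d in enumerate(digits))
    return f"<num> {body} </num>"

def position_aware(text: str) -> str:
    # Rewrite every integer in the text into the digit/place-value form.
    return re.sub(r"\d+", encode_number, text)

print(position_aware("12 vehicles at grid 3847"))
# <num> 1@1 2@0 </num> vehicles at grid <num> 3@3 8@2 4@1 7@0 </num>
```

With such a scheme the marker tokens would typically be added to the model's vocabulary so they are never split further.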
| Main Authors: | Jaeeun Jang, Sangmin Kim, Howon Moon, Sang Heon Shin (AI Research Team, Hanwha Systems); Mira Yun, Charles Wiseman (Department of Computer Science, Boston College) |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Springer, 2025-05-01 |
| Series: | Complex & Intelligent Systems |
| ISSN: | 2199-4536, 2198-6053 |
| Subjects: | Entity Linking; Knowledge Base Question Answering; Instance-level Disambiguation; Numerical Information Recognition; Multi-task Learning |
| Online Access: | https://doi.org/10.1007/s40747-025-01936-3 |