Fine-grained entity disambiguation through numeric pattern awareness in transformer models
Abstract: Knowledge base question answering systems rely on entity linking to connect textual mentions in natural language with corresponding entities in a structured knowledge base. While conventional methods perform well in static or generic scenarios, they often struggle in dynamic contexts that require fine-grained disambiguation between specific instances. This limitation is particularly evident in domains such as military situation reports, where accurate identification of individual instances is essential for timely and precise information retrieval. This work introduces a novel framework for instance-level entity linking that enables transformer-based language models to distinguish between similarly described but distinct instances of the same entity. The proposed framework integrates three core innovations: (1) a context-aware masking strategy that reduces over-reliance on surface-level entity mentions by encouraging the model to leverage surrounding contextual cues, (2) a multi-task learning component that improves the model’s understanding of relative spatial and temporal information, and (3) a position-aware representation technique that restructures numerical data into token sequences that preserve structural integrity, addressing inherent limitations of subword tokenization in current language models. Experimental evaluations on simulated Korean military reports demonstrate substantial performance gains, particularly in challenging cases involving ambiguous or overlapping entity descriptions. The enhanced model shows a markedly improved ability to differentiate between instances based on subtle temporal and geospatial cues. These advancements extend beyond entity linking, offering broader insights into how transformer-based models can more effectively interpret numerical information. This research establishes a new direction for precise entity disambiguation in dynamic environments and contributes practical methods for enhancing language model performance across critical domains, including medical diagnostics, financial analysis, scientific research, and autonomous systems. In particular, the techniques can potentially help smaller large language models close the numerical-reasoning gap with larger models without requiring extensive computational resources.
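The context-aware masking strategy described in the abstract can be pictured with a minimal sketch (not the authors' published code): during training, an entity's surface mention is replaced by mask tokens with some probability, so the encoder must disambiguate from surrounding temporal and spatial cues rather than from the name itself. The span format, masking probability, and mask-token string below are illustrative assumptions.

```python
import random

def mask_mention(tokens: list[str], mention_span: tuple[int, int],
                 p_mask: float = 0.5, mask_token: str = "[MASK]") -> list[str]:
    # Illustrative assumption: mention_span is a half-open [start, end) range
    # of token indices covering the entity's surface form.
    start, end = mention_span
    if random.random() < p_mask:
        # Hide the mention so the model must lean on context (times, grids, verbs).
        return tokens[:start] + [mask_token] * (end - start) + tokens[end:]
    return tokens

tokens = "At 06:30 Tank-A12 crossed grid 3847".split()
print(mask_mention(tokens, mention_span=(2, 3), p_mask=1.0))
# ['At', '06:30', '[MASK]', 'crossed', 'grid', '3847']
```

Keeping `p_mask` below 1.0 would let the model still see raw mentions part of the time, which is one way to reduce over-reliance without discarding the mention signal entirely.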
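The multi-task component could plausibly be realized as a shared encoder feeding a main entity-linking head plus an auxiliary head that classifies relative spatial or temporal relations between mention pairs. The PyTorch sketch below is a speculative illustration; the head shapes, three-way relation label set, and loss weighting are assumptions, not the paper's specification.

```python
import torch
import torch.nn as nn

class MultiTaskLinker(nn.Module):
    """Two heads over one encoder output: instance linking plus relation prediction."""
    def __init__(self, hidden: int = 768, n_entities: int = 500, n_relations: int = 3):
        super().__init__()
        self.linking_head = nn.Linear(hidden, n_entities)        # main task: which instance?
        self.relation_head = nn.Linear(hidden * 2, n_relations)  # aux: e.g. earlier/same/later

    def forward(self, mention_vec, pair_vec=None):
        link_logits = self.linking_head(mention_vec)
        rel_logits = self.relation_head(pair_vec) if pair_vec is not None else None
        return link_logits, rel_logits

# Joint training would combine the two objectives, e.g.
#   loss = ce(link_logits, entity_labels) + lam * ce(rel_logits, relation_labels)
# where lam weights the auxiliary spatial/temporal signal.
```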
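The position-aware numeric representation addresses the fact that subword tokenizers split numbers arbitrarily (e.g., "3847" might become "38" + "47"), discarding place value. A common digit-plus-place-value encoding, shown below as a hedged approximation of the idea, makes each digit's power of ten explicit; the `<num>` delimiters and `@` marker syntax are assumptions, not the paper's notation.

```python
import re

def encode_number(match: re.Match) -> str:
    # '3847' -> '<num> 3@3 8@2 4@1 7@0 </num>': each digit is paired with
    # its power-of-ten position so numeric structure survives tokenization.
    digits = match.group(0)
    n = len(digits)
    body = " ".join(f"{d}@{n - i - 1}" for i, d in enumerate(digits))
    return f"<num> {body} </num>"

def position_aware(text: str) -> str:
    # Rewrite every integer in the text into the digit/place-value form.
    return re.sub(r"\d+", encode_number, text)

print(position_aware("12 vehicles at grid 3847"))
# <num> 1@1 2@0 </num> vehicles at grid <num> 3@3 8@2 4@1 7@0 </num>
```

With such a scheme the marker tokens would typically be added to the model's vocabulary so they are never split further.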
| Main Authors: | Jaeeun Jang, Sangmin Kim, Howon Moon, Sang Heon Shin (AI Research Team, Hanwha Systems); Mira Yun, Charles Wiseman (Department of Computer Science, Boston College) |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Springer, 2025-05-01 |
| Series: | Complex & Intelligent Systems |
| ISSN: | 2199-4536, 2198-6053 |
| Subjects: | Entity Linking; Knowledge Base Question Answering; Instance-level Disambiguation; Numerical Information Recognition; Multi-task Learning |
| Online Access: | https://doi.org/10.1007/s40747-025-01936-3 |