Addressing Semantic Variability in Clinical Outcome Reporting Using Large Language Models

Bibliographic Details
Main Authors: Fatemeh Shah-Mohammadi, Joseph Finkelstein
Format: Article
Language:English
Published: MDPI AG 2024-10-01
Series:BioMedInformatics
Subjects:
Online Access:https://www.mdpi.com/2673-7426/4/4/116
Summary:<b>Background/Objectives</b>: Clinical trials frequently employ diverse terminologies and definitions to describe similar outcomes, leading to ambiguity and inconsistency in data interpretation. Addressing this variability in clinical outcome reports and integrating semantically similar outcomes is important in healthcare and clinical research. Variability in outcome reporting not only hinders the comparability of clinical trial results but also poses significant challenges in evidence synthesis, meta-analysis, and evidence-based decision-making. <b>Methods</b>: This study investigates variability reduction in outcome measure reporting using rule-based and large language model-based approaches, and aims to mitigate the challenges associated with variability in outcome reporting by comparing the two. The first, rule-based approach leverages well-known ontologies; the second exploits Sentence-Bidirectional Encoder Representations from Transformers (SBERT) to identify semantically similar outcomes, with a Generative Pre-trained Transformer (GPT) used to refine the results. <b>Results</b>: The results show that only a relatively small percentage of outcomes could be linked to established ontologies by the rule-based approach. Analysis of outcomes by word count highlighted the absence of ontological linkage for three-word outcomes, indicating potential gaps in semantic representation. <b>Conclusions</b>: This study demonstrates the ability of large language models (LLMs) to identify similar outcomes, even those longer than three words, suggesting that LLMs can play a crucial role in outcome harmonization efforts, potentially reducing redundancy and enhancing data interoperability.
ISSN:2673-7426
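The SBERT-based pipeline the abstract describes amounts to embedding each outcome string and grouping those whose embeddings are sufficiently similar. Below is a minimal, hedged sketch of that grouping step; it substitutes simple bag-of-words vectors for SBERT embeddings so it runs without any model download, and the `group_similar_outcomes` helper, the example outcome strings, and the 0.5 similarity threshold are illustrative assumptions, not details taken from the article.

```python
from collections import Counter
from math import sqrt

def embed(text):
    # Toy stand-in for an SBERT encoder: bag-of-words term counts.
    # A real pipeline would call SentenceTransformer.encode() instead.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def group_similar_outcomes(outcomes, threshold=0.5):
    """Greedily group outcome strings whose similarity to a group's
    first member meets the threshold (illustrative helper)."""
    groups = []
    for outcome in outcomes:
        vec = embed(outcome)
        for group in groups:
            if cosine(vec, embed(group[0])) >= threshold:
                group.append(outcome)
                break
        else:
            groups.append([outcome])
    return groups

# Hypothetical outcome names for illustration only.
outcomes = [
    "change in systolic blood pressure",
    "systolic blood pressure change",
    "overall survival",
]
print(group_similar_outcomes(outcomes))
```

With dense SBERT embeddings the same thresholded grouping applies, but semantically similar outcomes that share no surface words (e.g. synonyms) would also land in one group, which is precisely the gap the rule-based ontology lookup left for longer outcome phrases. In the study's pipeline, a GPT model then refines the candidate groups.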