Generating Parathyroid Reports Using YOLO-Based Large Language Models

Bibliographic Details
Main Authors: Chuan-Yu Chang, Abida Khanum, Chiao-Yin Sun, Ying-Ting Chen, Yu-Chen Tsai, Chih-Chin Hsu
Format: Article
Language: English
Published: IEEE 2025-01-01
Series: IEEE Access
Online Access: https://ieeexplore.ieee.org/document/11112831/
Description
Summary: Different tissues and structures reflect ultrasound waves differently, which produces grayscale images. Radiologists review thousands of medical images daily and write detailed reports. In parathyroid ultrasound imaging, these reports must go beyond routine observations and provide a comprehensive analysis of the lesion’s type, size, and exact location. This information forms the foundation for accurate diagnoses and personalized treatment plans, underscoring the vital role of radiologists in modern medical care. This study presents a novel system for automatically generating medical reports from parathyroid ultrasound images. Through lesion recognition and localization, it focuses on critical diagnostic parameters such as size, type, structure, location, homogeneity, and echogenicity. The proposed system, medical report generation for parathyroid ultrasound (MrGPU-GPT), integrates advanced image processing and natural language generation to streamline clinical workflows and improve diagnostic accuracy. MrGPU-GPT employs a modified YOLOv7-based architecture, YOLO-BiFPN, to detect and localize lesions within the ultrasound images. Key lesion characteristics, including homogeneity and echogenicity, are then extracted and used for prompt engineering. The resulting prompts are fed into a fine-tuned Vicuna language model, which generates detailed medical reports. The system provides annotated ultrasound images with bounding boxes together with comprehensive textual descriptions in the form of medical reports, significantly reducing the reporting burden on radiologists. Combining image-based diagnostic insights with natural language processing minimizes errors caused by human factors such as inexperience or fatigue and offers robust support for radiologists. The resulting system improves the efficiency and reliability of clinical workflows, delivering accurate diagnostic information while improving patient care outcomes. Experimental results show that, under K-fold validation, the system achieved mAP values of 93.32%, 91.04%, and 99.65% for lesion classification, object detection, and position-marker recognition, respectively. For medical report generation, the proposed method produced reports with higher similarity scores than MiniGPT, as measured by cosine similarity, ROUGE, and BLEU and supported by physician feedback. MrGPU-GPT was deployed on a local computer, where analyzing a parathyroid image and generating a medical report takes about 3-5 seconds.
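The summary describes turning detector output (bounding boxes plus lesion characteristics) into a prompt for the language model. The following is a minimal sketch of what such a prompt-construction step might look like. It is not the authors' implementation: the `LesionFinding` fields, the `build_prompt` wording, the `mm_per_pixel` calibration parameter, and the sample values are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class LesionFinding:
    """Hypothetical container for one detected lesion (not the paper's schema)."""
    lesion_type: str                        # class label from the detector
    box: tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels
    location: str                           # e.g. derived from a position marker
    homogeneity: str
    echogenicity: str
    confidence: float


def box_size_mm(box, mm_per_pixel):
    """Convert a pixel bounding box to (width, height) in millimetres."""
    x1, y1, x2, y2 = box
    return (round((x2 - x1) * mm_per_pixel, 1),
            round((y2 - y1) * mm_per_pixel, 1))


def build_prompt(findings, mm_per_pixel):
    """Assemble a plain-text prompt listing each finding (assumed wording)."""
    lines = ["Write a parathyroid ultrasound report for the findings below."]
    if not findings:
        lines.append("No lesion detected.")
    for i, f in enumerate(findings, start=1):
        width, height = box_size_mm(f.box, mm_per_pixel)
        lines.append(
            f"Lesion {i}: type={f.lesion_type}; size={width} x {height} mm; "
            f"location={f.location}; homogeneity={f.homogeneity}; "
            f"echogenicity={f.echogenicity}; confidence={f.confidence:.2f}"
        )
    return "\n".join(lines)


# Illustrative values only; these are not data from the study.
example = LesionFinding(
    lesion_type="adenoma",
    box=(100.0, 120.0, 180.0, 160.0),
    location="left inferior",
    homogeneity="homogeneous",
    echogenicity="hypoechoic",
    confidence=0.93,
)
prompt = build_prompt([example], mm_per_pixel=0.1)
```

Keeping the prompt as structured key=value text is one possible design choice; it makes each extracted attribute easy to audit before the text reaches the language model.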
ISSN: 2169-3536
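The summary names cosine similarity as one of the metrics used to compare generated reports with references. The record does not say how the reports were vectorized, so the sketch below assumes a simple bag-of-words count vector; the function names and the tokenization rule are illustrative, not taken from the paper.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(text_a, text_b):
    """Cosine similarity of bag-of-words count vectors (assumed vectorization)."""
    vec_a = Counter(tokenize(text_a))
    vec_b = Counter(tokenize(text_b))
    # Dot product over the tokens the two texts share.
    dot = sum(vec_a[t] * vec_b[t] for t in vec_a.keys() & vec_b.keys())
    norm_a = math.sqrt(sum(c * c for c in vec_a.values()))
    norm_b = math.sqrt(sum(c * c for c in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text has no direction to compare
    return dot / (norm_a * norm_b)
```

A score near 1 means the generated and reference reports use nearly the same words in similar proportions; a score of 0 means they share no tokens. An embedding-based variant would replace the count vectors but keep the same formula.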