Automated Chest X-Ray Diagnosis Report Generation with Cross-Attention Mechanism

Bibliographic Details
Main Authors: Jian Zhao, Wei Yao, Lei Sun, Lijuan Shi, Zhejun Kuang, Changwu Wu, Qiulei Han
Format: Article
Language: English
Published: MDPI AG 2025-01-01
Series: Applied Sciences
Online Access: https://www.mdpi.com/2076-3417/15/1/343
Description
Summary: In medicine, using deep learning to automatically generate diagnostic reports for chest X-ray images is highly important, because it offers an effective way to handle the large volume of chest X-ray images that must be processed. During large-scale epidemics such as COVID-19, rapid and accurate screening and diagnosis of cases becomes an especially important task. This study uses deep learning to generate diagnostic reports for chest X-ray images automatically, which significantly reduces doctors' workload, lowers the risk of misdiagnosis and missed diagnosis, and provides technical support for improving public health emergency response. We propose a network architecture that addresses two limitations of traditional image description networks when generating chest X-ray diagnostic reports: the large disparity in area between abnormal and normal regions, and the lack of effective alignment between the image and text modalities. The convolutional block attention module (CBAM) is adopted to alleviate the resulting data bias through feature attention and to improve the model’s ability to recognize abnormal image regions. A cross-attention mechanism is adopted to improve the alignment between images and text, with the aim of ensuring the accuracy and reliability of the generated diagnostic report.
ISSN: 2076-3417
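
Illustrative note: the summary says that cross-attention is used to align images and text, but this record does not describe the article's actual architecture. The following is only a minimal sketch of generic scaled dot-product cross-attention, assuming that text tokens act as queries and image region features act as keys and values. The function names, the toy vectors, and the omission of learned projection matrices are all assumptions made for illustration, not details taken from the article.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(text_queries, image_keys, image_values):
    """Generic scaled dot-product cross-attention (hypothetical helper).

    text_queries: one vector per text token (assumed to be the queries)
    image_keys:   one vector per image region (assumed to be the keys)
    image_values: one vector per image region (assumed to be the values)
    Returns (outputs, weights): one attended vector per text token, and the
    attention weights each token places on each image region.
    """
    d_k = len(image_keys[0])
    scale = 1.0 / math.sqrt(d_k)
    value_dim = len(image_values[0])
    outputs, weights = [], []
    for q in text_queries:
        # Similarity between this text token and every image region.
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k))
                  for k in image_keys]
        w = softmax(scores)
        # Weighted sum of the image value vectors.
        out = [sum(w_j * v[c] for w_j, v in zip(w, image_values))
               for c in range(value_dim)]
        outputs.append(out)
        weights.append(w)
    return outputs, weights


# Toy data (made up): 2 text tokens attending over 3 image regions.
text_queries = [[1.0, 0.0], [0.0, 1.0]]
image_keys = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
image_values = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

outputs, weights = cross_attention(text_queries, image_keys, image_values)
for token_weights in weights:
    print([round(w, 3) for w in token_weights])
```

Each printed row shows how strongly one text token attends to each image region, and each row sums to 1. In a trained model the queries, keys, and values would come from learned linear projections of text and image features, which are left out here for brevity.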