Vision transformer-based diagnosis of lumbar disc herniation with grad-CAM interpretability in CT imaging

Bibliographic Details
Main Authors: Qingsong Chu, Xingyu Wang, Hao Lv, Yao Zhou, Ting Jiang
Format: Article
Language: English
Published: BMC 2025-04-01
Series: BMC Musculoskeletal Disorders
Online Access: https://doi.org/10.1186/s12891-025-08602-2
Description
Summary:
Background: This study proposes a computed tomography (CT)-vision transformer (ViT) framework for diagnosing lumbar disc herniation (LDH), which the authors describe as the first of its kind, combining the multidirectional advantages of CT with those of a ViT.
Methods: The ViT model was trained and validated on a dataset of 2100 CT images from 983 patients. Its performance was compared with that of several convolutional neural networks (CNNs), namely ResNet18, ResNet50, LeNet, AlexNet, and VGG16, on two primary tasks: vertebra localization and disc abnormality classification.
Results: Pairing a ViT with CT imaging allowed the model to capture complex spatial relationships and global dependencies within scans. It outperformed the CNN models, achieving accuracies of 97.13% for vertebra localization and 93.63% for disc abnormality classification. Gradient-weighted class activation mapping (Grad-CAM) was used to further validate the model, providing interpretable insight into the regions of the CT scans that contributed to its predictions.
Conclusion: The study demonstrates the potential of a ViT for diagnosing LDH from CT imaging. The results point to promising clinical applications, particularly for improving the diagnostic efficiency and transparency of medical AI systems.
ISSN:1471-2474
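
Note on Grad-CAM: The summary names gradient-weighted class activation mapping as the interpretability method but does not say how the heatmaps were computed. As a rough illustration only, the sketch below shows the standard Grad-CAM combination step: each feature map is weighted by the spatial average of its gradients, the weighted maps are summed, and a ReLU keeps the positive evidence. The function name `grad_cam`, the toy inputs, and the final scaling to [0, 1] are assumptions made for this example and are not taken from the article. For a ViT, the feature maps would typically be patch-token activations reshaped onto the image grid, which is likewise an assumption here.

```python
from typing import List

Matrix = List[List[float]]


def grad_cam(activations: List[Matrix], gradients: List[Matrix]) -> Matrix:
    """Combine K feature maps into one Grad-CAM heatmap (hypothetical helper).

    activations[k][i][j]: feature map k at spatial position (i, j)
    gradients[k][i][j]:   d(class score) / d(activations[k][i][j])
    """
    if len(activations) != len(gradients) or not activations:
        raise ValueError("need the same non-zero number of maps and gradients")
    height, width = len(activations[0]), len(activations[0][0])

    # Channel weight = global average of that channel's gradients.
    weights = [sum(sum(row) for row in g) / (height * width) for g in gradients]

    # Weighted sum over channels.
    heatmap = [[0.0] * width for _ in range(height)]
    for w, fmap in zip(weights, activations):
        for i in range(height):
            for j in range(width):
                heatmap[i][j] += w * fmap[i][j]

    # ReLU: keep only regions that push the class score up.
    heatmap = [[max(0.0, v) for v in row] for row in heatmap]

    # Scale to [0, 1] for display (an assumption of this sketch);
    # an all-zero map is returned unchanged.
    peak = max(max(row) for row in heatmap)
    if peak > 0:
        heatmap = [[v / peak for v in row] for row in heatmap]
    return heatmap


# Toy 2x2 example with two channels (made-up numbers, not study data).
toy_activations = [[[1.0, 2.0], [3.0, 4.0]], [[4.0, 3.0], [2.0, 1.0]]]
toy_gradients = [[[1.0, 1.0], [1.0, 1.0]], [[-1.0, -1.0], [-1.0, -1.0]]]
toy_heatmap = grad_cam(toy_activations, toy_gradients)
```

In practice the activations and gradients would come from a forward and backward pass through the trained model, and the heatmap would be upsampled and overlaid on the CT slice.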