Enhancing Fracture Detection in Remote Settings: Evaluating the Efficacy of FIXUS AI Deep Learning Algorithms in Identifying Fifth Metatarsal Fractures Using Mixed-Quality X-rays

Bibliographic Details
Main Authors: Atta Taseh MD, Alireza Gholipour PhD, Mani Eftekhari, Alireza Ebrahimi MD, Alexandra F. Flaherty MD, MS, Sumner Jones, Varun Nukala, Gregory R. Waryasz MD, Daniel Guss MD, MBA, John Y. Kwon MD, Christopher W. DiGiovanni MD, Lorena Bejarano-Pineda MD, Soheil Ashkani-Esfahani MD
Format: Article
Language: English
Published: SAGE Publishing 2024-12-01
Series: Foot & Ankle Orthopaedics
Online Access: https://doi.org/10.1177/2473011424S00269
Description
Summary: Category: Midfoot/Forefoot; Trauma

Introduction/Purpose: Diagnosing fractures can be challenging in some medical settings because of limited expertise or time. While deep learning has shown promising results, its use is constrained by image quality and by the difficulty of importing images into the models. This study aims to develop a model that detects fifth metatarsal fractures from cell phone photos of radiographs taken directly from a regular screen.

Methods: A retrospective case-control study was conducted including patients aged > 18 years with fifth metatarsal fractures (n=1240) (Fx) and healthy controls (n=1224) (NoF). Three-view radiographs (anterior, posterior, and lateral) were obtained from the Electronic Health Record (EHR) in PNG format. To generate a mixed-quality dataset, Android and iOS smartphones (SP) were used to create two separate datasets for each of the Fx and NoF groups. Two separate deep learning models on each of the EHR, SP, and combined datasets were developed using the Inception V3 architecture (Figure 1). The models were also tested on a separate SP dataset (SP-test) that was not included in the development process. The Area Under the Receiver Operating Characteristic Curve (AUC) and other performance metrics were calculated and reported. Continuous data were presented as median (interquartile range), and a p-value of < 0.05 was considered significant.

Results: Baseline analysis revealed differences between the groups, with a median age of 56 years (36-68) for the Fx group and 62 years (51-72) for the NoF group (p < 0.001). The racial composition of the groups also differed (Fx: 84.8% white; NoF: 92.2% white; p < 0.001). Initially, the SP model showed the best performance (Youden Index (YI): 0.92, AUC: 0.99), followed by the EHR (YI: 0.74, AUC: 0.96) and combined (YI: 0.52, AUC: 0.97) models. When tested on the SP-test dataset, the EHR model's performance dropped markedly, showing a YI of 0.33, an AUC of 0.78, and a sensitivity of 0.49 (Table 1). However, the SP and combined models continued to perform well (YI: 0.94, AUC: 0.99; YI: 0.78, AUC: 0.98, respectively).

Conclusion: This study highlights the crucial role of image quality in developing deep learning models for detecting fifth metatarsal fractures. Our findings demonstrate markedly reduced performance of the EHR model in identifying fractures within lower-quality images. This emphasizes the need to train algorithms on images of varying quality to create more generalizable models capable of operating effectively across diverse settings.
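Note on the metrics: the Results report the Youden Index and AUC without defining them. The sketch below uses the standard definitions (Youden's J = sensitivity + specificity - 1; AUC as the probability that a randomly chosen fracture image scores higher than a randomly chosen control). It is a minimal illustration, not the authors' evaluation code. The function names and the toy labels and scores are hypothetical. The derived specificity is not reported in the abstract; it follows only if the YI and sensitivity for the EHR model on SP-test were measured at the same decision threshold, which is an assumption.

```python
def youden_index(sensitivity, specificity):
    """Youden's J statistic: J = sensitivity + specificity - 1."""
    return sensitivity + specificity - 1.0


def implied_specificity(j, sensitivity):
    """Rearranged definition: specificity = J - sensitivity + 1."""
    return j - sensitivity + 1.0


def auc_from_scores(labels, scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case gets the higher score; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Values reported for the EHR model on SP-test: YI 0.33, sensitivity 0.49.
# Assumption: both were taken at the same decision threshold.
ehr_spec = implied_specificity(0.33, 0.49)
print(round(ehr_spec, 2))  # 0.84

# Hypothetical toy data (1 = fracture, 0 = no fracture), not from the study.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
toy_auc = auc_from_scores(toy_labels, toy_scores)
```

Under that assumption, the EHR model on phone photos missed about half of the fractures (sensitivity 0.49) while still labelling most controls correctly, which is consistent with a low YI alongside a moderate AUC.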
ISSN:2473-0114