A deep learning aided bone marrow segmentation of quantitative fat MRI for myelofibrosis patients

Bibliographic Details
Main Authors: Humera Tariq, Lubomir Hadjiiski, Dariya Malyarenko, Moshe Talpaz, Kristen Pettit, Gary D. Luker, Brian D. Ross, Thomas L. Chenevert
Format: Article
Language: English
Published: Frontiers Media S.A. 2025-05-01
Series: Frontiers in Oncology
Online Access: https://www.frontiersin.org/articles/10.3389/fonc.2025.1498832/full
Description
Summary:
Purpose: To automate bone marrow segmentation within pelvic bones in quantitative fat MRI of myelofibrosis (MF) patients using deep-learning (DL) U-Net models.
Methods: Automated segmentation of bone marrow (BM) was evaluated for four U-Net models: 2D U-Net, 2D attention U-Net (2D A-U-Net), 3D U-Net and 3D attention U-Net (3D A-U-Net). An experienced annotator delineated the boundaries of BM regions on in-phase (IP) pelvic MRI slices within two pelvic bones: proximal femur and posterior ilium. The dataset, comprising volumetric images of 58 MF patients, was split into training (32), validation (6) and test (20) sub-sets. Model performance was assessed using conventional metrics: average Jaccard Index (AJI), average Volume Error (AVE), average Hausdorff Distance (AHD), and average Volume Intersection Ratio (VIR). Iterative model optimization was performed by maximizing validation sub-set AJI. Wilcoxon’s rank sum test with a Bonferroni-corrected significance threshold of p<0.003 was used to compare DL segmentation models on the test sub-set.
Results: 2D segmentation models performed best for iliac BM, achieving scores of 95-96% for VIR and 87-88% for AJI agreement with expert annotations on the test set. Similar performance was observed for femoral BM segmentation, with slightly better VIR but worse AJI agreement for U-Net (94% and 86%) versus A-U-Net (92% and 87%). 2D models also exhibited lower AVE variability (8-9%) and ilium AHD (16 mm). The 3D segmentation models showed marginally higher errors (AHD of 19-20 mm for ilium and 10-12% AVE-SD for both bones) and generally lower agreement scores (VIR of 91-93% for ilium and 89-91% for femur, with 85-86% AJI). Pairwise comparison across the four U-Nets for three metrics (AHD, AJI, AVE) showed that AJI and AHD performance was not significantly different for 3D U-Net versus 3D A-U-Net or for 2D U-Net versus 2D A-U-Net. Except for AVE, 2D versus 3D model differences were significant in both bones for the majority of performance metric comparisons (p<0.001).
Conclusion: All four tested U-Net models effectively automated BM segmentation in pelvic MRI of MF patients. The 2D A-U-Net was found to be best overall for BM segmentation in both femur and ilium.
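The summary names four segmentation metrics but this record does not define them. The sketch below illustrates one common way to compute each for a single case, with masks represented as sets of pixel coordinates. The formulas for the volume intersection ratio (intersection divided by reference volume) and the volume error (signed relative difference against the reference), the use of whole masks rather than boundary points for the Hausdorff distance, and all function names and the `spacing` parameter are assumptions made for illustration, not the authors' implementation.

```python
import math

def jaccard_index(seg, ref):
    """|seg ∩ ref| / |seg ∪ ref| for two sets of pixel/voxel coordinates."""
    union = seg | ref
    return len(seg & ref) / len(union) if union else 1.0

def volume_intersection_ratio(seg, ref):
    """Fraction of the reference covered by the segmentation (assumed definition).
    Assumes a non-empty reference mask."""
    return len(seg & ref) / len(ref)

def volume_error(seg, ref):
    """Signed relative volume difference (ref - seg) / ref (assumed definition).
    Assumes a non-empty reference mask."""
    return (len(ref) - len(seg)) / len(ref)

def hausdorff_distance(seg, ref, spacing=(1.0, 1.0)):
    """Symmetric Hausdorff distance in physical units (brute force).
    `spacing` is the hypothetical pixel size along each axis, e.g. in mm."""
    a = [tuple(c * s for c, s in zip(p, spacing)) for p in seg]
    b = [tuple(c * s for c, s in zip(p, spacing)) for p in ref]

    def directed(src, dst):
        # Largest distance from any point in src to its nearest point in dst.
        return max(min(math.dist(p, q) for q in dst) for p in src)

    return max(directed(a, b), directed(b, a))

# Toy example: a 4x4 reference square and the same square shifted one column.
ref = {(r, c) for r in range(4) for c in range(4)}
seg = {(r, c) for r in range(4) for c in range(1, 5)}

aji = jaccard_index(seg, ref)              # 12 shared / 20 in union
vir = volume_intersection_ratio(seg, ref)  # 12 shared / 16 reference
ave = volume_error(seg, ref)               # equal areas
ahd = hausdorff_distance(seg, ref, spacing=(1.0, 2.0))
print(aji, vir, ave, ahd)
```

In this toy case the two masks have identical area, so the volume error is zero even though the overlap is imperfect, which is one reason overlap and distance metrics are usually reported alongside volume error. Averaging each per-case value over the test cases would give figures analogous to the AJI, VIR, AVE and AHD quoted in the summary.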
ISSN: 2234-943X