Leveraging network uncertainty to identify regions in rectal cancer clinical target volume auto-segmentations likely requiring manual edits

Bibliographic Details
Main Authors: Federica C. Maruccio, Rita Simões, Joëlle E. van Aalst, Charlotte L. Brouwer, Jan-Jakob Sonke, Peter van Ooijen, Tomas M. Janssen
Format: Article
Language: English
Published: Elsevier 2025-04-01
Series: Physics and Imaging in Radiation Oncology
Online Access: http://www.sciencedirect.com/science/article/pii/S2405631625000764
Description
Summary:

Background and Purpose: While Deep Learning (DL) auto-segmentation has the potential to improve segmentation efficiency in the radiotherapy workflow, manual adjustments of the predictions are still required. Network uncertainty quantification has been proposed as a quality assurance tool to ensure an efficient segmentation workflow. However, the interpretation is often complicated due to various sources of uncertainty interacting non-trivially. In this work, we compared network predictions with both independent manual segmentations and manual corrections of the predictions. We assume that manual corrections only address clinically relevant errors and are therefore associated with lower aleatoric uncertainty due to less inter-observer variability. We expect the remaining epistemic uncertainty to be a better predictor of segmentation corrections.

Materials and Methods: We considered DL auto-segmentations of the mesorectum clinical target volume. Uncertainty maps of nnU-Net outputs were generated using Monte Carlo dropout. On a global level, we investigated the correlation between mean network uncertainty and network segmentation performance. On a local level, we compared the uncertainty envelope width with the length of the error from both independent contours and corrected predictions. The uncertainty envelope widths were used to classify the error lengths as above or below a predefined threshold.

Results: We achieved an AUC above 0.9 in identifying regions manually corrected with edits larger than 8 mm, while the AUC for inconsistencies with the independent contours was significantly lower at approximately 0.7.

Conclusions: Our results validate the hypothesis that epistemic uncertainty estimates are a valuable tool to capture regions likely requiring clinically relevant edits.
ISSN: 2405-6316
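
Illustrative note: the local analysis in the summary uses the uncertainty envelope width as a score to classify whether the error length at a contour location exceeds a threshold, and reports the result as an AUC. The record does not give the authors' implementation, so the following is only a minimal sketch of how such an AUC could be computed. The function names, the per-location array layout, and the sample values are assumptions made for illustration; only the 8 mm threshold comes from the summary.

```python
import numpy as np


def auc_from_scores(scores, labels):
    """AUC as the probability that a random positive outscores a random
    negative, counting ties as one half (Mann-Whitney formulation)."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    pos = scores[labels]
    neg = scores[~labels]
    if pos.size == 0 or neg.size == 0:
        raise ValueError("both classes must be present to compute an AUC")
    # Pairwise comparison of every positive score with every negative score.
    diff = pos[:, None] - neg[None, :]
    wins = (diff > 0).sum() + 0.5 * (diff == 0).sum()
    return float(wins / (pos.size * neg.size))


def envelope_auc(envelope_width_mm, error_length_mm, threshold_mm=8.0):
    """Hypothetical helper: label a location positive when its error length
    exceeds threshold_mm, then score it by the uncertainty envelope width."""
    labels = np.asarray(error_length_mm, dtype=float) > threshold_mm
    return auc_from_scores(envelope_width_mm, labels)


if __name__ == "__main__":
    # Made-up values for four contour locations, not data from the article.
    widths = [1.0, 2.0, 6.0, 9.0]   # uncertainty envelope width (mm)
    errors = [0.0, 3.0, 10.0, 12.0]  # error length (mm)
    print(envelope_auc(widths, errors))
```

In this sketch a wide envelope at locations with large errors pushes the AUC toward 1, while an envelope width unrelated to the error length gives a value near 0.5. The pairwise comparison is quadratic in the number of locations, which is acceptable for a small illustration; a rank-based computation would scale better.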