Private Data Incrementalization: Data-Centric Model Development for Clinical Liver Segmentation
| Main Authors: | , , , , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-05-01 |
| Series: | Bioengineering |
| Subjects: | |
| Online Access: | https://www.mdpi.com/2306-5354/12/5/530 |
| Summary: | Machine Learning models, specifically Artificial Neural Networks, are transforming medical imaging by enabling precise liver segmentation, a crucial task for diagnosing and treating liver diseases. However, these models often struggle to adapt to diverse clinical data sources, because differences in dataset volume, resolution, and origin affect generalization and performance. This study introduces *Private Data Incrementalization*, a data-centric approach that enhances the adaptability of Artificial Neural Networks by progressively exposing them to varied clinical data. Because the study does not aim to propose a new image segmentation model, it uses four existing medical image segmentation models: U-Net, ResUNet++, Fully Convolutional Network, and a modified algorithm based on the Conditional Bernoulli Diffusion Model. The four models are evaluated on a curated private dataset of computed tomography scans from Coimbra University Hospital, supplemented by two public datasets, 3D-IRCADb01 and CHAOS. The *Private Data Incrementalization* method systematically increases the volume and diversity of the training data, simulating real-world conditions in which models must handle varied imaging contexts. Pre-processing and post-processing stages, incremental training, and performance evaluations show that structured exposure to diverse datasets improves segmentation performance. ResUNet++ achieves the highest accuracy (0.9972), the highest Dice Similarity Coefficient (0.9449), and the best Average Symmetric Surface Distance (0.0053 mm), which demonstrates the importance of dataset diversity and volume for the robustness and generalization of segmentation models. *Private Data Incrementalization* thus offers a scalable strategy for building resilient segmentation models that address the variability inherent in clinical imaging data, ultimately benefiting clinical workflows, patient care, and healthcare resource management. |
| ISSN: | 2306-5354 |
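
The summary reports both accuracy and the Dice Similarity Coefficient for the segmentation results. As a minimal sketch of how these two metrics are commonly defined for binary masks (not the article's evaluation code), the example below computes both on small hand-made masks. The function names `dice_coefficient` and `pixel_accuracy`, the nested-list mask format, and the convention that two empty masks score a Dice of 1.0 are assumptions made for illustration.

```python
from typing import List, Sequence

# Assumed format: a 2D binary mask where 1 = liver pixel, 0 = background.
Mask = Sequence[Sequence[int]]


def _flatten(mask: Mask) -> List[int]:
    """Flatten a 2D mask into a list of 0/1 integers."""
    return [int(bool(value)) for row in mask for value in row]


def dice_coefficient(pred: Mask, truth: Mask) -> float:
    """Dice = 2 * |pred AND truth| / (|pred| + |truth|)."""
    p, t = _flatten(pred), _flatten(truth)
    if len(p) != len(t):
        raise ValueError("masks must contain the same number of pixels")
    intersection = sum(a & b for a, b in zip(p, t))
    total = sum(p) + sum(t)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


def pixel_accuracy(pred: Mask, truth: Mask) -> float:
    """Fraction of pixels (liver and background) labeled identically."""
    p, t = _flatten(pred), _flatten(truth)
    if len(p) != len(t):
        raise ValueError("masks must contain the same number of pixels")
    if not p:
        raise ValueError("masks must not be empty")
    return sum(a == b for a, b in zip(p, t)) / len(p)


# Hypothetical 4x4 example: the prediction misses half of a small liver region.
truth = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
pred = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]

print(round(dice_coefficient(pred, truth), 3))  # 0.667
print(pixel_accuracy(pred, truth))              # 0.875
```

In this toy case accuracy is higher than Dice because accuracy also rewards correctly labeled background pixels, while Dice only measures overlap of the foreground region. The same effect is one plausible reason an accuracy value can sit above the Dice value for the same model.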