Impact of Radiologist Experience on AI Annotation Quality in Chest Radiographs: A Comparative Analysis

Bibliographic Details
Main Authors: Malte Michel Multusch, Lasse Hansen, Mattias Paul Heinrich, Lennart Berkel, Axel Saalbach, Heinrich Schulz, Franz Wegner, Joerg Barkhausen, Malte Maria Sieren
Format: Article
Language: English
Published: MDPI AG, 2025-03-01
Series: Diagnostics
Subjects: annotation quality; interreader comparison; chest radiograph; AI research
Online Access: https://www.mdpi.com/2075-4418/15/6/777
Collection: DOAJ
Description: <b>Background/Objectives</b>: In the burgeoning field of medical imaging and Artificial Intelligence (AI), high-quality annotations for training AI models are crucial. However, large annotated datasets remain scarce, as segmentation is time-consuming and experts have limited time. This study investigates how the experience of radiologists affects the quality of annotations. <b>Methods</b>: We randomly collected 53 anonymized chest radiographs. Fifteen readers with varying levels of expertise annotated anatomical structures of different complexity, as well as pneumonic opacities and central venous catheters (CVCs) as examples of pathology and foreign material. The readers were divided into three groups of five: medical students (MS), junior professionals (JP) with less than five years of working experience, and senior professionals (SP) with more than five years of experience. Each annotation was compared to a gold standard consisting of a consensus annotation by three senior board-certified radiologists. We calculated the Dice coefficient (DSC) and Hausdorff distance (HD) to evaluate annotation quality. Inter- and intrareader variability and time dependencies were investigated using the Intraclass Correlation Coefficient (ICC) and Ordinary Least Squares (OLS) regression. <b>Results</b>: Senior professionals generally showed better performance, while medical students showed higher variability in their annotations. Significant differences were noted, especially for complex structures (DSC for pneumonic opacities as mean [standard deviation]: MS: 0.516 [0.246]; SP: 0.631 [0.211]). However, overall deviation and intraclass variance were higher for these structures even for seniors, highlighting the inherent limitations of conventional radiography. Experience showed a positive relationship with annotation quality for the VCS and lung but was not a significant factor for other structures.
<b>Conclusions</b>: Experience level significantly impacts annotation quality. Senior radiologists provided higher-quality annotations for complex structures, while less experienced readers could still annotate simpler structures with satisfactory accuracy. We suggest a mixed-expertise approach, enabling the most experienced readers to apply their knowledge where it matters most. With examination numbers rising, radiology will rely on AI support tools in the future. Economizing data acquisition and AI training, for example by integrating less experienced radiologists, will help to meet these coming challenges.
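The two evaluation metrics named in the Methods, the Dice coefficient (region overlap) and the Hausdorff distance (worst-case boundary disagreement), can be illustrated with a minimal NumPy sketch on toy binary masks. This is an illustrative implementation of the standard definitions, not the study's actual evaluation code; the mask shapes and values are invented for the example.

```python
import numpy as np

def dice_coefficient(a: np.ndarray, b: np.ndarray) -> float:
    """Dice similarity coefficient between two binary masks (1.0 = perfect overlap)."""
    a, b = a.astype(bool), b.astype(bool)
    denom = a.sum() + b.sum()
    if denom == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return 2.0 * np.logical_and(a, b).sum() / denom

def hausdorff_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Symmetric Hausdorff distance between the foreground pixels of two masks."""
    pa, pb = np.argwhere(a), np.argwhere(b)
    if len(pa) == 0 or len(pb) == 0:
        return float("inf")  # undefined when one mask has no foreground
    # pairwise Euclidean distances between all foreground pixel coordinates
    d = np.linalg.norm(pa[:, None, :] - pb[None, :, :], axis=-1)
    # largest distance from any point in one mask to its nearest point in the other
    return max(d.min(axis=1).max(), d.min(axis=0).max())

# toy example: two partially overlapping 3x3 squares in 5x5 masks
m1 = np.zeros((5, 5), dtype=int); m1[1:4, 1:4] = 1
m2 = np.zeros((5, 5), dtype=int); m2[2:5, 2:5] = 1
print(round(dice_coefficient(m1, m2), 3))   # → 0.444
print(round(hausdorff_distance(m1, m2), 3)) # → 1.414
```

The brute-force pairwise distance matrix is fine for small masks; for full-resolution radiographs one would typically use a distance-transform-based implementation instead.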
Record ID: doaj-art-a9999aa0380749ea87ee79a595c82e43
ISSN: 2075-4418
DOI: 10.3390/diagnostics15060777
Author Affiliations:
Malte Michel Multusch, Lennart Berkel, Franz Wegner, Joerg Barkhausen, Malte Maria Sieren: Department of Radiology and Nuclear Medicine, UKSH, 23538 Lübeck, Germany
Lasse Hansen: EchoScout GmbH, 23562 Lübeck, Germany
Mattias Paul Heinrich: Institute of Medical Informatics, University of Lübeck, 23538 Lübeck, Germany
Axel Saalbach, Heinrich Schulz: Philips Innovative Technologies, 22335 Hamburg, Germany