Impact of Radiologist Experience on AI Annotation Quality in Chest Radiographs: A Comparative Analysis
| Main Authors: | Malte Michel Multusch, Lasse Hansen, Mattias Paul Heinrich, Lennart Berkel, Axel Saalbach, Heinrich Schulz, Franz Wegner, Joerg Barkhausen, Malte Maria Sieren |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-03-01 |
| Series: | Diagnostics |
| Subjects: | annotation quality; interreader comparison; chest radiograph; AI research |
| Online Access: | https://www.mdpi.com/2075-4418/15/6/777 |
| _version_ | 1850205127146733568 |
|---|---|
| author | Malte Michel Multusch; Lasse Hansen; Mattias Paul Heinrich; Lennart Berkel; Axel Saalbach; Heinrich Schulz; Franz Wegner; Joerg Barkhausen; Malte Maria Sieren |
| author_sort | Malte Michel Multusch |
| collection | DOAJ |
| description | <b>Background/Objectives</b>: In the burgeoning field of medical imaging and Artificial Intelligence (AI), high-quality annotations for training AI models are crucial. However, large datasets remain scarce because segmentation is time-consuming and experts have limited time. This study investigates how radiologists' experience affects the quality of annotations. <b>Methods</b>: We randomly collected 53 anonymized chest radiographs. Fifteen readers with varying levels of expertise annotated anatomical structures of differing complexity, as well as pneumonic opacities and central venous catheters (CVCs) as examples of pathology and foreign material. The readers were divided into three groups of five: medical students (MS), junior professionals (JP) with less than five years of working experience, and senior professionals (SP) with more than five years of experience. Each annotation was compared to a gold standard consisting of a consensus annotation by three senior board-certified radiologists. We calculated the Dice coefficient (DSC) and Hausdorff distance (HD) to evaluate annotation quality. Inter- and intrareader variability and time dependencies were investigated using the Intraclass Correlation Coefficient (ICC) and Ordinary Least Squares (OLS) regression. <b>Results</b>: Senior professionals generally performed better, while medical students showed higher variability in their annotations. Significant differences were noted, especially for complex structures (DSC for pneumonic opacities as mean [standard deviation]: MS: 0.516 [0.246]; SP: 0.631 [0.211]). Notably, overall deviation and intraclass variance were higher for these structures even among seniors, highlighting the inherent limitations of conventional radiography. Experience showed a positive relationship with annotation quality for the VCS and lung but was not a significant factor for other structures. <b>Conclusions</b>: Experience level significantly impacts annotation quality. Senior radiologists provided higher-quality annotations for complex structures, while less experienced readers could still annotate simpler structures with satisfactory accuracy. We suggest a mixed-expertise approach, enabling highly experienced readers to apply their knowledge where it matters most. As examination volumes grow, radiology will increasingly rely on AI support tools. Economizing data acquisition and AI training, for example by integrating less experienced radiologists, will therefore help to meet the coming challenges. |
| format | Article |
| id | doaj-art-a9999aa0380749ea87ee79a595c82e43 |
| institution | OA Journals |
| issn | 2075-4418 |
| language | English |
| publishDate | 2025-03-01 |
| publisher | MDPI AG |
| record_format | Article |
| series | Diagnostics |
| spelling | doaj-art-a9999aa0380749ea87ee79a595c82e43 (2025-08-20T02:11:09Z); English; MDPI AG; Diagnostics (ISSN 2075-4418); 2025-03-01; vol. 15, iss. 6, art. 777; DOI: 10.3390/diagnostics15060777 |
| affiliations | Malte Michel Multusch, Lennart Berkel, Franz Wegner, Joerg Barkhausen, Malte Maria Sieren: Department of Radiology and Nuclear Medicine, UKSH, 23538 Lübeck, Germany; Lasse Hansen: EchoScout GmbH, 23562 Lübeck, Germany; Mattias Paul Heinrich: Institute of Medical Informatics, University of Lübeck, 23538 Lübeck, Germany; Axel Saalbach, Heinrich Schulz: Philips Innovative Technologies, 22335 Hamburg, Germany |
| title | Impact of Radiologist Experience on AI Annotation Quality in Chest Radiographs: A Comparative Analysis |
| topic | annotation quality; interreader comparison; chest radiograph; AI research |
| url | https://www.mdpi.com/2075-4418/15/6/777 |
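The two evaluation metrics named in the abstract, the Dice coefficient (DSC) and the Hausdorff distance (HD), can be sketched for binary segmentation masks as follows. This is not the authors' implementation, only a minimal NumPy illustration of the standard definitions; the toy masks `reader` and `gold` are invented for demonstration.

```python
import numpy as np

def dice_coefficient(a: np.ndarray, b: np.ndarray) -> float:
    """Dice similarity coefficient (DSC): 2|A∩B| / (|A|+|B|)."""
    a, b = a.astype(bool), b.astype(bool)
    total = a.sum() + b.sum()
    if total == 0:
        return 1.0  # both masks empty: define as perfect agreement
    return 2.0 * np.logical_and(a, b).sum() / total

def hausdorff_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Symmetric Hausdorff distance between two binary masks, in pixels."""
    pa = np.argwhere(a).astype(float)  # (row, col) foreground coordinates
    pb = np.argwhere(b).astype(float)
    # Pairwise Euclidean distances between all foreground pixels.
    d = np.sqrt(((pa[:, None, :] - pb[None, :, :]) ** 2).sum(axis=-1))
    # Max over each mask of the distance to the nearest point of the other.
    return max(d.min(axis=1).max(), d.min(axis=0).max())

# Toy example: a reader annotation vs. a gold-standard mask, both 4x4
# squares shifted by one pixel on a 10x10 image.
reader = np.zeros((10, 10), dtype=bool)
gold = np.zeros((10, 10), dtype=bool)
reader[2:6, 2:6] = True
gold[3:7, 3:7] = True

print(dice_coefficient(reader, gold))   # overlap 9 px, 2*9/(16+16) = 0.5625
print(hausdorff_distance(reader, gold))  # corner offset: sqrt(2)
```

A DSC of 1.0 means perfect overlap with the gold standard, while HD captures the worst-case boundary deviation, which is why the study reports both.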