Interobserver and intra-observer variability in bowel preparation scoring for colon capsule endoscopy: impact of AI-assisted assessment, interim analysis
| Main Authors: | , , , , , , , , , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Elsevier, 2025-07-01 |
| Series: | Clinical Medicine |
| Online Access: | http://www.sciencedirect.com/science/article/pii/S1470211825001393 |
| Summary: | Introduction: Colon capsule endoscopy (CCE) has gained prominence since the coronavirus disease 2019 (COVID-19) pandemic as a non-invasive alternative for lower gastrointestinal investigations. However, bowel cleansing remains challenging because CCE cannot suction, wash, or reposition for better mucosal visualisation. While interobserver variability in bowel preparation scoring is well documented in conventional colonoscopy, its impact on whole-video CCE assessment remains unclear. Objective: This CESCAIL sub-study aimed to evaluate interobserver agreement in CCE bowel cleansing assessment among readers, to assess agreement between AI-assisted and manual assessments, and to determine whether AI improves interobserver agreement. Materials and Methods: As part of the CESCAIL study, 25 completed videos were randomly selected from 673 CCE recordings. Nine readers with varying levels of CCE experience assessed bowel cleansing quality using the Leighton-Rex scale and the Colon Capsule CLEansing Assessment and Report (CC-CLEAR) score. Following a 6-month washout period, the same readers reassessed the videos using AI-assisted analysis with the CC-CLEAR score, to evaluate improvements in interobserver variability and changes in intraobserver variability between manual and AI-assisted readings. Interobserver variability was assessed using intraclass correlation coefficients (ICC) and bootstrapping with 1,000 iterations, together with Fleiss' kappa, while intraobserver variability was evaluated using Cohen's kappa and Bland-Altman analysis. Results and Discussion: The Leighton-Rex scale showed moderate reliability (ICC=0.55, 95% CI: 0.48–0.63), while CC-CLEAR demonstrated good reliability (ICC=0.89, 95% CI: 0.86–0.92), significantly reducing interobserver variability. Clinician agreement was poor (κ=0.0889), but AI-assisted scoring improved it to moderate levels (κ=0.3419, p=0.0098). The intraobserver agreement between AI-assisted and manual assessment showed moderate to excellent reliability (ICC=0.69–0.90). Cohen's kappa analysis revealed good agreement between manual and AI-assisted evaluation among experienced readers (κ=0.67–0.85) but only moderate agreement among less experienced readers (κ=0.47–0.61). However, Bland-Altman analysis showed that AI-assisted assessment consistently assigned lower bowel cleansing scores than manual reading. Conclusion: Interobserver agreement for CC-CLEAR was good, irrespective of the readers' experience levels. AI-assisted assessment significantly improved interobserver agreement, yielding more consistent and reproducible scoring. The moderate agreement between AI-assisted and manual assessments suggests that further optimisation is needed to improve alignment with manual scoring while preserving the strong interobserver agreement. Full study results are forthcoming. |
| ISSN: | 1470-2118 |
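The reliability analysis summarised above rests on intraclass correlation coefficients with a 1,000-iteration bootstrap. As an illustration only (this is not the study's actual code, and the study does not specify which ICC model it used), a minimal NumPy sketch of ICC(2,1) (two-way random effects, absolute agreement) with a percentile bootstrap over subjects might look like:

```python
import numpy as np

def icc2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    ratings: (n_subjects, k_raters) array of cleansing scores.
    """
    n, k = ratings.shape
    grand = ratings.mean()
    ss_total = ((ratings - grand) ** 2).sum()
    ss_rows = k * ((ratings.mean(axis=1) - grand) ** 2).sum()  # subjects (videos)
    ss_cols = n * ((ratings.mean(axis=0) - grand) ** 2).sum()  # raters (readers)
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )

def bootstrap_ci(ratings, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the ICC, resampling subjects with replacement."""
    rng = np.random.default_rng(seed)
    n = ratings.shape[0]
    stats = [icc2_1(ratings[rng.integers(0, n, n)]) for _ in range(n_boot)]
    lo, hi = np.percentile(stats, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return lo, hi
```

On a 25 × 9 matrix (25 videos, 9 readers, matching the sub-study's design), `icc2_1` returns the point estimate and `bootstrap_ci` a 95% interval; the variable names and synthetic dimensions are assumptions for illustration.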