Deep learning model for gastrointestinal polyp segmentation

Colorectal cancer is one of the leading causes of cancer-related mortality worldwide, and early identification greatly improves patient outcomes. Colonoscopy is a highly effective screening method, yet segmentation and detection remain challenging due to the heterogeneity a...

Bibliographic Details
Main Authors: Zitong Wang, Zeyi Wang, Pengyu Sun
Format: Article
Language: English
Published: PeerJ Inc. 2025-05-01
Series: PeerJ Computer Science
Online Access: https://peerj.com/articles/cs-2924.pdf
Description: Colorectal cancer is one of the leading causes of cancer-related mortality worldwide, and early identification greatly improves patient outcomes. Colonoscopy is a highly effective screening method, yet polyp segmentation and detection remain challenging because polyps are heterogeneous and readers’ interpretations of them vary. In this work, we introduce a novel deep learning architecture for gastrointestinal polyp segmentation on the Kvasir-SEG dataset. Our method employs an encoder-decoder structure with a pre-trained ConvNeXt model as the encoder to learn multi-scale feature representations. The feature maps are passed through a ConvNeXt Block and then through a decoder network consisting of three decoder blocks. Our key contribution is a cross-attention mechanism that creates shortcut connections between the decoder and encoder to maximize feature retention and reduce information loss. In addition, we introduce a Residual Transformer Block in the decoder that uses self-attention to learn long-range dependencies and enhance feature representations. We evaluate our model on the Kvasir-SEG dataset, achieving a Dice coefficient of 0.8715 and a mean intersection over union (mIoU) of 0.8021. Our method demonstrates state-of-the-art performance in gastrointestinal polyp segmentation and could feasibly be used as part of clinical pipelines to assist with automated detection and diagnosis of polyps.
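The Dice coefficient and mIoU quoted in the description are overlap measures between a predicted mask and a ground-truth mask. The following is a minimal sketch of the standard definitions, not the authors' evaluation code: this record does not say how the paper thresholds predictions or averages across images, so the per-image averaging, the smoothing term `eps`, and all function names here are assumptions made for illustration.

```python
from typing import Callable, Sequence, Tuple

# Binary mask as nested sequences: 1 = polyp pixel, 0 = background.
Mask = Sequence[Sequence[int]]


def _counts(pred: Mask, target: Mask) -> Tuple[int, int, int]:
    """Return (intersection, predicted positives, ground-truth positives)."""
    inter = pred_pos = target_pos = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            inter += p & t
            pred_pos += p
            target_pos += t
    return inter, pred_pos, target_pos


def dice(pred: Mask, target: Mask, eps: float = 1e-7) -> float:
    """Dice = 2|A ∩ B| / (|A| + |B|); eps (assumed) avoids division by zero."""
    inter, pred_pos, target_pos = _counts(pred, target)
    return (2 * inter + eps) / (pred_pos + target_pos + eps)


def iou(pred: Mask, target: Mask, eps: float = 1e-7) -> float:
    """IoU = |A ∩ B| / |A ∪ B|, with the same assumed smoothing term."""
    inter, pred_pos, target_pos = _counts(pred, target)
    union = pred_pos + target_pos - inter
    return (inter + eps) / (union + eps)


def mean_score(metric: Callable[[Mask, Mask], float],
               preds: Sequence[Mask], targets: Sequence[Mask]) -> float:
    """Average a per-image metric over a dataset (assumed averaging scheme)."""
    scores = [metric(p, t) for p, t in zip(preds, targets)]
    return sum(scores) / len(scores)


# Toy 2x3 masks: 2 overlapping pixels, 3 predicted, 3 in the ground truth.
example_pred = [[1, 1, 0], [0, 1, 0]]
example_target = [[1, 0, 0], [0, 1, 1]]
example_dice = dice(example_pred, example_target)  # about 2*2 / (3+3)
example_iou = iou(example_pred, example_target)    # about 2 / 4
```

For a single mask pair the two measures are related by Dice = 2·IoU / (1 + IoU), so Dice is never lower than IoU, which is consistent with the reported 0.8715 sitting above 0.8021. The identity does not hold exactly for dataset-level means.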
ISSN: 2376-5992
Volume: 11
Article number: e2924
DOI: 10.7717/peerj-cs.2924
Author affiliations:
Zitong Wang: Imperial College London, London, South Kensington, United Kingdom
Zeyi Wang: Queen Mary University of London, London, Bethnal Green, United Kingdom
Pengyu Sun: Xidian University, Xi’an, Shaanxi, China
Keywords: Gastrointestinal polyp; Image segmentation; Deep learning; Kvasir-SEG; Transformer