Deep learning model for gastrointestinal polyp segmentation
Colorectal cancer is one of the leading causes of cancer-related mortality worldwide, and early identification greatly improves patient outcomes. Colonoscopy is a highly effective screening method, yet segmentation and detection remain challenging aspects due to the heterogeneity a...
| Main Authors: | Zitong Wang, Zeyi Wang, Pengyu Sun |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | PeerJ Inc., 2025-05-01 |
| Series: | PeerJ Computer Science |
| Subjects: | Gastrointestinal polyp; Image segmentation; Deep learning; Kvasir-SEG; Transformer |
| Online Access: | https://peerj.com/articles/cs-2924.pdf |
| _version_ | 1849689520865280000 |
|---|---|
| author | Zitong Wang; Zeyi Wang; Pengyu Sun |
| author_facet | Zitong Wang; Zeyi Wang; Pengyu Sun |
| author_sort | Zitong Wang |
| collection | DOAJ |
| description | Colorectal cancer is one of the leading causes of cancer-related mortality worldwide, and early identification greatly improves patient outcomes. Colonoscopy is a highly effective screening method, yet polyp detection and segmentation remain challenging because polyps are heterogeneous and readers’ interpretations of them vary. In this work, we introduce a novel deep learning architecture for gastrointestinal polyp segmentation on the Kvasir-SEG dataset. Our method employs an encoder-decoder structure with a pre-trained ConvNeXt model as the encoder to learn multi-scale feature representations. The feature maps are passed through a ConvNeXt Block and then through a decoder network consisting of three decoder blocks. Our key contribution is a cross-attention mechanism that creates shortcut connections between the encoder and the decoder to maximize feature retention and reduce information loss. In addition, we introduce a Residual Transformer Block in the decoder that uses self-attention to learn long-range dependencies and enhance feature representations. We evaluate our model on the Kvasir-SEG dataset, achieving a Dice coefficient of 0.8715 and a mean intersection over union (mIoU) of 0.8021. Our method demonstrates state-of-the-art performance in gastrointestinal polyp segmentation and the feasibility of its use in clinical pipelines to assist with the automated detection and diagnosis of polyps. |
| format | Article |
| id | doaj-art-cf53bd85157947299fd7b5434f2bef28 |
| institution | DOAJ |
| issn | 2376-5992 |
| language | English |
| publishDate | 2025-05-01 |
| publisher | PeerJ Inc. |
| record_format | Article |
| series | PeerJ Computer Science |
| spelling | doaj-art-cf53bd85157947299fd7b5434f2bef28; 2025-08-20T03:21:35Z; eng; PeerJ Inc.; PeerJ Computer Science; 2376-5992; 2025-05-01; 11; e2924; 10.7717/peerj-cs.2924; Deep learning model for gastrointestinal polyp segmentation; Zitong Wang (Imperial College London, London, South Kensington, United Kingdom); Zeyi Wang (Queen Mary University of London, London, Bethnal Green, United Kingdom); Pengyu Sun (Xidian University, Xi’an, Shaanxi, China); https://peerj.com/articles/cs-2924.pdf; Gastrointestinal polyp; Image segmentation; Deep learning; Kvasir-SEG; Transformer |
| spellingShingle | Zitong Wang; Zeyi Wang; Pengyu Sun; Deep learning model for gastrointestinal polyp segmentation; PeerJ Computer Science; Gastrointestinal polyp; Image segmentation; Deep learning; Kvasir-SEG; Transformer |
| title | Deep learning model for gastrointestinal polyp segmentation |
| title_full | Deep learning model for gastrointestinal polyp segmentation |
| title_fullStr | Deep learning model for gastrointestinal polyp segmentation |
| title_full_unstemmed | Deep learning model for gastrointestinal polyp segmentation |
| title_short | Deep learning model for gastrointestinal polyp segmentation |
| title_sort | deep learning model for gastrointestinal polyp segmentation |
| topic | Gastrointestinal polyp; Image segmentation; Deep learning; Kvasir-SEG; Transformer |
| url | https://peerj.com/articles/cs-2924.pdf |
| work_keys_str_mv | AT zitongwang deeplearningmodelforgastrointestinalpolypsegmentation AT zeyiwang deeplearningmodelforgastrointestinalpolypsegmentation AT pengyusun deeplearningmodelforgastrointestinalpolypsegmentation |
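
The description reports a Dice coefficient and a mean intersection over union (mIoU) but the record does not say how they were computed. The following is a minimal sketch of the standard definitions for binary segmentation masks, not the authors' evaluation code. The function names `dice_and_iou` and `mean_scores`, the nested-list mask format, and the smoothing constant `eps` are assumptions made for illustration; the article may average over images or over classes in a different way.

```python
def dice_and_iou(pred, target, eps=1e-7):
    """Return (Dice, IoU) for two binary masks given as nested lists of 0/1.

    Standard definitions (assumed, not taken from the article):
      Dice = 2 * |P & T| / (|P| + |T|)
      IoU  = |P & T| / |P | T|
    eps avoids division by zero when both masks are empty.
    """
    flat_pred = [p for row in pred for p in row]
    flat_target = [t for row in target for t in row]
    if len(flat_pred) != len(flat_target):
        raise ValueError("masks must contain the same number of pixels")

    intersection = sum(1 for p, t in zip(flat_pred, flat_target) if p and t)
    pred_area = sum(1 for p in flat_pred if p)
    target_area = sum(1 for t in flat_target if t)
    union = pred_area + target_area - intersection

    dice = (2 * intersection + eps) / (pred_area + target_area + eps)
    iou = (intersection + eps) / (union + eps)
    return dice, iou


def mean_scores(pairs):
    """Average Dice and IoU over an iterable of (pred, target) mask pairs."""
    scores = [dice_and_iou(pred, target) for pred, target in pairs]
    mean_dice = sum(d for d, _ in scores) / len(scores)
    mean_iou = sum(i for _, i in scores) / len(scores)
    return mean_dice, mean_iou


if __name__ == "__main__":
    predicted = [[1, 1, 0], [0, 1, 0]]
    reference = [[1, 0, 0], [0, 1, 1]]
    print(dice_and_iou(predicted, reference))
```

For a single pair of masks the two scores are linked by Dice = 2 * IoU / (1 + IoU), so Dice is never lower than IoU. That relation does not carry over exactly to averages taken across many images.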