Data-Efficient Bone Segmentation Using Feature Pyramid-Based SegFormer

The semantic segmentation of bone structures demands pixel-level classification accuracy to create reliable bone models for diagnosis. While Convolutional Neural Networks (CNNs) are commonly used for segmentation, they often struggle with complex shapes due to their focus on texture features and lim...

Bibliographic Details
Main Authors:	Naohiro Masuda, Keiko Ono, Daisuke Tawara, Yusuke Matsuura, Kentaro Sakabe
Format:	Article
Language:	English
Published:	MDPI AG 2024-12-01
Series:	Sensors
Subjects:	feature pyramid network; Mask2Former; SegFormer; semantic segmentation; transformer block
Online Access:	https://www.mdpi.com/1424-8220/25/1/81

_version_	1850116875693850624
author	Naohiro Masuda Keiko Ono Daisuke Tawara Yusuke Matsuura Kentaro Sakabe
author_facet	Naohiro Masuda Keiko Ono Daisuke Tawara Yusuke Matsuura Kentaro Sakabe
author_sort	Naohiro Masuda
collection	DOAJ
description	The semantic segmentation of bone structures demands pixel-level classification accuracy to create reliable bone models for diagnosis. While Convolutional Neural Networks (CNNs) are commonly used for segmentation, they often struggle with complex shapes due to their focus on texture features and limited ability to incorporate positional information. As orthopedic surgery increasingly requires precise automatic diagnosis, we explored SegFormer, an enhanced Vision Transformer model that better handles spatial awareness in segmentation tasks. However, SegFormer’s effectiveness is typically limited by its need for extensive training data, which is particularly challenging in medical imaging, where obtaining labeled ground truths (GTs) is a costly and resource-intensive process. In this paper, we propose two models and their combination to enable accurate feature extraction from smaller datasets by improving SegFormer. Specifically, these include the data-efficient model, which deepens the hierarchical encoder by adding convolution layers to transformer blocks and increases feature map resolution within transformer blocks, and the FPN-based model, which enhances the decoder through a Feature Pyramid Network (FPN) and attention mechanisms. Testing our model on spine images from the Cancer Imaging Archive and our own hand and wrist dataset, ablation studies confirmed that our modifications outperform the original SegFormer, U-Net, and Mask2Former. These enhancements enable better image feature extraction and more precise object contour detection, which is particularly beneficial for medical imaging applications with limited training data.
format	Article
id	doaj-art-ca9366701cc74229a2fde5c4dab3c2af
institution	OA Journals
issn	1424-8220
language	English
publishDate	2024-12-01
publisher	MDPI AG
record_format	Article
series	Sensors
spelling	doaj-art-ca9366701cc74229a2fde5c4dab3c2af; 2025-08-20T02:36:12Z; eng; MDPI AG; Sensors; 1424-8220; 2024-12-01; 25; 1; 81; 10.3390/s25010081; Data-Efficient Bone Segmentation Using Feature Pyramid-Based SegFormer; Naohiro Masuda (0); Keiko Ono (1); Daisuke Tawara (2); Yusuke Matsuura (3); Kentaro Sakabe (4); (0) Master’s Program in Information and Computer Science, Doshisha University, Kyoto 610-0394, Japan; (1) Department of Intelligent Information Engineering and Sciences, Doshisha University, Kyoto 610-0394, Japan; (2) Department of Advanced Science and Technology, Ryukoku University, Kyoto 520-2194, Japan; (3) Department of Orthopedic Surgery, Chiba University, Chiba 260-8677, Japan; (4) Master’s Program in Information and Computer Science, Doshisha University, Kyoto 610-0394, Japan; The semantic segmentation of bone structures demands pixel-level classification accuracy to create reliable bone models for diagnosis. While Convolutional Neural Networks (CNNs) are commonly used for segmentation, they often struggle with complex shapes due to their focus on texture features and limited ability to incorporate positional information. As orthopedic surgery increasingly requires precise automatic diagnosis, we explored SegFormer, an enhanced Vision Transformer model that better handles spatial awareness in segmentation tasks. However, SegFormer’s effectiveness is typically limited by its need for extensive training data, which is particularly challenging in medical imaging, where obtaining labeled ground truths (GTs) is a costly and resource-intensive process. In this paper, we propose two models and their combination to enable accurate feature extraction from smaller datasets by improving SegFormer. Specifically, these include the data-efficient model, which deepens the hierarchical encoder by adding convolution layers to transformer blocks and increases feature map resolution within transformer blocks, and the FPN-based model, which enhances the decoder through a Feature Pyramid Network (FPN) and attention mechanisms. Testing our model on spine images from the Cancer Imaging Archive and our own hand and wrist dataset, ablation studies confirmed that our modifications outperform the original SegFormer, U-Net, and Mask2Former. These enhancements enable better image feature extraction and more precise object contour detection, which is particularly beneficial for medical imaging applications with limited training data.; https://www.mdpi.com/1424-8220/25/1/81; feature pyramid network; Mask2Former; SegFormer; semantic segmentation; transformer block
spellingShingle	Naohiro Masuda Keiko Ono Daisuke Tawara Yusuke Matsuura Kentaro Sakabe Data-Efficient Bone Segmentation Using Feature Pyramid-Based SegFormer Sensors feature pyramid network Mask2Former SegFormer semantic segmentation transformer block
title	Data-Efficient Bone Segmentation Using Feature Pyramid-Based SegFormer
title_full	Data-Efficient Bone Segmentation Using Feature Pyramid-Based SegFormer
title_fullStr	Data-Efficient Bone Segmentation Using Feature Pyramid-Based SegFormer
title_full_unstemmed	Data-Efficient Bone Segmentation Using Feature Pyramid-Based SegFormer
title_short	Data-Efficient Bone Segmentation Using Feature Pyramid-Based SegFormer
title_sort	data efficient bone segmentation using feature pyramid based segformer
topic	feature pyramid network Mask2Former SegFormer semantic segmentation transformer block
url	https://www.mdpi.com/1424-8220/25/1/81
work_keys_str_mv	AT naohiromasuda dataefficientbonesegmentationusingfeaturepyramidbasedsegformer AT keikoono dataefficientbonesegmentationusingfeaturepyramidbasedsegformer AT daisuketawara dataefficientbonesegmentationusingfeaturepyramidbasedsegformer AT yusukematsuura dataefficientbonesegmentationusingfeaturepyramidbasedsegformer AT kentarosakabe dataefficientbonesegmentationusingfeaturepyramidbasedsegformer


Similar Items