A Real-Time Road Scene Semantic Segmentation Model Based on Spatial Context Learning

To address the issues of high computational complexity and insufficient aggregation of global and local information in existing image segmentation methods, this paper proposes an efficient segmentation model based on Spatial Context Learning, named SCLSeg. The main idea is to aggregate local regions...

Bibliographic Details
Main Authors: Xiaomei Xiao, Jialiang Tang, Xiaoyan Lu, Zhengyong Feng, Yi Li
Format: Article
Language: English
Published: IEEE 2024-01-01
Series: IEEE Access
Subjects: Real-time semantic segmentation; spatial context guidance; feature attention; feature alignment
Online Access: https://ieeexplore.ieee.org/document/10759633/
author Xiaomei Xiao
Jialiang Tang
Xiaoyan Lu
Zhengyong Feng
Yi Li
collection DOAJ
description To address the high computational complexity of existing image segmentation methods and their insufficient aggregation of global and local information, this paper proposes an efficient segmentation model based on Spatial Context Learning, named SCLSeg. The main idea is to aggregate local regions into higher-level semantic regions in a learnable manner. The proposed Spatial Context Guided Feature Alignment module (SC-FA) learns aligned features from the image level down to local regions, exploring and integrating contextual information. During training, a multi-scale strategy is used to group semantic regions, and a Channel Aggregation Block (CAB) is designed to dynamically capture semantic groups through a mechanism of feature separation and fusion, thereby aggregating multi-level pixel features to generate the final segmentation results. A boundary loss is further introduced to improve the accuracy of segmentation edges. To meet real-time processing requirements, a series of lightweight strategies and simplified structures are adopted to reduce computational cost, including lightweight encoding, channel compression, and a simplified neck. The method reaches 76.45% mIoU at 237 FPS on the Cityscapes test set and 73.95% mIoU at 300.4 FPS on the CamVid test set.
format Article
id doaj-art-935feab7933a4dc6ad58840959beb071
institution Kabale University
issn 2169-3536
language English
publishDate 2024-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj-art-935feab7933a4dc6ad58840959beb071
2025-01-16T00:01:37Z
Language: eng
Publisher: IEEE
Journal: IEEE Access (ISSN 2169-3536)
Published: 2024-01-01, vol. 12, pp. 178495-178506
DOI: 10.1109/ACCESS.2024.3503676
Document: 10759633
Title: A Real-Time Road Scene Semantic Segmentation Model Based on Spatial Context Learning
Authors:
Xiaomei Xiao (https://orcid.org/0009-0000-0326-4448)
Jialiang Tang (https://orcid.org/0009-0003-1387-1403)
Xiaoyan Lu
Zhengyong Feng (https://orcid.org/0009-0005-7291-1684)
Yi Li
Affiliations (listed in author order):
School of Electronic Information Engineering, Electronic Information Processing Engineering Technology Research Center, China West Normal University, Nanchong, China
School of Electronic Information Engineering, Electronic Information Processing Engineering Technology Research Center, China West Normal University, Nanchong, China
School of Electronic Information Engineering, Electronic Information Processing Engineering Technology Research Center, China West Normal University, Nanchong, China
School of Electronic Information Engineering, Electronic Information Processing Engineering Technology Research Center, China West Normal University, Nanchong, China
College of Physics and Engineering Technology, Chengdu Normal University, Chengdu, China
URL: https://ieeexplore.ieee.org/document/10759633/
Keywords: Real-time semantic segmentation; spatial context guidance; feature attention; feature alignment
title A Real-Time Road Scene Semantic Segmentation Model Based on Spatial Context Learning
topic Real-time semantic segmentation
spatial context guidance
feature attention
feature alignment
url https://ieeexplore.ieee.org/document/10759633/