An Improved Small Target Segmentation Model Based on Mask Dino

Bibliographic Details
Main Authors: Jun Yang, Xu Chen, Yun Guan, Yixuan Hu, Gang Ge
Format: Article
Language: English
Published: MDPI AG 2025-02-01
Series: Applied Sciences
Subjects: deep learning; small targets; Mask Dino; Swin Transformer; multi-scale
Online Access: https://www.mdpi.com/2076-3417/15/4/1832
collection DOAJ
description To address the low segmentation accuracy of the Mask Dino method on small objects, we propose an improved small-object segmentation model called FFMask Dino. First, we introduce scaled cosine attention and the log-spaced continuous position bias (log-CPB) method into the Swin Transformer backbone. Next, by adjusting the network structure, we enhance feature extraction, which helps the model generalize across datasets and reduces the risk of overfitting. Finally, we propose the FFPN module to optimize the pathways for feature fusion and transmission. This improved FPN reduces unnecessary computation, speeds up inference, and combines multi-scale feature details with high-level semantic information to complement object features, thereby improving segmentation accuracy. On the ADE20K dataset for semantic segmentation, the improved model achieves a mean Intersection over Union (mIoU) of 42.15%, a 0.96% increase over Mask Dino. On the COCO dataset for instance segmentation, with the Swin Transformer backbone, the Mask AP and Box AP are 47.10 and 52.60, improvements of 1% and 1.3% over Mask Dino; with the ResNet-50 backbone, they are 40.00 and 44.10, improvements of 0.5% and 0.9%. For panoptic segmentation on COCO, the PQ is 54.95 with the Swin Transformer backbone (a 0.4% increase over Mask Dino) and 46.93 with the ResNet-50 backbone (a 0.9% increase). These results indicate that the improved model segments small objects more accurately than Mask Dino across semantic, instance, and panoptic segmentation tasks.
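The description names scaled cosine attention and log-CPB without defining them. The sketch below illustrates both ideas as they are usually formulated for Swin Transformer V2: attention logits are the cosine similarity of query and key divided by a temperature, plus a position bias, and relative offsets are mapped to log space before a bias is derived from them. It is a plain-Python illustration under stated assumptions, not the authors' implementation: the function names, the fixed `tau`, and the caller-supplied `bias` matrix are hypothetical, and the small network that would normally produce the bias from the log-spaced offsets is omitted.

```python
import math


def log_spaced_offset(delta):
    """Map a linear relative offset to log space: sign(d) * log(1 + |d|)."""
    return math.copysign(math.log1p(abs(delta)), delta)


def cosine_similarity(a, b, eps=1e-12):
    """Cosine of the angle between two vectors; eps guards against zero norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / max(norm_a * norm_b, eps)


def softmax(row):
    """Numerically stable softmax over one row of logits."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_cosine_attention(queries, keys, values, tau=0.1, bias=None):
    """Attention with logits cos(q, k) / tau + bias[i][j].

    tau is a fixed scalar here (assumption); it is floored at 0.01 so the
    logits stay bounded. bias is an optional matrix supplied by the caller.
    """
    tau = max(tau, 0.01)
    dim = len(values[0])
    outputs = []
    for i, q in enumerate(queries):
        logits = [
            cosine_similarity(q, k) / tau + (bias[i][j] if bias else 0.0)
            for j, k in enumerate(keys)
        ]
        weights = softmax(logits)
        # Each output row is a convex combination of the value vectors.
        outputs.append(
            [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
        )
    return outputs
```

Because the logits depend only on the angle between query and key, multiplying every query by a constant leaves the output unchanged, which is the property that keeps attention values from growing with activation magnitude. The log mapping compresses large offsets, so a bias fitted on small windows extrapolates over a narrower numeric range when the window grows.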
id doaj-art-3d8b2f5a6d064b4abf42e5951b53a13c
institution DOAJ
issn 2076-3417
spelling doaj-art-3d8b2f5a6d064b4abf42e5951b53a13c; 2025-08-20T02:44:43Z; eng; MDPI AG; Applied Sciences; ISSN 2076-3417; 2025-02-01; vol. 15, no. 4, article 1832; DOI 10.3390/app15041832
affiliations Jun Yang: School of Computer and Information Engineering, Jiangxi Agricultural University, Nanchang 330045, China
Xu Chen: School of Software, Jiangxi Agricultural University, Nanchang 330045, China
Yun Guan: School of Software, Jiangxi Agricultural University, Nanchang 330045, China
Yixuan Hu: School of Software, Jiangxi Agricultural University, Nanchang 330045, China
Gang Ge: School of Computer and Information Engineering, Jiangxi Agricultural University, Nanchang 330045, China
topic deep learning
small targets
Mask Dino
Swin Transformer
multi-scale