An Improved Small Target Segmentation Model Based on Mask Dino
To address the issue of low segmentation accuracy for small objects in the Mask Dino segmentation method, we propose an improved small object segmentation model called FFMask Dino. Initially, we introduce scaled cosine attention and the log-cpb method into the Swin Transformer backbone network. Subs...
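The two backbone changes named in the abstract, scaled cosine attention and log-spaced continuous position bias (log-CPB), were introduced with Swin Transformer V2. The sketch below is a minimal NumPy illustration of those two ideas only, not the authors' FFMask Dino code. The function names, the default `tau`, and the single-head, unbatched layout are assumptions made for illustration. In Swin Transformer V2 the temperature is learned per head and the log-spaced offsets are passed through a small MLP to produce the bias, which is omitted here.

```python
import numpy as np


def scaled_cosine_attention(q, k, v, tau=0.1, bias=None):
    """Single-head attention using cosine similarity instead of a dot product.

    q, k, v: arrays of shape (n_tokens, dim).
    tau: temperature (a learnable scalar in practice; fixed here for illustration).
    bias: optional (n_tokens, n_tokens) position bias added to the logits.
    """
    # Normalise each row so that q @ k.T becomes a cosine similarity.
    qn = q / (np.linalg.norm(q, axis=-1, keepdims=True) + 1e-12)
    kn = k / (np.linalg.norm(k, axis=-1, keepdims=True) + 1e-12)
    # Swin Transformer V2 keeps the temperature above 0.01.
    logits = qn @ kn.T / max(tau, 0.01)
    if bias is not None:
        logits = logits + bias
    # Numerically stable softmax over the key axis.
    logits = logits - logits.max(axis=-1, keepdims=True)
    weights = np.exp(logits)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights


def log_spaced_coords(delta):
    """Map relative offsets to log space: sign(d) * log(1 + |d|)."""
    delta = np.asarray(delta, dtype=float)
    return np.sign(delta) * np.log1p(np.abs(delta))
```

Because queries and keys are normalised, the attention weights do not change when the queries are rescaled, which is the property that makes the logits less sensitive to activation magnitude.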
| Main Authors: | Jun Yang, Xu Chen, Yun Guan, Yixuan Hu, Gang Ge |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-02-01 |
| Series: | Applied Sciences |
| Subjects: | deep learning; small targets; Mask Dino; Swin Transformer; multi-scale |
| Online Access: | https://www.mdpi.com/2076-3417/15/4/1832 |
| _version_ | 1850081465690226688 |
|---|---|
| author | Jun Yang, Xu Chen, Yun Guan, Yixuan Hu, Gang Ge |
| author_facet | Jun Yang, Xu Chen, Yun Guan, Yixuan Hu, Gang Ge |
| author_sort | Jun Yang |
| collection | DOAJ |
| description | To address the low segmentation accuracy of the Mask Dino method on small objects, we propose an improved small-object segmentation model called FFMask Dino. First, we introduce scaled cosine attention and the log-CPB method into the Swin Transformer backbone network. Next, by adjusting the network structure, we enhance the feature extraction process, which helps the model maintain generalization across different datasets and reduces the risk of overfitting. Finally, we propose the FFPN module to optimize the pathways for feature fusion and transmission. The improved FPN reduces unnecessary computation, speeds up inference, and integrates multi-scale feature details with high-level semantic information to complement object features, thereby improving segmentation accuracy. Experimental results show that the improved model achieves a mean Intersection over Union (mIoU) of 42.15% on the ADE20K dataset for semantic segmentation, a 0.96% increase over the Mask Dino method. On the COCO dataset for instance segmentation, with the Swin Transformer backbone, the Mask AP and Box AP are 47.10 and 52.60, respectively, improvements of 1% and 1.3% over the Mask Dino method. With the ResNet-50 backbone, the Mask AP and Box AP are 40.00 and 44.10, respectively, improvements of 0.5% and 0.9% over the Mask Dino method. For panoptic segmentation on the COCO dataset, with the Swin Transformer backbone, the PQ is 54.95, a 0.4% increase over the Mask Dino method. With the ResNet-50 backbone, the PQ is 46.93, a 0.9% increase over the Mask Dino method. These results demonstrate that the improved model segments small objects more accurately than Mask Dino across a range of segmentation tasks. |
| format | Article |
| id | doaj-art-3d8b2f5a6d064b4abf42e5951b53a13c |
| institution | DOAJ |
| issn | 2076-3417 |
| language | English |
| publishDate | 2025-02-01 |
| publisher | MDPI AG |
| record_format | Article |
| series | Applied Sciences |
| spelling | doaj-art-3d8b2f5a6d064b4abf42e5951b53a13c; 2025-08-20T02:44:43Z; eng; MDPI AG; Applied Sciences; ISSN 2076-3417; 2025-02-01; vol. 15, no. 4, art. 1832; DOI 10.3390/app15041832; An Improved Small Target Segmentation Model Based on Mask Dino; Jun Yang (School of Computer and Information Engineering, Jiangxi Agricultural University, Nanchang 330045, China); Xu Chen (School of Software, Jiangxi Agricultural University, Nanchang 330045, China); Yun Guan (School of Software, Jiangxi Agricultural University, Nanchang 330045, China); Yixuan Hu (School of Software, Jiangxi Agricultural University, Nanchang 330045, China); Gang Ge (School of Computer and Information Engineering, Jiangxi Agricultural University, Nanchang 330045, China); https://www.mdpi.com/2076-3417/15/4/1832; deep learning; small targets; Mask Dino; Swin Transformer; multi-scale |
| title | An Improved Small Target Segmentation Model Based on Mask Dino |
| topic | deep learning; small targets; Mask Dino; Swin Transformer; multi-scale |
| url | https://www.mdpi.com/2076-3417/15/4/1832 |
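
The description reports semantic segmentation quality as mean Intersection over Union (mIoU). As a reminder of how that metric is commonly computed, here is a minimal sketch built on a confusion matrix. The function name `mean_iou` and the choice to skip classes absent from both prediction and ground truth are assumptions for illustration. Benchmark evaluation code for datasets such as ADE20K also handles ignore labels, which this sketch does not.

```python
import numpy as np


def mean_iou(pred, target, num_classes):
    """Mean IoU over classes that appear in the prediction or the ground truth.

    pred, target: integer label arrays of the same shape, values in [0, num_classes).
    """
    pred = np.asarray(pred).ravel()
    target = np.asarray(target).ravel()
    # Confusion matrix: rows are ground-truth classes, columns are predictions.
    conf = np.bincount(
        target * num_classes + pred, minlength=num_classes ** 2
    ).reshape(num_classes, num_classes)
    intersection = np.diag(conf)
    union = conf.sum(axis=0) + conf.sum(axis=1) - intersection
    # Skip classes that occur in neither array to avoid dividing by zero.
    valid = union > 0
    return float((intersection[valid] / union[valid]).mean())
```

For example, with ground truth `[0, 0, 1, 1]` and prediction `[0, 1, 1, 1]`, the per-class IoU values are 1/2 and 2/3, so the mean is 7/12.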