Pre‐trained SAM as data augmentation for image segmentation

Bibliographic Details
Main Authors: Junjun Wu, Yunbo Rao, Shaoning Zeng, Bob Zhang
Format: Article
Language: English
Published: Wiley, 2025-02-01
Series: CAAI Transactions on Intelligence Technology
Subjects:
Online Access: https://doi.org/10.1049/cit2.12381
_version_ 1849324008222228480
author Junjun Wu
Yunbo Rao
Shaoning Zeng
Bob Zhang
author_facet Junjun Wu
Yunbo Rao
Shaoning Zeng
Bob Zhang
author_sort Junjun Wu
collection DOAJ
description Abstract Data augmentation plays an important role in training deep neural models by expanding the size and diversity of the dataset. Initially, data augmentation mainly involved simple image transformations. Later, to increase the diversity and complexity of data, more advanced methods appeared and evolved into sophisticated generative models. However, these methods required massive computation for training or searching. In this paper, a novel training‐free method that utilises the pre‐trained Segment Anything Model (SAM) as a data augmentation tool (PTSAM‐DA) is proposed to generate augmented annotations for images. Without the need for training, it obtains prompt boxes from the original annotations and then feeds the boxes to the pre‐trained SAM to generate diverse and improved annotations. In this way, annotations are augmented more ingeniously than by simple manipulations, without incurring the huge computation of training a data augmentation model. Multiple comparative experiments are conducted on three datasets: an in‐house dataset, ADE20K and COCO2017. On the in‐house dataset, namely the Agricultural Plot Segmentation Dataset, maximum improvements of 3.77% and 8.92% are gained in two mainstream metrics, mIoU and mAcc, respectively. Consequently, large vision models like SAM are proven to be promising not only in image segmentation but also in data augmentation.
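The pipeline the abstract describes (derive a prompt box from each existing annotation, then feed it to the pre‐trained SAM) might be sketched roughly as below. This is an illustrative reconstruction, not the authors' code: the helper name `mask_to_prompt_box` and its `pad` parameter are assumptions; only the commented-out `SamPredictor` calls reflect the real `segment-anything` API.

```python
import numpy as np

def mask_to_prompt_box(mask: np.ndarray, pad: int = 0) -> np.ndarray:
    """Derive an [x0, y0, x1, y1] prompt box from a binary annotation mask.

    `pad` (hypothetical) loosens the box by a few pixels, one plausible way
    to get *diverse* augmented annotations from the same ground truth.
    """
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        raise ValueError("mask contains no foreground pixels")
    h, w = mask.shape
    x0 = max(int(xs.min()) - pad, 0)
    y0 = max(int(ys.min()) - pad, 0)
    x1 = min(int(xs.max()) + pad, w - 1)
    y1 = min(int(ys.max()) + pad, h - 1)
    return np.array([x0, y0, x1, y1])

# The box would then prompt a frozen SAM (requires a downloaded checkpoint):
# from segment_anything import SamPredictor, sam_model_registry
# predictor = SamPredictor(sam_model_registry["vit_b"](checkpoint="sam_vit_b.pth"))
# predictor.set_image(image)  # HxWx3 uint8 RGB image
# masks, scores, _ = predictor.predict(
#     box=mask_to_prompt_box(gt_mask), multimask_output=False)
```

Since SAM is only run for inference, no augmentation model is trained, which is the training‐free property the abstract emphasises.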
format Article
id doaj-art-cd82b1dbf933483ea7a85828b01e0669
institution Kabale University
issn 2468-2322
language English
publishDate 2025-02-01
publisher Wiley
record_format Article
series CAAI Transactions on Intelligence Technology
spelling doaj-art-cd82b1dbf933483ea7a85828b01e0669 2025-08-20T03:48:51Z eng Wiley CAAI Transactions on Intelligence Technology 2468-2322 2025-02-01, vol. 10, no. 1, pp. 268-282, doi:10.1049/cit2.12381
Pre‐trained SAM as data augmentation for image segmentation
Junjun Wu: Yangtze Delta Region Institute (Huzhou), University of Electronic Science and Technology of China, Huzhou, China
Yunbo Rao: School of Information and Software Engineering, University of Electronic Science and Technology of China, Chengdu, China
Shaoning Zeng: Yangtze Delta Region Institute (Huzhou), University of Electronic Science and Technology of China, Huzhou, China
Bob Zhang: Pattern Analysis and Machine Intelligence Research Group, Department of Computer and Information Science, University of Macau, Macau, China
https://doi.org/10.1049/cit2.12381
data augmentation; image segmentation; large model; segment anything model
spellingShingle Junjun Wu
Yunbo Rao
Shaoning Zeng
Bob Zhang
Pre‐trained SAM as data augmentation for image segmentation
CAAI Transactions on Intelligence Technology
data augmentation
image segmentation
large model
segment anything model
title Pre‐trained SAM as data augmentation for image segmentation
title_full Pre‐trained SAM as data augmentation for image segmentation
title_fullStr Pre‐trained SAM as data augmentation for image segmentation
title_full_unstemmed Pre‐trained SAM as data augmentation for image segmentation
title_short Pre‐trained SAM as data augmentation for image segmentation
title_sort pre trained sam as data augmentation for image segmentation
topic data augmentation
image segmentation
large model
segment anything model
url https://doi.org/10.1049/cit2.12381
work_keys_str_mv AT junjunwu pretrainedsamasdataaugmentationforimagesegmentation
AT yunborao pretrainedsamasdataaugmentationforimagesegmentation
AT shaoningzeng pretrainedsamasdataaugmentationforimagesegmentation
AT bobzhang pretrainedsamasdataaugmentationforimagesegmentation