Pretraining instance segmentation models with bounding box annotations

Bibliographic Details
Main Authors: Cathaoir Agnew, Eoin M. Grua, Pepijn Van de Ven, Patrick Denny, Ciarán Eising, Anthony Scanlan
Format: Article
Language: English
Published: Elsevier 2024-12-01
Series: Intelligent Systems with Applications
Subjects: Computer vision; Instance segmentation; Object detection; Supervised learning; Weak annotations
Online Access: http://www.sciencedirect.com/science/article/pii/S2667305324001285
author Cathaoir Agnew
Eoin M. Grua
Pepijn Van de Ven
Patrick Denny
Ciarán Eising
Anthony Scanlan
collection DOAJ
description Annotating datasets for fully supervised instance segmentation can be arduous and time-consuming, requiring significant effort and cost. Producing bounding box annotations instead substantially reduces this investment, but bounding box annotated data alone are not suitable for instance segmentation. This work uses ground truth bounding boxes to define coarsely annotated polygon masks, which we refer to as weak annotations, on which the models are pretrained. We investigate the effect of pretraining on data with weak annotations and then fine-tuning on data with strong annotations, that is, finely annotated polygon masks for instance segmentation. Experiments were conducted on the COCO 2017 detection dataset with three model architectures: SOLOv2, Mask R-CNN, and Mask2Former. The Cityscapes and Pascal VOC 2012 datasets were used to validate the approach. The empirical results suggest two key outcomes. First, a sequential approach to annotating large-scale instance segmentation datasets would be beneficial, enabling higher-performance models in shorter timeframes: bounding boxes are labeled first, followed by polygon masks. Second, object detection datasets can be leveraged for pretraining instance segmentation models while maintaining competitive results on the downstream task. With the best-performing model, Mask2Former with a Swin-L backbone, 97.5%, 100.4%, and 101.3% of the fully supervised performance was achieved using just 1%, 5%, and 10% of the COCO training set's instance segmentation annotations, respectively.
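The description says weak annotations are coarse polygon masks defined from ground truth bounding boxes. The following is a minimal Python sketch of one way this could be done, not the authors' procedure, which this record does not specify. It assumes COCO-style annotations (`bbox` as `[x, y, width, height]`, `segmentation` as a list of flat `[x1, y1, x2, y2, ...]` polygons) and assumes the coarse mask is simply the box rectangle. The function names `bbox_to_polygon` and `make_weak_annotation` are hypothetical.

```python
from typing import Any, Dict, List


def bbox_to_polygon(bbox: List[float]) -> List[float]:
    """Turn a COCO-style [x, y, width, height] box into a flat 4-corner polygon."""
    x, y, w, h = bbox
    # Corners in order: top-left, top-right, bottom-right, bottom-left.
    return [x, y, x + w, y, x + w, y + h, x, y + h]


def make_weak_annotation(ann: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of a detection annotation with a rectangular polygon mask.

    Assumption: the weak mask is exactly the bounding box rectangle.
    """
    weak = dict(ann)  # shallow copy so the input annotation is not modified
    weak["segmentation"] = [bbox_to_polygon(ann["bbox"])]
    weak["area"] = ann["bbox"][2] * ann["bbox"][3]
    return weak


# Illustrative annotation with made-up values.
example = {
    "id": 1,
    "image_id": 7,
    "category_id": 3,
    "bbox": [10.0, 20.0, 30.0, 40.0],
    "iscrowd": 0,
}
weak = make_weak_annotation(example)
print(weak["segmentation"])
```

Annotations produced this way could be used for the pretraining stage, with the finely annotated polygon masks reserved for fine-tuning, as the description outlines.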
format Article
id doaj-art-2dce2a4e8cb947569cc6144352c8d208
institution OA Journals
issn 2667-3053
language English
publishDate 2024-12-01
publisher Elsevier
record_format Article
series Intelligent Systems with Applications
spelling doaj-art-2dce2a4e8cb947569cc6144352c8d208
Timestamp: 2025-08-20T01:59:39Z
Language: eng
Publisher: Elsevier
Journal: Intelligent Systems with Applications
ISSN: 2667-3053
Published: 2024-12-01
Volume: 24
Article number: 200454
DOI: 10.1016/j.iswa.2024.200454
Title: Pretraining instance segmentation models with bounding box annotations
Authors: Cathaoir Agnew (corresponding author), Eoin M. Grua, Pepijn Van de Ven, Patrick Denny, Ciarán Eising, Anthony Scanlan
Affiliations (Cathaoir Agnew, Eoin M. Grua, Pepijn Van de Ven, Ciarán Eising, Anthony Scanlan): Data-Driven Intelligent Computer Engineering (D2iCE) Group, University of Limerick, Limerick, Ireland; Department Electronic & Computer Engineering, University of Limerick, Limerick, Ireland; CONFIRM Centre for Smart Manufacturing, University of Limerick, Limerick, Ireland
Affiliations (Patrick Denny): Data-Driven Intelligent Computer Engineering (D2iCE) Group, University of Limerick, Limerick, Ireland; Department Computer Science & Information Systems, University of Limerick, Limerick, Ireland
Online access: http://www.sciencedirect.com/science/article/pii/S2667305324001285
Keywords: Computer vision; Instance segmentation; Object detection; Supervised learning; Weak annotations
title Pretraining instance segmentation models with bounding box annotations
topic Computer vision
Instance segmentation
Object detection
Supervised learning
Weak annotations
url http://www.sciencedirect.com/science/article/pii/S2667305324001285