Pretraining instance segmentation models with bounding box annotations
Annotating datasets for fully supervised instance segmentation tasks can be arduous and time-consuming, requiring a significant effort and cost investment. Producing bounding box annotations instead constitutes a significant reduction in this investment, but bounding box annotated data alone are not...
| Main Authors: | Cathaoir Agnew, Eoin M. Grua, Pepijn Van de Ven, Patrick Denny, Ciarán Eising, Anthony Scanlan |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Elsevier, 2024-12-01 |
| Series: | Intelligent Systems with Applications |
| Subjects: | Computer vision, Instance segmentation, Object detection, Supervised learning, Weak annotations |
| Online Access: | http://www.sciencedirect.com/science/article/pii/S2667305324001285 |
| _version_ | 1850244820865384448 |
|---|---|
| author | Cathaoir Agnew; Eoin M. Grua; Pepijn Van de Ven; Patrick Denny; Ciarán Eising; Anthony Scanlan |
| author_sort | Cathaoir Agnew |
| collection | DOAJ |
| description | Annotating datasets for fully supervised instance segmentation can be arduous and time-consuming, requiring significant effort and cost. Producing bounding box annotations instead substantially reduces this investment, but bounding-box-annotated data alone are not suitable for instance segmentation. This work uses ground truth bounding boxes to define coarsely annotated polygon masks, which we refer to as weak annotations, on which the models are pretrained. We investigate the effect of pretraining on data with weak annotations and then fine-tuning on data with strong annotations, that is, finely annotated polygon masks for instance segmentation. Experiments were conducted on the COCO 2017 detection dataset with three model architectures: SOLOv2, Mask R-CNN, and Mask2Former. The Cityscapes and Pascal VOC 2012 datasets were used to validate the approach. The empirical results suggest two key outcomes. First, a sequential approach to annotating large-scale instance segmentation datasets would be beneficial, enabling higher-performance models in shorter timeframes: label bounding boxes on the data first, then polygon masks. Second, object detection datasets can be leveraged for pretraining instance segmentation models while maintaining competitive results in the downstream task. For the best-performing model, Mask2Former with a Swin-L backbone, 97.5%, 100.4%, and 101.3% of the fully supervised performance was achieved using just 1%, 5%, and 10%, respectively, of the instance segmentation annotations in the COCO training dataset. |
| format | Article |
| id | doaj-art-2dce2a4e8cb947569cc6144352c8d208 |
| institution | OA Journals |
| issn | 2667-3053 |
| language | English |
| publishDate | 2024-12-01 |
| publisher | Elsevier |
| record_format | Article |
| series | Intelligent Systems with Applications |
| spelling | doaj-art-2dce2a4e8cb947569cc6144352c8d208; 2025-08-20T01:59:39Z; eng; Elsevier; Intelligent Systems with Applications; ISSN 2667-3053; 2024-12-01; volume 24, article 200454; DOI 10.1016/j.iswa.2024.200454; Pretraining instance segmentation models with bounding box annotations; Cathaoir Agnew [0] (corresponding author), Eoin M. Grua [1], Pepijn Van de Ven [2], Patrick Denny [3], Ciarán Eising [4], Anthony Scanlan [5]. Affiliations, all at University of Limerick, Limerick, Ireland: [0, 1, 2, 4, 5] Data-Driven Intelligent Computer Engineering (D2iCE) Group; Department Electronic & Computer Engineering; CONFIRM Centre for Smart Manufacturing. [3] Data-Driven Intelligent Computer Engineering (D2iCE) Group; Department Computer Science & Information Systems. |
| title | Pretraining instance segmentation models with bounding box annotations |
| topic | Computer vision; Instance segmentation; Object detection; Supervised learning; Weak annotations |
| url | http://www.sciencedirect.com/science/article/pii/S2667305324001285 |
| work_keys_str_mv | AT cathaoiragnew pretraininginstancesegmentationmodelswithboundingboxannotations AT eoinmgrua pretraininginstancesegmentationmodelswithboundingboxannotations AT pepijnvandeven pretraininginstancesegmentationmodelswithboundingboxannotations AT patrickdenny pretraininginstancesegmentationmodelswithboundingboxannotations AT ciaraneising pretraininginstancesegmentationmodelswithboundingboxannotations AT anthonyscanlan pretraininginstancesegmentationmodelswithboundingboxannotations |
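
The description above says that ground truth bounding boxes are used to define coarse polygon masks (weak annotations). A minimal sketch of one plausible reading follows, assuming COCO-style annotations in which `bbox` is `[x, y, width, height]` and `segmentation` is a list of flat `[x1, y1, x2, y2, ...]` polygons. The assumption that the weak mask is simply the box rectangle, and the helper names `bbox_to_polygon` and `weaken_annotation`, are illustrative and not taken from the article, which may construct its weak masks differently.

```python
from typing import Dict, List


def bbox_to_polygon(bbox: List[float]) -> List[List[float]]:
    """Turn a COCO-style [x, y, width, height] box into a rectangular polygon."""
    x, y, w, h = bbox
    # One polygon with four vertices: top-left, top-right, bottom-right, bottom-left.
    return [[x, y, x + w, y, x + w, y + h, x, y + h]]


def weaken_annotation(ann: Dict) -> Dict:
    """Return a copy of an annotation whose mask is its bounding-box rectangle.

    Assumption: the weak mask is exactly the box; the input dict is not modified.
    """
    weak = dict(ann)
    weak["segmentation"] = bbox_to_polygon(ann["bbox"])
    _, _, w, h = ann["bbox"]
    weak["area"] = w * h  # the mask area now equals the box area
    return weak


# Hypothetical annotation, used only to illustrate the conversion.
example = {"id": 1, "category_id": 18, "bbox": [10.0, 20.0, 30.0, 40.0]}
weak = weaken_annotation(example)
print(weak["segmentation"])
```

Under this reading, any object detection dataset with box labels could be converted into a weakly annotated segmentation dataset for pretraining, with the finely annotated polygons reserved for fine-tuning.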