Toward general object search in open reality
| Main Authors: | , , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Nature Portfolio, 2025-04-01 |
| Series: | Scientific Reports |
| Online Access: | https://doi.org/10.1038/s41598-025-97251-5 |
| Summary: | Abstract Real-world scenarios are inherently dynamic and open-ended, necessitating that current deep models adapt to general objects in open realities to be practically useful. In this paper, we extend a valuable computer vision task called General Object Search in Open Reality (GOSO). The main objective of GOSO is to determine whether an object from the open world appears in another gallery image, even when composed of arbitrary entities and backgrounds. However, two significant challenges arise: the high scale variance among different instances of the same entity and the vast openness with an ever-expanding set of unknown categories in the open world. To address these issues, we formalize the GOSO problem and propose a simple yet effective architecture named Siamese Exchanged Attention Network (SEA-Net). Specifically, based on a standard siamese structure, SEA-Net introduces a novel branch that comprises multiple stage-stacked Siamese Exchanged Attention (SEA) layers followed by a Hierarchical Feature Fusion (HFF) module, enabling efficient scale adaptation and the extraction of matching-friendly deep features. Moreover, an Open Score Fusion (OSF) module is integrated into SEA-Net during inference to yield a more robust matching score in open-world scenarios. We construct two new evaluation benchmarks suitable for the GOSO task using the existing COCO and LVIS datasets, and extensive experiments consistently demonstrate the effectiveness of the proposed method. |
| ISSN: | 2045-2322 |
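The summary describes a siamese pipeline: query and gallery images are embedded into matching-friendly features, compared, and the raw matching score is then adjusted by an Open Score Fusion (OSF) module to stay robust against unknown open-world categories. The record gives no formulas or implementation details, so the sketch below is purely illustrative: it uses plain cosine similarity for the siamese comparison, and a hypothetical fusion rule that mixes the best gallery match with its margin over the runner-up (a common open-set heuristic). The function names, the margin idea, and the `alpha` weight are all assumptions, not the paper's actual OSF design.

```python
import math

def cosine(u, v):
    # Cosine similarity between two feature vectors
    # (stand-in for comparing siamese-branch embeddings).
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def fused_open_score(query_feat, gallery_feats, alpha=0.5):
    """Hypothetical open-score fusion (NOT the paper's OSF module):
    combine the best gallery match (closed-set evidence) with the
    margin over the runner-up, so a query that matches everything
    weakly -- typical of unseen open-world clutter -- scores low."""
    sims = sorted((cosine(query_feat, g) for g in gallery_feats),
                  reverse=True)
    best = sims[0]
    margin = best - (sims[1] if len(sims) > 1 else 0.0)
    # alpha is an assumed mixing weight between match strength and margin.
    return alpha * best + (1 - alpha) * margin
```

For example, a query feature `[1.0, 0.0]` against a gallery containing `[1.0, 0.0]` and `[0.0, 1.0]` matches the first entry perfectly with a large margin, so both terms of the fused score are high; a query equally similar to every gallery entry would keep a high `best` but a near-zero `margin`, pulling the fused score down.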