GLEM: a global–local enhancement method for fine-grained image recognition with attention erasure and multi-view cropping


Bibliographic Details
Main Authors: Chenglong Zhou, Damin Zhang, Qing He, MingFang Li, MingRong Li, Xiaobo Zhou
Format: Article
Language: English
Published: Springer 2025-07-01
Series: Journal of King Saud University: Computer and Information Sciences
Online Access:https://doi.org/10.1007/s44443-025-00120-4
Description
Summary: Abstract Fine-grained image recognition (FGIR) aims to distinguish subcategories of visual objects that differ only in subtle details. Because inter-category features are highly similar in FGIR tasks, models require stronger discriminative capability. Existing methods focus mainly on learning prominent visual patterns and often neglect other potentially informative features, making it difficult for a model to fully distinguish subtle differences in both the global and local features of objects, which limits FGIR performance. This work proposes a Global–Local Enhanced Module (GLEM) that integrates global and local features to address these issues. Built on channel-aware attention mechanisms, GLEM explores new feature details through adaptive erasure and dynamic fusion strategies, preventing the model from focusing excessively on prominent regions. GLEM also uses multi-view cropping to capture subtle differences between global and local features. Extensive experiments on three FGIR benchmark datasets demonstrate that the proposed GLEM achieves state-of-the-art performance.
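The abstract describes two of GLEM's core ideas: erasing the most salient region of an attention map so the model is forced to discover secondary cues, and cropping a local view around the attention peak alongside the global view. A minimal NumPy sketch of these two mechanisms is shown below; it is an illustration of the general techniques, not the authors' implementation, and the function names, `threshold`, and `crop_frac` parameters are assumptions.

```python
import numpy as np

def attention_erase(feature_map, threshold=0.7):
    """Hypothetical sketch of attention erasure: suppress the most
    salient spatial positions so secondary cues must be learned.
    feature_map: (C, H, W) array of channel activations."""
    # Channel-aware attention: average activations over channels -> (H, W)
    attn = feature_map.mean(axis=0)
    # Normalize attention to [0, 1]
    attn = (attn - attn.min()) / (attn.max() - attn.min() + 1e-8)
    # Zero out positions whose attention exceeds the threshold
    mask = (attn < threshold).astype(feature_map.dtype)
    return feature_map * mask

def multi_view_crops(image, attn, crop_frac=0.5):
    """Hypothetical sketch of multi-view cropping: return the global
    view plus a local crop centered on the attention peak.
    image: (H, W, C) array; attn: (H, W) attention map on the same grid."""
    H, W = attn.shape
    # Locate the attention peak
    cy, cx = np.unravel_index(np.argmax(attn), attn.shape)
    ch, cw = int(H * crop_frac), int(W * crop_frac)
    # Clamp the crop window so it stays inside the image
    y0 = min(max(cy - ch // 2, 0), H - ch)
    x0 = min(max(cx - cw // 2, 0), W - cw)
    local = image[y0:y0 + ch, x0:x0 + cw]
    return image, local  # global view, local view
```

In a full pipeline, the erased feature map would be fed back through the classifier (the "dynamic fusion" the abstract mentions), and the global and local views would each pass through the backbone before their predictions are combined.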
ISSN:1319-1578
2213-1248