Apvit: ViT with adaptive patches for scene text recognition
Abstract: Scene texts in nature exhibit varied colors, which serve as a significant distinguishing feature that effectively suppresses background interference. In this study, color clustering is utilized as a prior guide to group patches, enhancing their spatial relationships. Additionally, patch siz...
| Main Authors: | Ning Zhang, Ce Li, Zongshun Wang, Jialin Ma, Zhiqiang Feng |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Springer, 2025-03-01 |
| Series: | Discover Applied Sciences |
| Subjects: | |
| Online Access: | https://doi.org/10.1007/s42452-025-06570-9 |
Similar Items
- Interpretable Deep Learning for Diabetic Retinopathy: A Comparative Study of CNN, ViT, and Hybrid Architectures
  by: Weijie Zhang, et al.
  Published: (2025-05-01)
- A New Pes Planus Automatic Diagnosis Method: ViT-OELM Hybrid Modeling
  by: Derya Avcı
  Published: (2025-03-01)
- Turkish scene text recognition: Introducing extensive real and synthetic datasets and a novel recognition model
  by: Serdar Yıldız
  Published: (2024-12-01)
- Tumor ViT-GRU-XAI: Advanced Brain Tumor Diagnosis Framework: Vision Transformer and GRU Integration for Improved MRI Analysis: A Case Study of Egypt
  by: Mohammed Aly, et al.
  Published: (2024-01-01)
- ViT-RoT: Vision Transformer-Based Robust Framework for Tomato Leaf Disease Recognition
  by: Sathiyamohan Nishankar, et al.
  Published: (2025-06-01)