Advancements in Large-Scale Image and Text Representation Learning: A Comprehensive Review and Outlook
Large-scale image and text representation learning is critical in determining the performance of multimodal tasks involving images and text, such as visual question answering and image captioning. Most existing research on large-scale image and text representation learning relies on Transformer netw...
| Main Authors: | Yang Qin, Shuxue Ding, Huiming Xie |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | IEEE, 2025-01-01 |
| Series: | IEEE Access |
| Subjects: | |
| Online Access: | https://ieeexplore.ieee.org/document/10883956/ |
Similar Items
- Application of Textual Representation Methods for Clinical Numerical Data in Early Sepsis Diagnosis
  by: Zhang Weimin, et al.
  Published: (2024-09-01)
- Tencent Text-Video Retrieval: Hierarchical Cross-Modal Interactions With Multi-Level Representations
  by: Jie Jiang, et al.
  Published: (2025-01-01)
- PuzText: Self-Supervised Learning of Permuted Texture Representation for Multilingual Text Recognition
  by: Minjun Lu, et al.
  Published: (2024-01-01)
- Enhancing Weibo Sentiment Analysis With Multi-Modal Learning: Integrating Text and Synthesized Images With Contrastive Learning
  by: Chuyang Wang, et al.
  Published: (2025-01-01)
- FLFT: A Large-Scale Pre-Training Model Distributed Fine-Tuning Method That Integrates Federated Learning Strategies
  by: Yu Tao, et al.
  Published: (2025-01-01)