Advancements in Large-Scale Image and Text Representation Learning: A Comprehensive Review and Outlook

Large-scale image and text representation learning is critical in determining the performance of multimodal tasks involving images and text, such as visual question answering and image captioning. Most existing research on large-scale image and text representation learning relies on Transformer netw...

Bibliographic Details
Main Authors:	Yang Qin, Shuxue Ding, Huiming Xie
Format:	Article
Language:	English
Published:	IEEE 2025-01-01
Series:	IEEE Access
Subjects:	Image-and-text; large-scale representation learning; pre-training; transformer; self-supervised learning
Online Access:	https://ieeexplore.ieee.org/document/10883956/

Similar Items

Application of Textual Representation Methods for Clinical Numerical Data in Early Sepsis Diagnosis
by: Zhang Weimin, et al.
Published: (2024-09-01)

Tencent Text-Video Retrieval: Hierarchical Cross-Modal Interactions With Multi-Level Representations
by: Jie Jiang, et al.
Published: (2025-01-01)

PuzText: Self-Supervised Learning of Permuted Texture Representation for Multilingual Text Recognition
by: Minjun Lu, et al.
Published: (2024-01-01)

Enhancing Weibo Sentiment Analysis With Multi-Modal Learning: Integrating Text and Synthesized Images With Contrastive Learning
by: Chuyang Wang, et al.
Published: (2025-01-01)

FLFT: A Large-Scale Pre-Training Model Distributed Fine-Tuning Method That Integrates Federated Learning Strategies
by: Yu Tao, et al.
Published: (2025-01-01)

Benchmarking pre-trained text embedding models in aligning built asset information
by: Mehrzad Shahinmoghadam, et al.
Published: (2025-07-01)

Graph-LLM fusion: enhancing fact representation and logical reasoning in artificial intelligence systems
by: YANG Juan, et al.
Published: (2025-01-01)

Learning Self-Supervised Representations of Powder-Diffraction Patterns
by: Shubhayu Das, et al.
Published: (2025-04-01)

TSTBench: A Comprehensive Benchmark for Text Style Transfer
by: Yifei Xie, et al.
Published: (2025-05-01)

The Effect of Various Text Representation Methods for Sentiment Analysis on Movie Review Data with Different Machine Learning Methods
by: Veysel Göç, et al.
Published: (2024-12-01)

Beyond contrastive learning: adaptive graph representations with mutual information maximization for blockchain and structured data
by: Yifeng Zhang, et al.
Published: (2025-08-01)

MoCoUTRL: a momentum contrastive framework for unsupervised text representation learning
by: Ao Zou, et al.
Published: (2023-12-01)

Representations of the hero character in children's theater texts
by: Venus Hamid Mohammed Jawad
Published: (2025-03-01)

PCVR: a pre-trained contextualized visual representation for DNA sequence classification
by: Jiarui Zhou, et al.
Published: (2025-05-01)

Self-supervised speech representation learning based on positive sample comparison and masking reconstruction
by: Wenlin ZHANG, et al.
Published: (2022-07-01)

Contextual Fine-Tuning of Language Models with Classifier-Driven Content Moderation for Text Generation
by: Matan Punnaivanam, et al.
Published: (2024-12-01)

LoRA-Adv: Boosting Text Classification in Large Language Models Through Adversarial Low-Rank Adaptations
by: Hong Ye, et al.
Published: (2025-01-01)

Bone tumor recognition strategy based on object region and context representation in medical decision-making system
by: Yueguang Liu, et al.
Published: (2025-03-01)

Study of the Application of Text Augmentation with Paraphrasing to Overcome Imbalanced Data in Indonesian Text Classification
by: Mutiara Indryan Sari, et al.
Published: (2025-04-01)

SegRep: Mask-Supervised Learning for Segment Representation in Pathology Images
by: Chichun Yang, et al.
Published: (2024-01-01)

Interval Prediction of Landslide Displacement Using a Pretrained Large Time Series Model Based on Large-scale Cross-domain Data
by: Xuhuang Du, et al.
Published: (2025-05-01)

Aircraft Trajectory Segmentation-Based Contrastive Coding: A Framework for Self-Supervised Trajectory Representation
by: Thaweerath Phisannupawong, et al.
Published: (2025-01-01)

Which Augmentation Should I Use? An Empirical Investigation of Augmentations for Self-Supervised Phonocardiogram Representation Learning
by: Aristotelis Ballas, et al.
Published: (2024-01-01)

Identification of Scientific Texts Generated by Large Language Models Using Machine Learning
by: David Soto-Osorio, et al.
Published: (2024-12-01)

Mastitis Classification in Dairy Cows Using Weakly Supervised Representation Learning
by: Soo-Hyun Cho, et al.
Published: (2024-11-01)

The Effectiveness of Large Language Models in Transforming Unstructured Text to Standardized Formats
by: William Brach, et al.
Published: (2025-01-01)

Multi-modal representation learning in retinal imaging using self-supervised learning for enhanced clinical predictions
by: Emese Sükei, et al.
Published: (2024-11-01)

New method of text representation model based on neural network
by: Shui-fei ZENG, et al.
Published: (2017-04-01)

Research on Aerospace Text Classification Based on BERT-LSTM Model
by: AN Rui, et al.
Published: (2024-08-01)

Exploring graph representation strategies for text classification
by: Henrique Varella Ehrenfried, et al.
Published: (2023-12-01)

Text Geolocation Prediction via Self-Supervised Learning
by: Yuxing Wu, et al.
Published: (2025-04-01)

Spatiotemporal masked pre-training for advancing crop mapping on satellite image time series with limited labels
by: Xiaolei Qin, et al.
Published: (2025-03-01)

Enhancing TextGCN for depression detection on social media with emotion representation
by: Huimin Mao, et al.
Published: (2025-08-01)

A Hybrid Semantic Representation Method Based on Fusion Conceptual Knowledge and Weighted Word Embeddings for English Texts
by: Zan Qiu, et al.
Published: (2024-11-01)

Transfer Learning for Photovoltaic Power Forecasting Across Regions Using Large-Scale Datasets
by: Seongho Bak, et al.
Published: (2025-01-01)

Abstractive Summarization of Historical Documents: A New Dataset and Novel Method Using a Domain-Specific Pretrained Model
by: Keerthana Murugaraj, et al.
Published: (2025-01-01)

Text Classification: How Machine Learning Is Revolutionizing Text Categorization
by: Hesham Allam, et al.
Published: (2025-02-01)

Temporal and spatial self supervised learning methods for electrocardiograms
by: Wenping Chen, et al.
Published: (2025-02-01)