LARE: Latent augmentation using regional embedding with vision-language model

In recent years, considerable research has been conducted on vision-language models (VLMs) that handle both image and text data; these models are being applied to diverse downstream tasks, such as “image-related chat,” “image recognition by instruction,” and “answering visual questions.” Vision-lang...

Bibliographic Details
Main Authors:	Kosuke Sakurai, Tatsuya Ishii, Ryotaro Shimizu, Linxin Song, Masayuki Goto
Format:	Article
Language:	English
Published:	Elsevier 2025-06-01
Series:	Machine Learning with Applications
Subjects:	Image classification; Data augmentation; Domain adaptation; Regional embedding; Vision-language model
Online Access:	http://www.sciencedirect.com/science/article/pii/S2666827025000544

Similar Items