A multimodal transformer-based tool for automatic generation of concreteness ratings across languages

Bibliographic Details
Main Authors: Viktor Kewenig, Jeremy I. Skipper, Gabriella Vigliocco
Format: Article
Language: English
Published: Nature Portfolio 2025-07-01
Series: Communications Psychology
Online Access: https://doi.org/10.1038/s44271-025-00280-z
Description
Summary: We present an automated method for generating concreteness ratings that achieves beyond human-level reliability across multiple languages and expression types. Our approach combines multimodal transformers with emotion-finetuned language models and achieves correlations of 0.93 for single British words and 0.85 for multiword expressions with existing corpora of human raters. We demonstrate general applicability through successful cross-lingual generalization to an entirely unseen corpus of Estonian single- and multi-word expressions (N = 35,979), achieved via automated language detection and translation. By leveraging both visual and emotional information in context-aware language embeddings, our method effectively captures the full spectrum from concrete to abstract concepts. Our automated system offers a context-sensitive, reliable alternative to traditional human ratings, eliminating the need for time-consuming and costly human rating collection. We provide an easy-to-access web-based interface for researchers to use our tool at concreteness.eu.
ISSN: 2731-9121