Bilinear Sequence Regression: A Model for Learning from Long Sequences of High-Dimensional Tokens

Current progress in artificial intelligence is centered around so-called large language models that consist of neural networks processing long sequences of high-dimensional vectors called tokens. Statistical physics provides powerful tools to study the functioning of learning with neural networks an...

Full description

Saved in:
Bibliographic Details
Main Authors: Vittorio Erba, Emanuele Troiani, Luca Biggio, Antoine Maillard, Lenka Zdeborová
Format: Article
Language:English
Published: American Physical Society 2025-06-01
Series:Physical Review X
Online Access:http://doi.org/10.1103/l4p2-vrxt
Tags: Add Tag
No Tags, Be the first to tag this record!