Attention is All Large Language Model Need
| Main Author: | |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | EDP Sciences, 2025-01-01 |
| Series: | ITM Web of Conferences |
| Online Access: | https://www.itm-conferences.org/articles/itmconf/pdf/2025/04/itmconf_iwadi2024_02025.pdf |
| Summary: | With the advent of the Transformer, the attention mechanism has been applied to Large Language Models (LLMs), which have evolved from the initial single-modal large models to today's multi-modal large models. This has greatly propelled the development of Artificial Intelligence (AI) and ushered humans into the era of large models. Single-modal large models can be broadly categorized into three types based on their application domains: Text LLMs for Natural Language Processing (NLP), Image LLMs for Computer Vision (CV), and Audio LLMs for speech interaction. Multi-modal large models, on the other hand, can leverage multiple data sources simultaneously to optimize the model. This article also introduces the training process of the GPT series. Large models have also had a significant impact on industry and society, bringing with them a number of unresolved problems. The purpose of this article is to help researchers understand the various forms of LLMs, as well as their development, pre-training architecture, difficulties, and future objectives. |
| ISSN: | 2271-2097 |