Attention is All Large Language Model Need

With the advent of the Transformer, the attention mechanism has been applied to Large Language Models (LLMs), which have evolved from the initial single-modal large models into today's multi-modal large models. This has greatly propelled the development of Artificial Intelligence (AI) and ushered humanity into the era of large models. Single-modal large models can be broadly categorized into three types by application domain: Text LLMs for Natural Language Processing (NLP), Image LLMs for Computer Vision (CV), and Audio LLMs for speech interaction. Multi-modal large models, on the other hand, can leverage multiple data sources simultaneously to optimize the model. The article also introduces the training process of the GPT series. Large models have had a significant impact on industry and society, and they bring with them a number of unresolved problems. The purpose of the article is to help researchers understand the various forms of LLMs, as well as their development, pre-training architectures, difficulties, and future objectives.
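For orientation, the following is a minimal sketch of the scaled dot-product attention that the Transformer introduced and that the abstract refers to; the function name, shapes, and example values are illustrative and are not taken from the article itself.

```python
# Minimal sketch of scaled dot-product attention:
#   Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
# Names and shapes here are illustrative, not drawn from the article.
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Q, K: (seq_len, d_k); V: (seq_len, d_v). Returns (seq_len, d_v)."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                  # scaled pairwise similarities
    scores -= scores.max(axis=-1, keepdims=True)     # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ V                               # weighted sum of values

# Example: 4 tokens with 8-dimensional keys/values
rng = np.random.default_rng(0)
Q = rng.standard_normal((4, 8))
K = rng.standard_normal((4, 8))
V = rng.standard_normal((4, 8))
print(scaled_dot_product_attention(Q, K, V).shape)   # (4, 8)
```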

Bibliographic Details
Main Author: Liu Yuxin (School of Software, Shanxi Agricultural University)
Format: Article
Language: English
Published: EDP Sciences, 2025-01-01
Series: ITM Web of Conferences, Vol. 73 (2025), Art. 02025
ISSN: 2271-2097
DOI: 10.1051/itmconf/20257302025
Collection: DOAJ
Online Access: https://www.itm-conferences.org/articles/itmconf/pdf/2025/04/itmconf_iwadi2024_02025.pdf