From Data to Deployment: A Comprehensive Analysis of Risks in Large Language Model Research and Development
| Main Authors: | , , , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Wiley 2025-01-01 |
| Series: | IET Information Security |
| Online Access: | http://dx.doi.org/10.1049/ise2/7358963 |
| Summary: | Large language models (LLMs) have evolved significantly, achieving unprecedented linguistic capabilities that underpin a wide range of AI applications. However, they also pose risks and challenges such as ethical concerns, bias and computational sustainability. Balancing their high performance in revolutionising information processing against the risks they pose is critical to their future development. An LLM is a type of NLP model, and many LLM risks are risks that NLP has already experienced. We therefore summarise these risks, focusing on the underlying understanding of the risks and the associated technical tools rather than simply describing their occurrence in LLMs. In this paper, we first discuss and compare the current state of research on the four main risk areas in the LLM development process: data, system, pretraining and inference. We then summarise the rationale, complexity, prospects and challenges of the key issues in each phase. Finally, the review discusses the fundamental issues that carry the most risk and should be addressed in the early stages of modelling research, including the correlated issues of privacy preservation, countering attacks and model robustness. Taking the perspective of the LLM research and development (R&D) process, this review summarises the actual risks and provides guidance on research directions, with the aim of helping researchers identify risk points and technology directions worth investigating and establish a safe and efficient R&D process. |
|---|---|
| ISSN: | 1751-8717 |