Large Language Models’ Trustworthiness in the Light of the EU AI Act—A Systematic Mapping Study
| Main Authors: | , , , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-07-01 |
| Series: | Applied Sciences |
| Subjects: | |
| Online Access: | https://www.mdpi.com/2076-3417/15/14/7640 |
| Summary: | The recent advancements and emergence of rapidly evolving AI models, such as large language models (LLMs), have sparked interest among researchers and professionals. These models are ubiquitously being fine-tuned and applied across various fields such as healthcare, customer service and support, education, automated driving, and smart factories. This often leads to an increased level of complexity and to challenges concerning the trustworthiness of these models, such as the generation of toxic content and hallucinations delivered with high confidence, which can have serious consequences. The European Union Artificial Intelligence Act (AI Act), a regulation concerning artificial intelligence, proposes a comprehensive set of guidelines to ensure the responsible development and usage of general-purpose AI systems (such as LLMs) that may pose potential risks. Strengthened efforts are needed to ensure that these high-performing LLMs adhere to the seven trustworthiness aspects recommended by the AI Act: data governance, record-keeping, transparency, human oversight, accuracy, robustness, and cybersecurity. Our study systematically maps research, focusing on identifying the key trends in developing LLMs across different application domains to address the aspects of AI Act-based trustworthiness. Our study reveals recent trends that indicate a growing interest in emerging models such as LLaMA and Bard, reflecting a shift in research priorities. GPT and BERT remain the most studied models, while newer alternatives like Mistral and Claude remain underexplored. Trustworthiness aspects like accuracy and transparency dominate the research landscape, while cybersecurity and record-keeping remain significantly underexamined. Our findings highlight the urgent need for a more balanced, interdisciplinary research approach to ensure LLM trustworthiness across diverse applications. Expanding studies into underexplored, high-risk domains and fostering cross-sector collaboration can bridge existing gaps. Furthermore, this study also reveals underrepresented domains (such as telecommunications), which present considerable research gaps and indicate a potential direction for future work. |
| ISSN: | 2076-3417 |