Large Language Models’ Trustworthiness in the Light of the EU AI Act—A Systematic Mapping Study
| Main Authors: | Md Masum Billah, Harry Setiawan Hamjaya, Hakima Shiralizade, Vandita Singh, Rafia Inam |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-07-01 |
| Series: | Applied Sciences |
| Subjects: | large language models (LLMs); trustworthiness; EU AI Act; GPT; BERT; LLaMa |
| Online Access: | https://www.mdpi.com/2076-3417/15/14/7640 |
| author | Md Masum Billah, Harry Setiawan Hamjaya, Hakima Shiralizade, Vandita Singh, Rafia Inam |
|---|---|
| collection | DOAJ |
| description | The recent advancements and emergence of rapidly evolving AI models, such as large language models (LLMs), have sparked interest among researchers and professionals. These models are ubiquitously being fine-tuned and applied across various fields such as healthcare, customer service and support, education, automated driving, and smart factories. This often leads to increased complexity and challenges concerning the trustworthiness of these models, such as the generation of toxic content and hallucinations delivered with high confidence, which can lead to serious consequences. The European Union Artificial Intelligence Act (EU AI Act) has proposed a comprehensive set of guidelines to ensure the responsible development and usage of general-purpose AI systems (such as LLMs) that may pose potential risks. Strengthened efforts are needed to ensure that these high-performing LLMs adhere to the seven trustworthiness aspects recommended by the AI Act: data governance, record-keeping, transparency, human oversight, accuracy, robustness, and cybersecurity. Our study systematically maps research, focusing on identifying the key trends in developing LLMs across different application domains to address the aspects of AI Act-based trustworthiness. It reveals recent trends indicating growing interest in emerging models such as LLaMa and BARD, reflecting a shift in research priorities. GPT and BERT remain the most studied models, while newer alternatives like Mistral and Claude remain underexplored. Trustworthiness aspects like accuracy and transparency dominate the research landscape, while cybersecurity and record-keeping remain significantly underexamined. Our findings highlight the urgent need for a more balanced, interdisciplinary research approach to ensure LLM trustworthiness across diverse applications. Expanding studies into underexplored, high-risk domains and fostering cross-sector collaboration can bridge existing gaps. Furthermore, this study also reveals underrepresented domains (such as telecommunication), presenting considerable research gaps and indicating a potential direction for future work. |
| format | Article |
| id | doaj-art-475d71b7d2ce4ef59da9ee0abc35709a |
| institution | DOAJ |
| issn | 2076-3417 |
| language | English |
| publishDate | 2025-07-01 |
| publisher | MDPI AG |
| record_format | Article |
| series | Applied Sciences |
| spelling | doaj-art-475d71b7d2ce4ef59da9ee0abc35709a; indexed 2025-08-20T03:13:36Z; eng; MDPI AG; Applied Sciences, ISSN 2076-3417, 2025-07-01, vol. 15, no. 14, art. 7640; doi:10.3390/app15147640; authors: Md Masum Billah, Harry Setiawan Hamjaya, Hakima Shiralizade, Vandita Singh, Rafia Inam (all affiliated with Ericsson Research, Trustworthy AI, 16483 Stockholm, Sweden); https://www.mdpi.com/2076-3417/15/14/7640 |
| title | Large Language Models’ Trustworthiness in the Light of the EU AI Act—A Systematic Mapping Study |
| topic | large language models (LLMs); trustworthiness; EU AI Act; GPT; BERT; LLaMa |
| url | https://www.mdpi.com/2076-3417/15/14/7640 |