Large Language Models’ Trustworthiness in the Light of the EU AI Act—A Systematic Mapping Study

The recent advancements and emergence of rapidly evolving AI models, such as large language models (LLMs), have sparked interest among researchers and professionals. These models are being fine-tuned and applied across various fields such as healthcare, customer service and support, education, automated driving, and smart factories. This often leads to increased complexity and challenges concerning the trustworthiness of these models, such as the generation of toxic content and high-confidence hallucinations that can lead to serious consequences. The European Union Artificial Intelligence Act (EU AI Act) proposes a comprehensive set of guidelines to ensure the responsible development and usage of general-purpose AI systems (such as LLMs) that may pose potential risks. Strengthened efforts are therefore needed to ensure that these high-performing LLMs adhere to the seven trustworthiness aspects recommended by the AI Act: data governance, record-keeping, transparency, human oversight, accuracy, robustness, and cybersecurity. Our study systematically maps research on the development of LLMs across different application domains with respect to these AI Act-based trustworthiness aspects. The results reveal recent trends indicating growing interest in emerging models such as LLaMa and BARD, reflecting a shift in research priorities; GPT and BERT remain the most studied models, while newer alternatives like Mistral and Claude remain underexplored. Trustworthiness aspects like accuracy and transparency dominate the research landscape, while cybersecurity and record-keeping remain significantly underexamined. Our findings highlight the urgent need for a more balanced, interdisciplinary research approach to ensure LLM trustworthiness across diverse applications. Expanding studies into underexplored, high-risk domains and fostering cross-sector collaboration can bridge existing gaps. Furthermore, this study also reveals underrepresented domains (such as telecommunication), presenting considerable research gaps and indicating a potential direction for the way forward.

Bibliographic Details
Main Authors: Md Masum Billah, Harry Setiawan Hamjaya, Hakima Shiralizade, Vandita Singh, Rafia Inam
Author Affiliation: Ericsson Research, Trustworthy AI, 16483 Stockholm, Sweden (all authors)
Format: Article
Language: English
Published: MDPI AG, 2025-07-01
Series: Applied Sciences, Vol. 15, Issue 14, Article 7640
ISSN: 2076-3417
DOI: 10.3390/app15147640
Subjects: large language models (LLMs); trustworthiness; EU AI Act; GPT; BERT; LLaMa
Collection: DOAJ
Online Access: https://www.mdpi.com/2076-3417/15/14/7640