Moving LLM evaluation forward: lessons from human judgment research
This paper outlines a path toward more reliable and effective evaluation of Large Language Models (LLMs). It argues that insights from the study of human judgment and decision-making can illuminate current challenges in LLM assessment and help close critical gaps in how models are evaluated. By draw...
| Main Author: | Andrea Polonioli |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Frontiers Media S.A., 2025-05-01 |
| Series: | Frontiers in Artificial Intelligence |
| Online Access: | https://www.frontiersin.org/articles/10.3389/frai.2025.1592399/full |
Similar Items
- The role of generative AI in writing doctoral dissertation: perceived opportunities, challenges, and facilitating strategies to promote human agency
  by: Muhammad Shaban Rafi, et al.
  Published: (2025-06-01)
- Generative AI in Healthcare: Insights from Health Professions Educators and Students
  by: Chaoyan Dong, et al.
  Published: (2025-04-01)
- AI: An *Active* and *Innovative* Tool for Artistic Creation
  by: Charis Avlonitou, et al.
  Published: (2025-05-01)
- LLM Hallucination: The Curse That Cannot Be Broken
  by: Hussein Al-Mahmood
  Published: (2025-08-01)
- AI for social good
  by: Philip Treleaven, et al.
  Published: (2025-06-01)