State of What Art? A Call for Multi-Prompt LLM Evaluation

Bibliographic Details
Main Authors:	Moran Mizrahi, Guy Kaplan, Dan Malkin, Rotem Dror, Dafna Shahaf, Gabriel Stanovsky
Format:	Article
Language:	English
Published:	The MIT Press 2024-08-01
Series:	Transactions of the Association for Computational Linguistics
Online Access:	http://dx.doi.org/10.1162/tacl_a_00681

Similar Items

Mapping Machine Learning Trends in Chemistry Research using LLM with Multi-Turn Prompting
by: Andreo Yudertha, et al.
Published: (2025-03-01)

From Prompts to Motors: Man-in-the-Middle Attacks on LLM-Enabled Vacuum Robots
by: Asif Shaikh, et al.
Published: (2025-01-01)

Use me wisely: AI-driven assessment for LLM prompting skills development
by: Dimitri Ognibene, Gregor Donabauer, Emily Theophilou, Cansu Koyuturk, Mona Yavari, Sathya Bursic, Alessia Telari, Alessia Testa, Raffaele Boiano, Davide Taibi, Davinia Hernandez-Leo, Udo Kruschwitz and Martin Ruskov
Published: (2025-07-01)

LPITutor: an LLM based personalized intelligent tutoring system using RAG and prompt engineering
by: Zhensheng Liu, et al.
Published: (2025-08-01)

Efficient Prompt Optimization for Relevance Evaluation via LLM-Based Confusion Matrix Feedback
by: Jaekeol Choi
Published: (2025-05-01)

LLM4WM: Adapting LLM for Wireless Multi-Tasking
by: Xuanyu Liu, et al.
Published: (2025-01-01)

The art of audience engagement: LLM-based thin-slicing of scientific talks
by: Ralf Schmälzle, et al.
Published: (2025-08-01)

SONGEL, F. (2021). El arte de leer las calles. Valencia: Barlin Libros
by: Eduardo Torres Morán
Published: (2023-09-01)

What is this thing called dialetheism?
by: Abilio Rodrigues
Published: (2025-07-01)

What is this thing called gamification?
by: Jocimario Alves Pereira, et al.
Published: (2025-03-01)

What Is This Thing Called Mentoring?
by: Darío Rodríguez
Published: (2025-01-01)

Rationale for WHO's new position calling for prompt reporting and public disclosure of interventional clinical trial results.
by: Vasee S Moorthy, et al.
Published: (2015-04-01)

Mitigating LLM Hallucinations Using a Multi-Agent Framework
by: Ahmed M. Darwish, et al.
Published: (2025-06-01)

A New, Robust, Adaptive, Versatile, and Scalable Abandoned Object Detection Approach Based on DeepSORT Dynamic Prompts, and Customized LLM for Smart Video Surveillance
by: Merve Yilmazer, et al.
Published: (2025-03-01)

Unspeakably more depends on what things are called than on what they are
by: Ian Hacking
Published: (2021-06-01)

Chaotic LLM billiards
by: David Berenstein, et al.
Published: (2024-08-01)

LLM-Driven Social Influence for Cooperative Behavior in Multi-Agent Systems
by: J. de Curto, et al.
Published: (2025-01-01)

An LLM-guided platform for multi-granular collection and management of data provenance
by: Luca Gregori, et al.
Published: (2025-07-01)

A Multi-agent System Based On LLM For Trading Financial Assets
by: Simona-Vasilica Oprea, et al.
Published: (2025-02-01)

Enhancing LLM Reasoning Capabilities Through Brokered Multi-Expert Reflection
by: Tejasvee Sheokand, et al.
Published: (2025-01-01)

Controlled Release of Hydrophilic Active Agent from Textile Using Crosslinked Polyvinyl Alcohol Coatings
by: Limor Mizrahi, et al.
Published: (2025-06-01)

The fine art of fine-tuning: A structured review of advanced LLM fine-tuning techniques
by: Samar Pratap, et al.
Published: (2025-06-01)

Multi-Level Foreground Prompt for Incremental Object Detection
by: Jianwen Mo, et al.
Published: (2025-01-01)

PolyLLM: polypharmacy side effect prediction via LLM-based SMILES encodings
by: Sadra Hakim, et al.
Published: (2025-07-01)

BdSentiLLM: A Novel LLM Approach to Sentiment Analysis of Product Reviews
by: Atia Shahnaz Ipa, et al.
Published: (2024-01-01)

LLM-AIDSim: LLM-Enhanced Agent-Based Influence Diffusion Simulation in Social Networks
by: Lan Zhang, et al.
Published: (2025-01-01)

An enlightened aristocrat at the crossroads of countries and languages
by: Jaroslav Stanovský
Published: (2024-09-01)

Exploring LLM-powered multi-session human-robot interactions with university students
by: Mauliana Mauliana, et al.
Published: (2025-06-01)

LLM referential chain generation.
by: Anna-Maria De Cesare
Published: (2025-06-01)

LLM technologies and information search
by: Lin Liu, et al.
Published: (2024-11-01)

Reuse Prompts
by: Andrea Francke, et al.
Published: (2025-06-01)

What do we call pedagogy? Pessimistic essay
by: Aleksander Nalaskowski
Published: (2020-08-01)

Biometric privacy protection: What is this thing called privacy?
by: Emilio Mordini
Published: (2023-07-01)

Audio and linguistic prediction of objective and subjective cognition in older adults: what is the role of different prompts?
by: Varsha D. Badal, et al.
Published: (2025-07-01)

Performance in Law School: What Matters in the Beginning?
by: Wendy Larcombe, et al.
Published: (2008-01-01)

Fostering collective intelligence in CPSS: an LLM-driven multi-agent cooperative tuning framework
by: Rongjun Chen, et al.
Published: (2025-06-01)

A model of ensuring LLM cybersecurity
by: Oleksii Neretin, et al.
Published: (2025-05-01)

LLM Hallucination: The Curse That Cannot Be Broken
by: Hussein Al-Mahmood
Published: (2025-08-01)

To what extent are call combinations in chimpanzees comparable to syntax in humans?
by: Maël Leroux
Published: (2023-12-01)

You Reap What You Sow… A Call to Action
by: Manisha M. Khorate
Published: (2024-12-01)