The Promises and Pitfalls of Large Language Models as Feedback Providers: A Study of Prompt Engineering and the Quality of AI-Driven Feedback

Bibliographic Details
Main Authors: Lucas Jasper Jacobsen, Kira Elena Weber
Format: Article
Language: English
Published: MDPI AG 2025-02-01
Series: AI
Online Access: https://www.mdpi.com/2673-2688/6/2/35
Description
Summary: Background/Objectives: Artificial intelligence (AI) is transforming higher education (HE), reshaping teaching, learning, and feedback processes. Feedback generated by large language models (LLMs) has shown potential for enhancing student learning outcomes. However, few empirical studies have directly compared the quality of LLM feedback with feedback from novices and experts. This study investigates (1) the types of prompts needed to ensure high-quality LLM feedback in teacher education and (2) how feedback from novices, experts, and LLMs compares in terms of quality. Methods: To address these questions, we developed a theory-driven manual to evaluate prompt quality and designed three prompts of varying quality. Feedback generated by ChatGPT-4 was assessed alongside feedback from novices and experts, who were provided with the highest-quality prompt. Results: Our findings reveal that only the best prompt consistently produced high-quality feedback. Additionally, LLM feedback outperformed novice feedback and, in the categories of explanation, questions, and specificity, even surpassed expert feedback in quality while being generated more quickly. Conclusions: These results suggest that LLMs, when guided by well-crafted prompts, can serve as high-quality and efficient alternatives to expert feedback. The findings underscore the importance of prompt quality and emphasize the need for prompt design guidelines to maximize the potential of LLMs in teacher education.
ISSN: 2673-2688