Large Language Models for Automated Grading and Synthetic Data Generation in Communication-Based Training Assessment
| Main Authors: | , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | LibraryPress@UF, 2025-05-01 |
| Series: | Proceedings of the International Florida Artificial Intelligence Research Society Conference |
| Online Access: | https://journals.flvc.org/FLAIRS/article/view/138876 |
| Summary: | Effective communication is critical in high-stakes tasks, particularly in scenarios requiring precision and coordination under time pressure. Here, we explore the potential of large language models (LLMs) to evaluate communication performance and generate synthetic conversation data for training and assessment purposes. We present a proof-of-concept study focused on a highly structured task: the interaction between a forward observer and a fire direction center during a call-for-fire mission. Using a rubric-based approach, the LLM graded transcripts of forward observer communications, distinguishing between varying levels of trainee performance with high reliability and alignment with expected outcomes. Additionally, we demonstrate the utility of LLMs in generating synthetic transcripts that simulate varying performance levels. While this study centers on the call for fire, the approach has broader implications for training assessment in complex, communication-intensive tasks. Our results suggest that LLMs can serve as effective tools for both grading and data generation, enabling scalable solutions for improving performance in high-stakes domains. |
| ISSN: | 2334-0754; 2334-0762 |