Descending an inclined plane with a large language model

[This paper is part of the Focused Collection in Artificial Intelligence Tools in Physics Teaching and Physics Education Research.] We present a study in which a version of a common conservation of mechanical energy introductory physics problem, an object released on an inclined plane, is given to OpenAI’s GPT-4 large language model (LLM). We investigate how different permutations of object, action verb, and property of the incline impact the responses of the LLM. The problem setup and prompting were left purposefully minimal, requiring the LLM to state multiple assumptions to justify the final answer. We specifically studied how different keywords lead the LLM to analyze the system as rolling versus sliding and how this may differ from physics experts and novice learners. We found that domain-specific terminology may impact the LLM differently from students. Even for correct answers, the LLM generally did not specify the assumptions necessary to justify its solution, falling short of what would be expected from an expert instructor. When conflicting information was provided, the LLM generally did not indicate as much in its responses. Both issues are weaknesses that could be remedied by additional prompting; however, they remain shortcomings in the context of physics teaching. While specific to introductory physics, this study provides insight into how LLMs respond to variations of a problem within a specific topic area and how their strengths and weaknesses may differ from those of humans. Understanding these differences, and tracking them as LLMs change in their capabilities, is crucial for assessing the impact of artificial intelligence on education.

Bibliographic Details
Main Authors: Justin C. Dunlap, Ryan Sissons, Ralf Widenhorn
Format: Article
Language: English
Published: American Physical Society, 2025-05-01
Series: Physical Review Physics Education Research
ISSN: 2469-9896
Online Access: http://doi.org/10.1103/PhysRevPhysEducRes.21.010153