Descending an inclined plane with a large language model
[This paper is part of the Focused Collection in Artificial Intelligence Tools in Physics Teaching and Physics Education Research.] We present a study in which a version of a common conservation-of-mechanical-energy introductory physics problem, an object released on an inclined plane, is given to OpenAI’s GPT-4 large language model (LLM). We investigate how different permutations of the object, the action verb, and the property of the incline affect the LLM’s responses. The problem setup and prompting were left purposefully minimal, requiring the LLM to state multiple assumptions to justify its final answer. We specifically studied how different keywords lead the LLM to analyze the system as rolling versus sliding, and how this may differ from physics experts and novice learners. We found that domain-specific terminology may affect the LLM differently than it affects students. Even when its answers were correct, the LLM generally did not state the assumptions needed to justify them, falling short of what would be expected from an expert instructor. When conflicting information was provided, the LLM generally did not indicate as much in its responses. Both weaknesses could be remedied by additional prompting; nevertheless, they remain shortcomings in the context of physics teaching. While specific to introductory physics, this study provides insight into how LLMs respond to variations of a problem within a specific topic area and how their strengths and weaknesses may differ from those of humans. Understanding these differences, and tracking them as LLM capabilities change, is crucial for assessing the impact of artificial intelligence on education.
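For context, the rolling-versus-sliding distinction the abstract refers to changes the energy balance of the standard released-on-an-incline problem. The following is a minimal illustrative sketch of that textbook analysis, not material from the record itself:

```latex
% Object released from rest at height h on an incline; two idealized cases.
\[
\text{sliding (frictionless):}\quad
mgh = \tfrac{1}{2}mv^{2}
\;\Rightarrow\;
v = \sqrt{2gh}
\]
\[
\text{rolling without slipping } (I = c\,m r^{2},\ \omega = v/r):\quad
mgh = \tfrac{1}{2}mv^{2} + \tfrac{1}{2}I\omega^{2}
\;\Rightarrow\;
v = \sqrt{\frac{2gh}{1+c}}
\]
% Example: a uniform solid sphere has c = 2/5, giving
% v = sqrt(10gh/7), which is less than sqrt(2gh) because some
% potential energy goes into rotational kinetic energy.
```

Which case applies depends on unstated assumptions about friction and the object's shape, which is exactly the ambiguity the study's minimal prompts exploit.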
| Main Authors: | Justin C. Dunlap, Ryan Sissons, Ralf Widenhorn |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | American Physical Society, 2025-05-01 |
| Series: | Physical Review Physics Education Research |
| ISSN: | 2469-9896 |
| Online Access: | http://doi.org/10.1103/PhysRevPhysEducRes.21.010153 |
| Field | Value |
|---|---|
| author | Justin C. Dunlap, Ryan Sissons, Ralf Widenhorn |
| collection | DOAJ |
| description | [This paper is part of the Focused Collection in Artificial Intelligence Tools in Physics Teaching and Physics Education Research.] We present a study in which a version of a common conservation-of-mechanical-energy introductory physics problem, an object released on an inclined plane, is given to OpenAI’s GPT-4 large language model (LLM). We investigate how different permutations of the object, the action verb, and the property of the incline affect the LLM’s responses. The problem setup and prompting were left purposefully minimal, requiring the LLM to state multiple assumptions to justify its final answer. We specifically studied how different keywords lead the LLM to analyze the system as rolling versus sliding, and how this may differ from physics experts and novice learners. We found that domain-specific terminology may affect the LLM differently than it affects students. Even when its answers were correct, the LLM generally did not state the assumptions needed to justify them, falling short of what would be expected from an expert instructor. When conflicting information was provided, the LLM generally did not indicate as much in its responses. Both weaknesses could be remedied by additional prompting; nevertheless, they remain shortcomings in the context of physics teaching. While specific to introductory physics, this study provides insight into how LLMs respond to variations of a problem within a specific topic area and how their strengths and weaknesses may differ from those of humans. Understanding these differences, and tracking them as LLM capabilities change, is crucial for assessing the impact of artificial intelligence on education. |
| institution | OA Journals |
| issn | 2469-9896 |