Descending an inclined plane with a large language model
[This paper is part of the Focused Collection in Artificial Intelligence Tools in Physics Teaching and Physics Education Research.] We present a study in which a version of a common conservation-of-mechanical-energy introductory physics problem, an object released on an inclined plane, is given to OpenAI’s GPT-4 large language model (LLM). We investigate how different permutations of the object, the action verb, and the property of the incline affect the LLM’s responses. The problem setup and prompting were left purposefully minimal, requiring the LLM to state multiple assumptions to justify its final answer. We specifically studied how different keywords lead the LLM to analyze the system as rolling versus sliding, and how this may differ from physics experts and novice learners. We found that domain-specific terminology may affect the LLM differently than it affects students. Even when its answers were correct, the LLM generally did not state the assumptions needed to justify them, falling short of what would be expected from an expert instructor. When conflicting information was provided, the LLM generally did not indicate as much in its responses. Both weaknesses could be remedied by additional prompting; nevertheless, they remain shortcomings in the context of physics teaching. While specific to introductory physics, this study provides insight into how LLMs respond to variations of a problem within a specific topic area and how their strengths and weaknesses may differ from those of humans. Understanding these differences, and tracking them as LLM capabilities change, is crucial for assessing the impact of artificial intelligence on education.
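For context, the rolling-versus-sliding distinction the abstract refers to changes the energy balance of the standard released-on-an-incline problem. The following is a minimal illustrative sketch of that textbook analysis, not material from the record itself:

```latex
% Object released from rest at height h on an incline; two idealized cases.
\[
\text{sliding (frictionless):}\quad
mgh = \tfrac{1}{2}mv^{2}
\;\Rightarrow\;
v = \sqrt{2gh}
\]
\[
\text{rolling without slipping } (I = c\,m r^{2},\ \omega = v/r):\quad
mgh = \tfrac{1}{2}mv^{2} + \tfrac{1}{2}I\omega^{2}
\;\Rightarrow\;
v = \sqrt{\frac{2gh}{1+c}}
\]
% Example: a uniform solid sphere has c = 2/5, giving
% v = sqrt(10gh/7), which is less than sqrt(2gh) because some
% potential energy goes into rotational kinetic energy.
```

Which case applies depends on unstated assumptions about friction and the object's shape, which is exactly the ambiguity the study's minimal prompts exploit.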
| Main Authors: | Justin C. Dunlap, Ryan Sissons, Ralf Widenhorn |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | American Physical Society, 2025-05-01 |
| Series: | Physical Review Physics Education Research |
| ISSN: | 2469-9896 |
| Online Access: | http://doi.org/10.1103/PhysRevPhysEducRes.21.010153 |
| Field | Value |
|---|---|
| author | Justin C. Dunlap, Ryan Sissons, Ralf Widenhorn |
| collection | DOAJ |
| description | [This paper is part of the Focused Collection in Artificial Intelligence Tools in Physics Teaching and Physics Education Research.] We present a study in which a version of a common conservation-of-mechanical-energy introductory physics problem, an object released on an inclined plane, is given to OpenAI’s GPT-4 large language model (LLM). We investigate how different permutations of the object, the action verb, and the property of the incline affect the LLM’s responses. The problem setup and prompting were left purposefully minimal, requiring the LLM to state multiple assumptions to justify its final answer. We specifically studied how different keywords lead the LLM to analyze the system as rolling versus sliding, and how this may differ from physics experts and novice learners. We found that domain-specific terminology may affect the LLM differently than it affects students. Even when its answers were correct, the LLM generally did not state the assumptions needed to justify them, falling short of what would be expected from an expert instructor. When conflicting information was provided, the LLM generally did not indicate as much in its responses. Both weaknesses could be remedied by additional prompting; nevertheless, they remain shortcomings in the context of physics teaching. While specific to introductory physics, this study provides insight into how LLMs respond to variations of a problem within a specific topic area and how their strengths and weaknesses may differ from those of humans. Understanding these differences, and tracking them as LLM capabilities change, is crucial for assessing the impact of artificial intelligence on education. |
| institution | OA Journals |
| issn | 2469-9896 |