LLM-Guided Reinforcement Learning for Interactive Environments
We propose herein LLM-Guided Reinforcement Learning (LGRL), a novel framework that leverages large language models (LLMs) to decompose high-level objectives into a sequence of manageable subgoals in interactive environments. Our approach decouples high-level planning from low-leve...
| Main Authors: | Fuxue Yang, Jiawen Liu, Kan Li |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-06-01 |
| Series: | Mathematics |
| Subjects: | reinforcement learning; large language models; chain of thought |
| Online Access: | https://www.mdpi.com/2227-7390/13/12/1932 |
| _version_ | 1850168011313381376 |
|---|---|
| author | Fuxue Yang, Jiawen Liu, Kan Li |
| author_facet | Fuxue Yang, Jiawen Liu, Kan Li |
| author_sort | Fuxue Yang |
| collection | DOAJ |
| description | We propose herein LLM-Guided Reinforcement Learning (LGRL), a novel framework that leverages large language models (LLMs) to decompose high-level objectives into a sequence of manageable subgoals in interactive environments. Our approach decouples high-level planning from low-level action execution by dynamically generating context-aware subgoals that guide the reinforcement learning (RL) agent. During training, intermediate subgoals, each associated with partial rewards, are generated based on the agent’s current progress, providing fine-grained feedback that facilitates structured exploration and accelerates convergence. At inference, a chain-of-thought strategy is employed, enabling the LLM to adaptively update subgoals in response to evolving environmental states. Although demonstrated on a representative interactive setting, our method is generalizable to a wide range of complex, goal-oriented tasks. Experimental results show that LGRL achieves higher success rates, improved efficiency, and faster convergence compared to baseline approaches. |
| format | Article |
| id | doaj-art-e7328792b12645c08590de7ce217f741 |
| institution | OA Journals |
| issn | 2227-7390 |
| language | English |
| publishDate | 2025-06-01 |
| publisher | MDPI AG |
| record_format | Article |
| series | Mathematics |
| spelling | doaj-art-e7328792b12645c08590de7ce217f741; 2025-08-20T02:21:04Z; eng; MDPI AG; Mathematics; 2227-7390; 2025-06-01; 13; 12; 1932; 10.3390/math13121932; LLM-Guided Reinforcement Learning for Interactive Environments; Fuxue Yang (0); Jiawen Liu (1); Kan Li (2); School of Computer Science & Technology, Beijing Institute of Technology, Beijing 100081, China; School of Computer Science & Technology, Beijing Institute of Technology, Beijing 100081, China; School of Computer Science & Technology, Beijing Institute of Technology, Beijing 100081, China; We propose herein LLM-Guided Reinforcement Learning (LGRL), a novel framework that leverages large language models (LLMs) to decompose high-level objectives into a sequence of manageable subgoals in interactive environments. Our approach decouples high-level planning from low-level action execution by dynamically generating context-aware subgoals that guide the reinforcement learning (RL) agent. During training, intermediate subgoals, each associated with partial rewards, are generated based on the agent’s current progress, providing fine-grained feedback that facilitates structured exploration and accelerates convergence. At inference, a chain-of-thought strategy is employed, enabling the LLM to adaptively update subgoals in response to evolving environmental states. Although demonstrated on a representative interactive setting, our method is generalizable to a wide range of complex, goal-oriented tasks. Experimental results show that LGRL achieves higher success rates, improved efficiency, and faster convergence compared to baseline approaches.; https://www.mdpi.com/2227-7390/13/12/1932; reinforcement learning; large language models; chain of thought |
| spellingShingle | Fuxue Yang Jiawen Liu Kan Li LLM-Guided Reinforcement Learning for Interactive Environments Mathematics reinforcement learning large language models chain of thought |
| title | LLM-Guided Reinforcement Learning for Interactive Environments |
| title_full | LLM-Guided Reinforcement Learning for Interactive Environments |
| title_fullStr | LLM-Guided Reinforcement Learning for Interactive Environments |
| title_full_unstemmed | LLM-Guided Reinforcement Learning for Interactive Environments |
| title_short | LLM-Guided Reinforcement Learning for Interactive Environments |
| title_sort | llm guided reinforcement learning for interactive environments |
| topic | reinforcement learning; large language models; chain of thought |
| url | https://www.mdpi.com/2227-7390/13/12/1932 |
| work_keys_str_mv | AT fuxueyang llmguidedreinforcementlearningforinteractiveenvironments AT jiawenliu llmguidedreinforcementlearningforinteractiveenvironments AT kanli llmguidedreinforcementlearningforinteractiveenvironments |
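
The description above states that intermediate subgoals, each associated with a partial reward, are generated from the agent's current progress. The following is a minimal sketch of that idea only, not the authors' implementation: the names `Subgoal`, `stub_planner`, and `shaped_reward`, the key-and-door task, and the reward values are all hypothetical, and the planner is a hard-coded stand-in for an LLM call.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

State = Dict[str, bool]


@dataclass
class Subgoal:
    name: str
    is_done: Callable[[State], bool]  # completion check on the current state
    partial_reward: float             # bonus granted once when completed


def stub_planner(objective: str, state: State) -> List[Subgoal]:
    """Hypothetical stand-in for an LLM: return the subgoals still unmet in `state`."""
    plan = [
        Subgoal("find key", lambda s: s["has_key"], 0.2),
        Subgoal("open door", lambda s: s["door_open"], 0.3),
    ]
    return [g for g in plan if not g.is_done(state)]


def shaped_reward(
    env_reward: float, state: State, pending: List[Subgoal]
) -> Tuple[float, List[Subgoal]]:
    """Add the partial reward of each newly completed subgoal to the environment reward."""
    bonus = sum(g.partial_reward for g in pending if g.is_done(state))
    remaining = [g for g in pending if not g.is_done(state)]
    return env_reward + bonus, remaining


# Usage: two environment steps on the toy task (assumed values).
state = {"has_key": False, "door_open": False}
pending = stub_planner("leave the room", state)

state["has_key"] = True
r1, pending = shaped_reward(0.0, state, pending)  # bonus for "find key"

state["door_open"] = True
r2, pending = shaped_reward(1.0, state, pending)  # task reward plus "open door" bonus
```

Calling the planner again with the updated state returns only the subgoals that are still unmet, which loosely mirrors the adaptive subgoal updates the description mentions for inference.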