LLM-Guided Reinforcement Learning for Interactive Environments

We propose herein LLM-Guided Reinforcement Learning (LGRL), a novel framework that leverages large language models (LLMs) to decompose high-level objectives into a sequence of manageable subgoals in interactive environments. Our approach decouples high-level planning from low-leve...

Bibliographic Details
Main Authors: Fuxue Yang, Jiawen Liu, Kan Li
Format: Article
Language: English
Published: MDPI AG 2025-06-01
Series: Mathematics
Subjects: reinforcement learning; large language models; chain of thought
Online Access: https://www.mdpi.com/2227-7390/13/12/1932
author Fuxue Yang
Jiawen Liu
Kan Li
collection DOAJ
description We propose herein LLM-Guided Reinforcement Learning (LGRL), a novel framework that leverages large language models (LLMs) to decompose high-level objectives into a sequence of manageable subgoals in interactive environments. Our approach decouples high-level planning from low-level action execution by dynamically generating context-aware subgoals that guide the reinforcement learning (RL) agent. During training, intermediate subgoals, each associated with partial rewards, are generated based on the agent’s current progress, providing fine-grained feedback that facilitates structured exploration and accelerates convergence. At inference, a chain-of-thought strategy is employed, enabling the LLM to adaptively update subgoals in response to evolving environmental states. Although demonstrated on a representative interactive setting, our method is generalizable to a wide range of complex, goal-oriented tasks. Experimental results show that LGRL achieves higher success rates, improved efficiency, and faster convergence compared to baseline approaches.
format Article
id doaj-art-e7328792b12645c08590de7ce217f741
institution OA Journals
issn 2227-7390
language English
publishDate 2025-06-01
publisher MDPI AG
record_format Article
series Mathematics
spelling doaj-art-e7328792b12645c08590de7ce217f741
2025-08-20T02:21:04Z
eng
MDPI AG
Mathematics
2227-7390
2025-06-01
13
12
1932
10.3390/math13121932
LLM-Guided Reinforcement Learning for Interactive Environments
Fuxue Yang (School of Computer Science & Technology, Beijing Institute of Technology, Beijing 100081, China)
Jiawen Liu (School of Computer Science & Technology, Beijing Institute of Technology, Beijing 100081, China)
Kan Li (School of Computer Science & Technology, Beijing Institute of Technology, Beijing 100081, China)
https://www.mdpi.com/2227-7390/13/12/1932
reinforcement learning
large language models
chain of thought
title LLM-Guided Reinforcement Learning for Interactive Environments
topic reinforcement learning
large language models
chain of thought
url https://www.mdpi.com/2227-7390/13/12/1932
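
Illustrative sketch (not from the article): the description in this record mentions intermediate subgoals that each carry a partial reward and are regenerated from the agent's current progress. The Python below shows one way such subgoal-based reward shaping could look. Everything in it is an assumption made for illustration: the key-and-door task, the names Subgoal, stub_planner, shaped_reward and demo, and the reward values. In particular, stub_planner is a hard-coded stand-in for the LLM call and does not reflect the authors' prompts or implementation.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

State = Dict[str, bool]  # hypothetical environment state: named boolean flags


@dataclass
class Subgoal:
    description: str
    is_done: Callable[[State], bool]  # completion check on the current state
    partial_reward: float             # bonus granted once, when completed


def stub_planner(objective: str, state: State) -> List[Subgoal]:
    """Hypothetical stand-in for an LLM planner.

    A real system would prompt an LLM with the objective and the state;
    here the plan is fixed and only filtered by what is already done.
    """
    plan = [
        Subgoal("pick up key", lambda s: s.get("has_key", False), 0.2),
        Subgoal("open door", lambda s: s.get("door_open", False), 0.3),
    ]
    return [g for g in plan if not g.is_done(state)]


def shaped_reward(
    env_reward: float, state: State, pending: List[Subgoal]
) -> Tuple[float, List[Subgoal]]:
    """Add the partial reward of every subgoal completed in this state.

    Returns the shaped reward and the subgoals that are still pending.
    """
    bonus = 0.0
    remaining: List[Subgoal] = []
    for goal in pending:
        if goal.is_done(state):
            bonus += goal.partial_reward
        else:
            remaining.append(goal)
    return env_reward + bonus, remaining


def demo() -> Tuple[float, List[str]]:
    """One environment step in which the agent picks up the key."""
    state: State = {"has_key": False, "door_open": False}
    pending = stub_planner("reach the exit", state)
    state["has_key"] = True  # simulated effect of the agent's action
    reward, pending = shaped_reward(0.0, state, pending)
    return reward, [g.description for g in pending]
```

Calling demo() runs a single simulated step: the sparse environment reward is 0.0, the completed "pick up key" subgoal contributes its partial reward, and "open door" remains pending. Calling stub_planner again with the updated state would return only the unfinished subgoal, which loosely mirrors the idea of updating subgoals as the environment state evolves.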