Stable Inverse Reinforcement Learning via Leveraged Guided Motion Planner for Driving Behavior Prediction

Bibliographic Details
Main Authors: Minglu Zhao, Masamichi Shimosaka
Format: Article
Language: English
Published: IEEE 2025-01-01
Series:IEEE Access
Subjects:
Online Access: https://ieeexplore.ieee.org/document/11006073/
author Minglu Zhao
Masamichi Shimosaka
collection DOAJ
description Driving behavior prediction has become increasingly important, owing to rapid advancements in autonomous vehicle technologies. Inverse reinforcement learning (IRL) has emerged as a leading approach in this domain, as it allows the inference of underlying reward functions from human driving demonstrations, enabling the modeling of complex behaviors. Among various IRL approaches, maximum-entropy IRL (MaxEnt IRL) has gained prominence in driving-behavior modeling owing to its applicability to continuous state-space problems. In continuous state-space MaxEnt IRL, stochastic motion planners are commonly employed during training to approximate the partition function, because integrating over the entire state space is computationally infeasible. However, traditional motion planners often fail to efficiently explore the state space surrounding human-demonstrated paths, leading to inaccurate approximations of the partition function. This study proposes a novel, stable MaxEnt IRL framework that integrates an IRL-aware motion planner with a guided exploration-and-exploitation process to efficiently sample high-quality trajectories. The proposed approach leverages two distributions derived from human-demonstrated paths to balance broad exploration of the state space with targeted exploitation of relevant regions. Experiments conducted in a driving simulator demonstrate that the proposed method outperforms existing IRL methods in stability and accuracy, enhancing the prediction of driving behaviors and showcasing the potential of IRL for achieving human-like decision-making in autonomous driving.
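Note: the following is a minimal, self-contained Python sketch of the sampling-based MaxEnt IRL scheme the abstract describes: the intractable partition function is approximated with trajectories drawn from a planner that mixes a broad exploration distribution and a narrow exploitation distribution, both centered on human demonstrations. The 1-D trajectory setup, feature map, two-Gaussian proposal, and all function names are illustrative assumptions, not the authors' implementation.

    import numpy as np

    rng = np.random.default_rng(0)

    def features(traj):
        # Toy feature map for a 1-D trajectory (assumed for illustration);
        # real driving features (speed, lane offset, headway, ...) would go here.
        return np.array([traj.mean(), (traj ** 2).mean()])

    def reward(traj, theta):
        # Linear reward r(tau) = theta . f(tau), standard in MaxEnt IRL.
        return theta @ features(traj)

    def guided_sampler(demos, n_samples, explore_ratio=0.5,
                       sigma_explore=1.0, sigma_exploit=0.1):
        # Hypothetical stand-in for the IRL-aware planner: perturb a random
        # demonstration with either a broad (exploration) or a narrow
        # (exploitation) Gaussian, so samples stay near demonstrated paths.
        samples = []
        for _ in range(n_samples):
            base = demos[rng.integers(len(demos))]
            sigma = sigma_explore if rng.random() < explore_ratio else sigma_exploit
            samples.append(base + rng.normal(0.0, sigma, size=base.shape))
        return samples

    def maxent_irl(demos, n_iters=200, n_samples=64, lr=0.05):
        theta = np.zeros(2)
        f_demo = np.mean([features(d) for d in demos], axis=0)
        for _ in range(n_iters):
            samples = guided_sampler(demos, n_samples)
            # Self-normalized weights exp(r) / sum exp(r): the sampled
            # trajectories approximate the intractable partition function Z.
            r = np.array([reward(s, theta) for s in samples])
            w = np.exp(r - r.max())
            w /= w.sum()
            f_model = (w[:, None] * np.array([features(s) for s in samples])).sum(axis=0)
            # MaxEnt IRL gradient: demonstrated minus expected feature counts.
            theta += lr * (f_demo - f_model)
        return theta

    # Example: five noisy straight-line demonstrations.
    demos = [np.linspace(0.0, 1.0, 20) + rng.normal(0.0, 0.02, 20) for _ in range(5)]
    print(maxent_irl(demos))

A strict importance-sampling estimator would also divide each weight by the proposal density q(tau); the sketch omits that correction for brevity, which is a common simplification when the proposal stays close to the demonstration distribution.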
format Article
id doaj-art-ff1b0e33173443e2be7166ad58b7a3ee
institution OA Journals
issn 2169-3536
language English
publishDate 2025-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj-art-ff1b0e33173443e2be7166ad58b7a3ee
datestamp 2025-08-20T01:53:04Z
doi 10.1109/ACCESS.2025.3570957
article_number 11006073
volume 13
pages 87313-87326
author_orcid Minglu Zhao: https://orcid.org/0009-0005-2788-1025
author_orcid Masamichi Shimosaka: https://orcid.org/0000-0003-0558-2006
affiliation Department of Computer Science, Institute of Science Tokyo, Tokyo, Japan (both authors)
title Stable Inverse Reinforcement Learning via Leveraged Guided Motion Planner for Driving Behavior Prediction
topic Inverse reinforcement learning
guided motion planning
driving behavior prediction
url https://ieeexplore.ieee.org/document/11006073/