Stable Inverse Reinforcement Learning via Leveraged Guided Motion Planner for Driving Behavior Prediction
Driving behavior prediction has become increasingly important, owing to rapid advancements in autonomous vehicle technologies. Inverse reinforcement learning (IRL) has emerged as a leading approach in this domain, as it allows the inference of underlying reward functions from human-driving demonstrations, enabling the modeling of complex behaviors. Among various IRL approaches, maximum-entropy IRL (MaxEnt IRL) has gained prominence in driving-behavior modeling owing to its applicability to continuous state-space problems. In continuous state-space MaxEnt IRL, stochastic motion planners are commonly employed during training to approximate the partition function, because integrating over the entire state space is computationally infeasible. However, traditional motion planners often fail to efficiently explore the state space surrounding human-demonstrated paths, leading to inaccurate approximations of the partition function. This study proposes a novel, stable MaxEnt IRL framework that integrates an IRL-aware motion planner incorporating a guided exploration and exploitation process to efficiently sample high-quality trajectories. The proposed approach leverages two distributions derived from human-demonstrated paths to balance broad state-space exploration and targeted exploitation of relevant regions. Experiments conducted in a driving simulator demonstrate that the proposed method outperforms existing IRL methods in stability and accuracy, enhancing the prediction of driving behaviors and showcasing the potential of IRL for achieving human-like decision-making in autonomous driving.
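The abstract's central computational step, approximating the MaxEnt IRL partition function with trajectories drawn from a stochastic motion planner, can be sketched as follows. This is a minimal illustration under assumed names (`featurize`, `maxent_irl_step`, and all parameters are hypothetical), not the authors' implementation.

```python
import numpy as np

# Minimal MaxEnt IRL gradient step with a sampled partition-function
# approximation. All names here are illustrative, not from the paper.

def feature_counts(trajectory, featurize):
    """Sum of state features along one trajectory."""
    return np.sum([featurize(s) for s in trajectory], axis=0)

def maxent_irl_step(theta, demos, sampled_trajs, featurize, lr=0.01):
    """One gradient-ascent step on the MaxEnt IRL log-likelihood.

    The partition function Z = integral of exp(theta . f(tau)) over all
    trajectories is intractable in continuous state spaces, so it is
    approximated with trajectories drawn from a stochastic motion
    planner (sampled_trajs).
    """
    # Empirical feature expectation from human demonstrations.
    f_demo = np.mean([feature_counts(t, featurize) for t in demos], axis=0)

    # Soft-max weighted feature expectation over planner samples:
    # each sampled trajectory is weighted by exp(theta . f(tau)) / Z_hat.
    f_samp = np.stack([feature_counts(t, featurize) for t in sampled_trajs])
    logits = f_samp @ theta
    w = np.exp(logits - logits.max())       # stabilized soft-max weights
    w /= w.sum()
    f_model = w @ f_samp

    # MaxEnt gradient: demonstrations pull theta toward their features;
    # the model expectation pushes it away from over-weighted samples.
    return theta + lr * (f_demo - f_model)
```

If the planner's samples miss the regions around the demonstrations, `f_model` is biased and the gradient becomes unreliable, which is the instability the paper's guided planner is designed to address.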
| Main Authors: | Minglu Zhao, Masamichi Shimosaka |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | IEEE, 2025-01-01 |
| Series: | IEEE Access |
| Subjects: | Inverse reinforcement learning; guided motion planning; driving behavior prediction |
| Online Access: | https://ieeexplore.ieee.org/document/11006073/ |
| _version_ | 1850269604189831168 |
|---|---|
| author | Minglu Zhao; Masamichi Shimosaka |
| author_facet | Minglu Zhao; Masamichi Shimosaka |
| author_sort | Minglu Zhao |
| collection | DOAJ |
| description | Driving behavior prediction has become increasingly important, owing to rapid advancements in autonomous vehicle technologies. Inverse reinforcement learning (IRL) has emerged as a leading approach in this domain, as it allows the inference of underlying reward functions from human-driving demonstrations, enabling the modeling of complex behaviors. Among various IRL approaches, maximum-entropy IRL (MaxEnt IRL) has gained prominence in driving-behavior modeling owing to its applicability to continuous state-space problems. In continuous state-space MaxEnt IRL, stochastic motion planners are commonly employed during training to approximate the partition function, because integrating over the entire state space is computationally infeasible. However, traditional motion planners often fail to efficiently explore the state space surrounding human-demonstrated paths, leading to inaccurate approximations of the partition function. This study proposes a novel, stable MaxEnt IRL framework that integrates an IRL-aware motion planner incorporating a guided exploration and exploitation process to efficiently sample high-quality trajectories. The proposed approach leverages two distributions derived from human-demonstrated paths to balance broad state-space exploration and targeted exploitation of relevant regions. Experiments conducted in a driving simulator demonstrate that the proposed method outperforms existing IRL methods in stability and accuracy, enhancing the prediction of driving behaviors and showcasing the potential of IRL for achieving human-like decision-making in autonomous driving. |
| format | Article |
| id | doaj-art-ff1b0e33173443e2be7166ad58b7a3ee |
| institution | OA Journals |
| issn | 2169-3536 |
| language | English |
| publishDate | 2025-01-01 |
| publisher | IEEE |
| record_format | Article |
| series | IEEE Access |
| spelling | doaj-art-ff1b0e33173443e2be7166ad58b7a3ee; 2025-08-20T01:53:04Z; eng; IEEE; IEEE Access; ISSN 2169-3536; published 2025-01-01; vol. 13, pp. 87313-87326; DOI 10.1109/ACCESS.2025.3570957; IEEE document 11006073; Stable Inverse Reinforcement Learning via Leveraged Guided Motion Planner for Driving Behavior Prediction; Minglu Zhao (https://orcid.org/0009-0005-2788-1025), Department of Computer Science, Institute of Science Tokyo, Tokyo, Japan; Masamichi Shimosaka (https://orcid.org/0000-0003-0558-2006), Department of Computer Science, Institute of Science Tokyo, Tokyo, Japan; https://ieeexplore.ieee.org/document/11006073/; keywords: Inverse reinforcement learning; guided motion planning; driving behavior prediction |
| spellingShingle | Minglu Zhao; Masamichi Shimosaka; Stable Inverse Reinforcement Learning via Leveraged Guided Motion Planner for Driving Behavior Prediction; IEEE Access; Inverse reinforcement learning; guided motion planning; driving behavior prediction |
| title | Stable Inverse Reinforcement Learning via Leveraged Guided Motion Planner for Driving Behavior Prediction |
| title_full | Stable Inverse Reinforcement Learning via Leveraged Guided Motion Planner for Driving Behavior Prediction |
| title_fullStr | Stable Inverse Reinforcement Learning via Leveraged Guided Motion Planner for Driving Behavior Prediction |
| title_full_unstemmed | Stable Inverse Reinforcement Learning via Leveraged Guided Motion Planner for Driving Behavior Prediction |
| title_short | Stable Inverse Reinforcement Learning via Leveraged Guided Motion Planner for Driving Behavior Prediction |
| title_sort | stable inverse reinforcement learning via leveraged guided motion planner for driving behavior prediction |
| topic | Inverse reinforcement learning; guided motion planning; driving behavior prediction |
| url | https://ieeexplore.ieee.org/document/11006073/ |
| work_keys_str_mv | AT mingluzhao stableinversereinforcementlearningvialeveragedguidedmotionplannerfordrivingbehaviorprediction AT masamichishimosaka stableinversereinforcementlearningvialeveragedguidedmotionplannerfordrivingbehaviorprediction |
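The description above also notes that the proposed planner leverages two distributions derived from human-demonstrated paths to balance broad exploration against targeted exploitation. A minimal sketch of one plausible reading of that idea follows; the function name, noise scales, and mixture weight are all assumptions for illustration, not the paper's actual formulation.

```python
import numpy as np

# Hypothetical "guided" trajectory sampler: two proposal distributions
# built from a demonstrated path, one broad (exploration) and one
# narrow (exploitation). All names and scales are assumptions.

def guided_sample(demo_path, n_samples, explore_std=2.0, exploit_std=0.3,
                  exploit_frac=0.5, rng=None):
    """Draw candidate trajectories by perturbing a demonstrated path.

    A fraction `exploit_frac` of samples comes from a tight Gaussian
    around the demonstration (exploiting known-good regions); the rest
    comes from a wide Gaussian (exploring the surrounding state space).
    """
    rng = rng or np.random.default_rng()
    demo = np.asarray(demo_path)            # shape: (T, state_dim)
    samples = []
    for _ in range(n_samples):
        std = exploit_std if rng.random() < exploit_frac else explore_std
        samples.append(demo + rng.normal(0.0, std, size=demo.shape))
    return samples

# Usage with the gradient step sketched earlier: feed these samples in
# as the planner trajectories that approximate the partition function.
# sampled = guided_sample(human_demo, n_samples=100)
# theta = maxent_irl_step(theta, [human_demo], sampled, featurize)
```

In this reading, raising `exploit_frac` concentrates samples near the demonstrations, tightening the partition-function estimate locally, while lowering it widens state-space coverage; the paper's actual balancing mechanism may differ.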