Combining Prior Knowledge and Reinforcement Learning for Parallel Telescopic-Legged Bipedal Robot Walking
| Main Authors: | , , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-03-01 |
| Series: | Mathematics |
| Subjects: | |
| Online Access: | https://www.mdpi.com/2227-7390/13/6/979 |
| Summary: | The parallel dual-slider telescopic leg bipedal robot (L04) is characterized by its simple structure and low leg rotational inertia, which contribute to its walking efficiency. However, end-to-end methods often overlook the robot’s physical structure, leading to difficulties in maintaining the parallel alignment of the dual sliders, which in turn compromises walking stability. One potential solution to this issue involves utilizing imitation learning to replicate human motion data. However, the dual telescopic leg structure of the L04 robot makes it difficult to perform motion retargeting of human motion data. To enable L04 walking, we design a method that integrates prior feedforward with reinforcement learning (PFRL), specifically tailored for the parallel dual-slider structure. We utilize prior knowledge as a feedforward action to compensate for system nonlinearities; meanwhile, the feedback action generated by the policy network adaptively regulates dynamic balance and, combined with the feedforward action, jointly controls the robot’s walking. PFRL enforces constraints within the motion space to mitigate the chaotic behavior of the parallel dual sliders. Experimental results show that our method successfully achieves sim2real transfer on a real bipedal robot without the need for domain randomization techniques or intricate reward functions. L04 achieves omnidirectional walking with minimal energy consumption and exhibits robustness against external disturbances. |
| ISSN: | 2227-7390 |
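
The summary describes a control action built from two parts: a feedforward term derived from prior knowledge and a feedback term produced by a policy network. The following is a minimal sketch of that combination only, not the authors' implementation. The sinusoidal leg-length prior, the proportional stand-in for the policy, the feedback bound, and all names and numeric values (`feedforward_action`, `placeholder_policy`, `combine_actions`, `max_feedback`) are assumptions chosen for illustration.

```python
import math


def feedforward_action(phase, amplitude=0.05, offset=0.30):
    """Hypothetical prior: periodic leg-length targets in metres.

    The left and right legs are half a gait cycle apart. The waveform
    and the numeric values are illustrative assumptions.
    """
    left = offset + amplitude * math.sin(2.0 * math.pi * phase)
    right = offset + amplitude * math.sin(2.0 * math.pi * (phase + 0.5))
    return [left, right]


def placeholder_policy(observation):
    """Stand-in for a trained policy network (assumption).

    Returns a proportional correction based on body pitch so the
    example runs without any learning framework.
    """
    pitch = observation["pitch"]
    return [-0.1 * pitch, 0.1 * pitch]


def combine_actions(feedforward, feedback, max_feedback=0.02):
    """Add a bounded feedback correction to each feedforward target.

    Clipping the feedback keeps the final command close to the prior.
    The bound is an illustrative assumption.
    """
    combined = []
    for ff, fb in zip(feedforward, feedback):
        fb_clipped = max(-max_feedback, min(max_feedback, fb))
        combined.append(ff + fb_clipped)
    return combined


ff = feedforward_action(phase=0.25)
fb = placeholder_policy({"pitch": 0.5})
action = combine_actions(ff, fb)
```

Bounding the feedback term is one simple way to keep the combined command near the prior trajectory; the constraint mechanism actually used by PFRL is not specified in this record.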