Sim-to-Real Reinforcement Learning for a Rotary Double-Inverted Pendulum Based on a Mathematical Model
| Main Authors: | , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-06-01 |
| Series: | Mathematics |
| Subjects: | |
| Online Access: | https://www.mdpi.com/2227-7390/13/12/1996 |
| Summary: | This paper proposes a transition control strategy for a rotary double-inverted pendulum (RDIP) system using a sim-to-real reinforcement learning (RL) controller, built upon mathematical modeling and parameter estimation. High-resolution sensor data are used to estimate key physical parameters, ensuring model fidelity for simulation. The resulting mathematical model serves as the training environment in which the RL agent learns to perform transitions between various initial conditions and target equilibrium configurations. The training process adopts the Truncated Quantile Critics (TQC) algorithm, with a reward function specifically designed to reflect the nonlinear characteristics of the system. The learned policy is directly deployed on physical hardware without additional tuning or calibration, and the TQC-based controller successfully achieves all four equilibrium transitions. Furthermore, the controller exhibits robust recovery properties under external disturbances, demonstrating its effectiveness as a reliable sim-to-real control approach for high-dimensional nonlinear systems. |
| ISSN: | 2227-7390 |
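The summary names Truncated Quantile Critics (TQC) as the training algorithm. As context for the record, the core of TQC is forming the Q-target by pooling the quantile estimates of several critics, sorting them, and dropping the largest few atoms to curb overestimation bias. A minimal sketch of that truncation step, with hypothetical critic outputs and drop counts (the paper's actual hyperparameters are not given in this record):

```python
def tqc_target(quantiles_per_critic, drop_per_critic):
    """Pool quantile atoms from all critics, sort them ascending,
    truncate the top `drop_per_critic * n_critics` atoms, and
    average the remainder to form the Q-target."""
    n_critics = len(quantiles_per_critic)
    pooled = sorted(q for critic in quantiles_per_critic for q in critic)
    keep = len(pooled) - drop_per_critic * n_critics
    truncated = pooled[:keep]
    return sum(truncated) / len(truncated)

# Hypothetical example: two critics, five quantile atoms each,
# dropping the top 2 atoms per critic (4 atoms total).
critics = [
    [0.1, 0.4, 0.7, 1.0, 1.3],
    [0.2, 0.5, 0.8, 1.1, 1.4],
]
target = tqc_target(critics, drop_per_critic=2)
print(target)  # mean of the six smallest pooled atoms
```

Dropping only the top atoms, rather than taking a minimum over whole critics as in clipped double Q-learning, gives a finer-grained control over the pessimism of the target, which is the property the algorithm relies on.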