A multi-task deep reinforcement learning framework based on curriculum learning and policy distillation for quadruped robot motor skill training

Bibliographic Details
Main Authors: Liang Chen, Bo Shen, Jiale Hong
Format: Article
Language: English
Published: Taylor & Francis Group 2025-12-01
Series: Systems Science & Control Engineering
Online Access: https://www.tandfonline.com/doi/10.1080/21642583.2025.2498914
Description
Summary: Deep reinforcement learning (RL) approaches are increasingly prominent in robotics due to their adaptive decision-making capability. However, developing a single RL agent capable of performing multiple continuous control tasks for quadruped robots remains challenging. This paper proposes a multi-task deep RL framework based on curriculum learning and policy distillation, which aims to enhance the quadruped robot's motor performance across multiple continuous tasks by leveraging knowledge from expert skill teachers. The main novelties of the framework are a self-optimizing terrain curriculum learning strategy and an improved distillation loss function. The curriculum learning strategy uses Bayesian optimization to predict potential training terrains and thereby identify the most suitable training curriculum. The improved distillation loss function, used for RL weight optimization, is intended to enhance the transferability of the trained policy across diverse tasks. To validate the framework, the performance of the policy it generates is assessed across diverse terrains. The experimental results demonstrate that the framework can generate a unified policy that achieves excellent performance across multiple continuous control tasks for quadruped robots.
ISSN: 2164-2583