Dynamics Learning With Object-Centric Interaction Networks for Robot Manipulation
Understanding the physical interactions of objects with their environment is critical for multi-object robotic manipulation tasks. A predictive dynamics model can predict the future states of manipulated objects, and these predictions can be used to plan plausible actions that enable the objects to achieve desired goal states. However, most current approaches to dynamics learning from high-dimensional visual observations have limitations: they either rely on a large amount of real-world data or build a model with a fixed number of objects, which makes them difficult to generalize to unseen objects. This paper proposes a Deep Object-centric Interaction Network (DOIN) which encodes object-centric representations for multiple objects from raw RGB images and reasons about the future trajectory of each object in latent space. The proposed model is trained only on large amounts of random interaction data collected in simulation. The learned model, combined with a model predictive control framework, enables a robot to search for action sequences that manipulate objects to the desired configurations. The proposed method is evaluated on multi-object pushing tasks in both simulation and real-world experiments. Extensive simulation experiments show that DOIN achieves high prediction accuracy in scenes with different numbers of objects and outperforms state-of-the-art baselines in the manipulation tasks. Real-world experiments demonstrate that the model trained on simulated data can be transferred to the real robot and can successfully perform multi-object pushing tasks on previously unseen objects with significant variations in shape and size.
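The abstract describes pairing a learned, object-centric dynamics model with model predictive control (MPC) to search for pushing actions. The sketch below illustrates that general pattern with a simple random-shooting planner; the class `LearnedDynamicsModel`, the function `plan_push`, the latent shapes, and the latent-distance cost are hypothetical placeholders chosen for illustration, not the DOIN architecture or the authors' implementation.

```python
# Illustrative sketch only: a random-shooting MPC loop around a learned
# dynamics model, in the spirit of the pipeline described in the abstract.
# All names, shapes, and the cost function are hypothetical placeholders.
import numpy as np

class LearnedDynamicsModel:
    """Placeholder for a learned object-centric dynamics model.

    encode()  maps an RGB observation to per-object latent states.
    predict() rolls the latent states forward one step under an action.
    """
    def encode(self, rgb_image):
        # A trained neural encoder would go here; return a dummy latent
        # of shape (num_objects, latent_dim) for the sketch.
        return np.zeros((3, 8))

    def predict(self, latents, action):
        # Dummy latent transition; a trained interaction network would go here.
        return latents + 0.01 * action.sum()

def plan_push(model, rgb_image, goal_latents, horizon=5, num_samples=256):
    """Random-shooting MPC: sample action sequences, roll them out in latent
    space, and return the first action of the lowest-cost sequence."""
    latents = model.encode(rgb_image)
    best_cost, best_action = np.inf, None
    for _ in range(num_samples):
        # Sample a candidate sequence of planar push actions (dx, dy).
        actions = np.random.uniform(-1.0, 1.0, size=(horizon, 2))
        rollout = latents
        for a in actions:
            rollout = model.predict(rollout, a)
        # Cost: distance between predicted and goal latent states.
        cost = np.linalg.norm(rollout - goal_latents)
        if cost < best_cost:
            best_cost, best_action = cost, actions[0]
    return best_action

# Usage with dummy inputs: pick the robot's next push action.
model = LearnedDynamicsModel()
action = plan_push(model, rgb_image=None, goal_latents=np.zeros((3, 8)))
```

In a real system the random-shooting loop could be replaced by a stronger sampling-based optimizer, and the cost would be computed from the learned object-centric latents and goal image rather than dummy arrays.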
Saved in:
| Main Authors: | Jiayu Wang, Chuxiong Hu, Yunan Wang, Yu Zhu |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | IEEE, 2021-01-01 |
| Series: | IEEE Access |
| Subjects: | Deep learning in robotic manipulation; model learning; representation learning; visual learning |
| Online Access: | https://ieeexplore.ieee.org/document/9420758/ |
| author | Jiayu Wang, Chuxiong Hu, Yunan Wang, Yu Zhu |
|---|---|
| collection | DOAJ |
| description | Understanding the physical interactions of objects with their environment is critical for multi-object robotic manipulation tasks. A predictive dynamics model can predict the future states of manipulated objects, and these predictions can be used to plan plausible actions that enable the objects to achieve desired goal states. However, most current approaches to dynamics learning from high-dimensional visual observations have limitations: they either rely on a large amount of real-world data or build a model with a fixed number of objects, which makes them difficult to generalize to unseen objects. This paper proposes a Deep Object-centric Interaction Network (DOIN) which encodes object-centric representations for multiple objects from raw RGB images and reasons about the future trajectory of each object in latent space. The proposed model is trained only on large amounts of random interaction data collected in simulation. The learned model, combined with a model predictive control framework, enables a robot to search for action sequences that manipulate objects to the desired configurations. The proposed method is evaluated on multi-object pushing tasks in both simulation and real-world experiments. Extensive simulation experiments show that DOIN achieves high prediction accuracy in scenes with different numbers of objects and outperforms state-of-the-art baselines in the manipulation tasks. Real-world experiments demonstrate that the model trained on simulated data can be transferred to the real robot and can successfully perform multi-object pushing tasks on previously unseen objects with significant variations in shape and size. |
| format | Article |
| id | doaj-art-c4db079b6b1a4755819f77d66fc893b3 |
| institution | DOAJ |
| issn | 2169-3536 |
| language | English |
| publishDate | 2021-01-01 |
| publisher | IEEE |
| record_format | Article |
| series | IEEE Access |
| spelling | Record doaj-art-c4db079b6b1a4755819f77d66fc893b3, last updated 2025-08-20T03:05:39Z. Jiayu Wang (https://orcid.org/0000-0001-8367-258X), Chuxiong Hu (https://orcid.org/0000-0002-3504-3065), Yunan Wang (https://orcid.org/0000-0002-9812-7709), and Yu Zhu, all with the Department of Mechanical Engineering, State Key Laboratory of Tribology, Tsinghua University, Beijing, China. "Dynamics Learning With Object-Centric Interaction Networks for Robot Manipulation," IEEE Access, vol. 9, pp. 68277-68288, 2021-01-01. DOI: 10.1109/ACCESS.2021.3077117; IEEE article 9420758; ISSN 2169-3536; English. Online access: https://ieeexplore.ieee.org/document/9420758/ |
| title | Dynamics Learning With Object-Centric Interaction Networks for Robot Manipulation |
| topic | Deep learning in robotic manipulation; model learning; representation learning; visual learning |
| url | https://ieeexplore.ieee.org/document/9420758/ |