Large language model-driven natural language interaction control framework for single-operator bimanual teleoperation

Bibliographic Details
Main Authors: Haolin Fei, Tao Xue, Yiyang He, Sheng Lin, Guanglong Du, Yao Guo, Ziwei Wang
Format: Article
Language: English
Published: Frontiers Media S.A. 2025-07-01
Series: Frontiers in Robotics and AI
Online Access: https://www.frontiersin.org/articles/10.3389/frobt.2025.1621033/full
Description
Summary: Bimanual teleoperation imposes cognitive and coordination demands on a single human operator tasked with simultaneously controlling two robotic arms. Although assigning each arm to a separate operator can distribute workload, it often leads to ambiguities in decision authority and degrades overall efficiency. To overcome these challenges, we propose a novel bimanual teleoperation large language model assistant (BTLA) framework, an intelligent co-pilot that augments a single operator’s motor control capabilities. In particular, BTLA enables operators to directly control one robotic arm through conventional teleoperation while directing a second assistive arm via simple voice commands, thereby commanding two robotic arms simultaneously. By integrating the GPT-3.5-turbo model, BTLA interprets contextual voice instructions and autonomously selects among six predefined manipulation skills, including real-time mirroring, trajectory following, and autonomous object grasping. Experimental evaluations in bimanual object manipulation tasks demonstrate that BTLA increased task coverage by 76.1% and success rate by 240.8% relative to solo teleoperation, and outperformed dyadic control with a 19.4% gain in coverage and a 69.9% gain in success. Furthermore, NASA Task Load Index (NASA-TLX) assessments revealed a 38–52% reduction in operator mental workload, and 85% of participants rated the voice-based interaction as “natural” and “highly effective.”
ISSN: 2296-9144
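
Illustrative sketch (not from the article): the summary describes a voice command being interpreted and mapped to one of six predefined skills for the assistive arm. The Python below shows one way such a dispatch step could be organized. Everything in it is an assumption made for illustration: the names Skill, keyword_selector, and AssistiveArm are hypothetical, the keyword matcher is a simple stand-in for the language model rather than the authors' GPT-3.5-turbo prompt, and only the three skills named in the summary are listed because the remaining three are not given there.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional


class Skill(Enum):
    # Only the three skills named in the summary; the other three are not listed there.
    MIRROR = "real-time mirroring"
    FOLLOW = "trajectory following"
    GRASP = "autonomous object grasping"


def keyword_selector(command: str) -> Optional[Skill]:
    """Hypothetical stand-in for the language model: map transcribed speech to a skill."""
    text = command.lower()
    if "mirror" in text:
        return Skill.MIRROR
    if "follow" in text:
        return Skill.FOLLOW
    if any(word in text for word in ("grasp", "grab", "pick")):
        return Skill.GRASP
    return None  # command not understood


@dataclass
class AssistiveArm:
    """Holds the skill currently assigned to the voice-directed arm (illustrative only)."""
    selector: Callable[[str], Optional[Skill]] = keyword_selector
    active_skill: Optional[Skill] = None

    def handle_command(self, command: str) -> Optional[Skill]:
        # Switch skills only when the selector recognizes the command;
        # otherwise keep whatever the arm was already doing.
        skill = self.selector(command)
        if skill is not None:
            self.active_skill = skill
        return self.active_skill


if __name__ == "__main__":
    arm = AssistiveArm()
    for spoken in ("Mirror my left hand", "please grasp the cup"):
        print(spoken, "->", arm.handle_command(spoken))
```

Passing the selector in as a callable keeps the dispatch logic separate from the interpreter, so a call to a language model could replace the keyword matcher without changing AssistiveArm.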