TL-CStrans Net: a vision robot for table tennis player action recognition driven via CS-Transformer

Currently, the application of robotics technology in sports training and competitions is rapidly increasing. Traditional methods mainly rely on image or video data, neglecting the effective utilization of textual information. To address this issue, we propose: TL-CStrans Net: A vision robot for tabl...

Bibliographic Details
Main Authors: Libo Ma, Yan Tong
Format: Article
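Language: English
Published: Frontiers Media S.A. 2024-10-01
Series: Frontiers in Neurorobotics
Subjects: neural computing; computer vision; neuroscience; multi-modal robot; table tennis stroke recognition
Online Access: https://www.frontiersin.org/articles/10.3389/fnbot.2024.1443177/full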
author Libo Ma
Yan Tong
author_sort Libo Ma
collection DOAJ
description The application of robotics technology in sports training and competition is increasing rapidly. Traditional methods rely mainly on image or video data and neglect textual information. To address this, we propose TL-CStrans Net, a vision robot for table tennis player action recognition driven via CS-Transformer. It is a multimodal approach that combines CS-Transformer, CLIP, and transfer learning to integrate visual and textual information. First, we employ the CS-Transformer model as the neural computing backbone; it processes visual information extracted from table tennis game scenes, enabling accurate stroke recognition. Second, we introduce the CLIP model, which combines computer vision and natural language processing. CLIP jointly learns representations of images and text, thereby aligning the visual and textual modalities. Finally, to reduce training and computational requirements, we use transfer learning: pre-trained CS-Transformer and CLIP models, which have already acquired knowledge from relevant domains, are applied to table tennis stroke recognition. Experimental results demonstrate the strong performance of TL-CStrans Net in table tennis stroke recognition. This research promotes the application of multimodal robotics technology in sports and helps bridge the gap between neural computing, computer vision, and neuroscience.
format Article
id doaj-art-83c3ab32351e48e8a8b436e06d76d93d
institution OA Journals
issn 1662-5218
language English
publishDate 2024-10-01
publisher Frontiers Media S.A.
record_format Article
series Frontiers in Neurorobotics
spelling doaj-art-83c3ab32351e48e8a8b436e06d76d93d; 2025-08-20T01:50:45Z; eng; Frontiers Media S.A.; Frontiers in Neurorobotics; 1662-5218; 2024-10-01; 18; 10.3389/fnbot.2024.1443177; 1443177; TL-CStrans Net: a vision robot for table tennis player action recognition driven via CS-Transformer; Libo Ma (0); Yan Tong (1); Guangdong Polytechnic of Environmental Protection Engineering, Foshan, China; Hunan Labor and Human Resources Vocational College, Changsha, China; https://www.frontiersin.org/articles/10.3389/fnbot.2024.1443177/full; neural computing; computer vision; neuroscience; multi-modal robot; table tennis stroke recognition
title TL-CStrans Net: a vision robot for table tennis player action recognition driven via CS-Transformer
title_sort tl cstrans net a vision robot for table tennis player action recognition driven via cs transformer
topic neural computing
computer vision
neuroscience
multi-modal robot
table tennis stroke recognition
url https://www.frontiersin.org/articles/10.3389/fnbot.2024.1443177/full
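
The description in this record says that CLIP aligns the visual and textual modalities. As a rough illustration of what such alignment makes possible, the sketch below scores one image embedding against several text-prompt embeddings by cosine similarity and turns the scores into probabilities. It is not the authors' implementation: the stroke labels, the three-dimensional toy embeddings, the function names, and the temperature value are all assumptions made for this example, and no real CS-Transformer or CLIP model is loaded.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    a_n, b_n = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a_n, b_n))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify_stroke(image_embedding, text_embeddings, temperature=0.07):
    """Pick the text label whose embedding best matches the image embedding.

    `temperature` is an assumed scaling constant for this sketch; smaller
    values make the resulting distribution sharper.
    """
    labels = list(text_embeddings)
    logits = [
        cosine_similarity(image_embedding, text_embeddings[label]) / temperature
        for label in labels
    ]
    probs = softmax(logits)
    best = max(range(len(labels)), key=probs.__getitem__)
    return labels[best], dict(zip(labels, probs))


# Hypothetical toy embeddings standing in for encoder outputs.
# A real system would obtain these from pre-trained image and text encoders.
TEXT_EMBEDDINGS = {
    "forehand drive": [0.9, 0.1, 0.0],
    "backhand push": [0.1, 0.9, 0.1],
    "serve": [0.0, 0.2, 0.9],
}
IMAGE_EMBEDDING = [0.8, 0.2, 0.1]

label, probs = classify_stroke(IMAGE_EMBEDDING, TEXT_EMBEDDINGS)
print(label)
for name, p in probs.items():
    print(f"{name}: {p:.4f}")
```

In this toy setup the image vector points mostly along the same direction as the "forehand drive" prompt vector, so that label receives the highest probability. Under transfer learning as the description outlines it, the encoders producing such embeddings would be pre-trained and then applied to the stroke recognition task.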