TL-CStrans Net: a vision robot for table tennis player action recognition driven via CS-Transformer
Currently, the application of robotics technology in sports training and competitions is rapidly increasing. Traditional methods mainly rely on image or video data, neglecting the effective utilization of textual information. To address this issue, we propose: TL-CStrans Net: A vision robot for tabl...
| Main Authors: | Libo Ma, Yan Tong |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Frontiers Media S.A., 2024-10-01 |
| Series: | Frontiers in Neurorobotics |
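| Subjects: | neural computing; computer vision; neuroscience; multi-modal robot; table tennis stroke recognition |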
| Online Access: | https://www.frontiersin.org/articles/10.3389/fnbot.2024.1443177/full |
| _version_ | 1850275435720474624 |
|---|---|
| author | Libo Ma; Yan Tong |
| author_facet | Libo Ma; Yan Tong |
| author_sort | Libo Ma |
| collection | DOAJ |
| description | The application of robotics technology in sports training and competitions is increasing rapidly. Traditional methods rely mainly on image or video data and neglect textual information. To address this issue, we propose TL-CStrans Net, a vision robot for table tennis player action recognition driven via CS-Transformer. It is a multimodal approach that combines CS-Transformer, CLIP, and transfer learning to integrate visual and textual information. First, we employ the CS-Transformer model as the neural computing backbone; it processes visual information extracted from table tennis game scenes, enabling accurate stroke recognition. Second, we introduce the CLIP model, which combines computer vision and natural language processing. CLIP jointly learns representations of images and text, thereby aligning the visual and textual modalities. Finally, to reduce training and computational requirements, we apply transfer learning: pre-trained CS-Transformer and CLIP models, which have already acquired knowledge from relevant domains, are adapted to table tennis stroke recognition. Experimental results demonstrate the outstanding performance of TL-CStrans Net in table tennis stroke recognition. Our research contributes to promoting the application of multimodal robotics technology in sports and to bridging the gap between neural computing, computer vision, and neuroscience. |
| format | Article |
| id | doaj-art-83c3ab32351e48e8a8b436e06d76d93d |
| institution | OA Journals |
| issn | 1662-5218 |
| language | English |
| publishDate | 2024-10-01 |
| publisher | Frontiers Media S.A. |
| record_format | Article |
| series | Frontiers in Neurorobotics |
| spelling | doaj-art-83c3ab32351e48e8a8b436e06d76d93d; 2025-08-20T01:50:45Z; eng; Frontiers Media S.A.; Frontiers in Neurorobotics; 1662-5218; 2024-10-01; 18; 10.3389/fnbot.2024.1443177; 1443177; TL-CStrans Net: a vision robot for table tennis player action recognition driven via CS-Transformer; Libo Ma (0); Yan Tong (1); (0) Guangdong Polytechnic of Environmental Protection Engineering, Foshan, China; (1) Hunan Labor and Human Resources Vocational College, Changsha, China; Currently, the application of robotics technology in sports training and competitions is rapidly increasing. Traditional methods mainly rely on image or video data, neglecting the effective utilization of textual information. To address this issue, we propose: TL-CStrans Net: A vision robot for table tennis player action recognition driven via CS-Transformer. This is a multimodal approach that combines CS-Transformer, CLIP, and transfer learning techniques to effectively integrate visual and textual information. Firstly, we employ the CS-Transformer model as the neural computing backbone. By utilizing the CS-Transformer, we can effectively process visual information extracted from table tennis game scenes, enabling accurate stroke recognition. Then, we introduce the CLIP model, which combines computer vision and natural language processing. CLIP allows us to jointly learn representations of images and text, thereby aligning the visual and textual modalities. Finally, to reduce training and computational requirements, we leverage pre-trained CS-Transformer and CLIP models through transfer learning, which have already acquired knowledge from relevant domains, and apply them to table tennis stroke recognition tasks. Experimental results demonstrate the outstanding performance of TL-CStrans Net in table tennis stroke recognition. Our research is of significant importance in promoting the application of multimodal robotics technology in the field of sports and bridging the gap between neural computing, computer vision, and neuroscience.; https://www.frontiersin.org/articles/10.3389/fnbot.2024.1443177/full; neural computing; computer vision; neuroscience; multi-modal robot; table tennis stroke recognition |
| spellingShingle | Libo Ma; Yan Tong; TL-CStrans Net: a vision robot for table tennis player action recognition driven via CS-Transformer; Frontiers in Neurorobotics; neural computing; computer vision; neuroscience; multi-modal robot; table tennis stroke recognition |
| title | TL-CStrans Net: a vision robot for table tennis player action recognition driven via CS-Transformer |
| topic | neural computing; computer vision; neuroscience; multi-modal robot; table tennis stroke recognition |
| url | https://www.frontiersin.org/articles/10.3389/fnbot.2024.1443177/full |