Comparative Analysis of GPT-4 and LLaMA 3.2 Integration With Speech Processing Models for Enhancing Human–Robot Interaction and Motion Control in Real-World Applications

Bibliographic Details
Main Authors: Sheeba Uruj, Riddhi Goswami, Sujala D. Shetty, Kalaichelvi Venkatesan, Karthikeyan Ramanujam
Format: Article
Language: English
Published: IEEE 2025-01-01
Series: IEEE Access
Subjects: HRI; robot operating system (ROS); NLP; LLMs; speech-to-text (STT); text-to-speech (TTS)
Online Access: https://ieeexplore.ieee.org/document/11084769/
author Sheeba Uruj
Riddhi Goswami
Sujala D. Shetty
Kalaichelvi Venkatesan
Karthikeyan Ramanujam
collection DOAJ
description Human-Robot Interaction (HRI) is widely applied today in personal assistant robots, autonomous vehicles, healthcare support robots, industrial robots, and many other systems that receive interpreted commands to perform functions ranging from home automation to real-time assembly line operations. This paper presents a comparative study of Natural Language Processing (NLP) models built by combining advanced Large Language Models (LLMs) with speech processing technologies to create a more intuitive, adaptable, and accurate system for robots. By assessing performance metrics including accuracy, response time, and robustness, the paper identifies the best generalized model for real-world robotics applications. The framework combines natural language understanding and speech generation effectively, helping robots respond quickly to spoken requests even in dynamic environments.
format Article
id doaj-art-69dff1f44a894e0489a2aedf673831e3
institution Kabale University
issn 2169-3536
language English
publishDate 2025-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj-art-69dff1f44a894e0489a2aedf673831e3
2025-08-20T03:32:55Z
eng
IEEE
IEEE Access
2169-3536
2025-01-01
13
127170
127182
10.1109/ACCESS.2025.3590592
11084769
Comparative Analysis of GPT-4 and LLaMA 3.2 Integration With Speech Processing Models for Enhancing Human–Robot Interaction and Motion Control in Real-World Applications
Sheeba Uruj [0] https://orcid.org/0009-0009-9933-0133
Riddhi Goswami [1] https://orcid.org/0009-0009-8244-5858
Sujala D. Shetty [2]
Kalaichelvi Venkatesan [3] https://orcid.org/0000-0002-9144-6846
Karthikeyan Ramanujam [4] https://orcid.org/0000-0001-7401-7698
[0] Department of Computer Science, Birla Institute of Technology and Science, Pilani, Dubai Campus, Dubai International Academic City, Dubai, United Arab Emirates
[1] Department of Computer Science, Birla Institute of Technology and Science, Pilani, Dubai Campus, Dubai International Academic City, Dubai, United Arab Emirates
[2] Department of Computer Science, Birla Institute of Technology and Science, Pilani, Dubai Campus, Dubai International Academic City, Dubai, United Arab Emirates
[3] Department of Electrical and Electronics Engineering, Birla Institute of Technology and Science, Pilani, Dubai Campus, Dubai International Academic City, Dubai, United Arab Emirates
[4] Department of Mechanical Engineering, Birla Institute of Technology and Science, Pilani, Dubai Campus, Dubai International Academic City, Dubai, United Arab Emirates
https://ieeexplore.ieee.org/document/11084769/
HRI
robot operating system (ROS)
NLP
LLMs
speech-to-text (STT)
text-to-speech (TTS)
title Comparative Analysis of GPT-4 and LLaMA 3.2 Integration With Speech Processing Models for Enhancing Human–Robot Interaction and Motion Control in Real-World Applications
topic HRI
robot operating system (ROS)
NLP
LLMs
speech-to-text (STT)
text-to-speech (TTS)
url https://ieeexplore.ieee.org/document/11084769/