Indonesian Lip-Reading Detection and Recognition Based on Lip Shape Using Face Mesh and Long-Term Recurrent Convolutional Network

Communication through speech can be hindered by environmental noise, prompting the need for alternative methods such as lip reading, which bypasses auditory challenges. However, the accurate interpretation of lip movements is impeded by the uniqueness of individual lip shapes, necessitating detailed...

Bibliographic Details
Main Authors:	Aripin, Abas Setiawan
Format:	Article
Language:	English
Published:	Wiley 2024-01-01
Series:	Applied Computational Intelligence and Soft Computing
Online Access:	http://dx.doi.org/10.1155/2024/6479124

_version_	1849413881157386240
author	Aripin, Abas Setiawan
collection	DOAJ
description	Communication through speech can be hindered by environmental noise, prompting the need for alternative methods such as lip reading, which bypasses auditory challenges. However, accurate interpretation of lip movements is impeded by the uniqueness of individual lip shapes, which calls for detailed analysis. In addition, the development of an Indonesian dataset addresses the lack of diversity in existing datasets, which are predominantly in English, and fosters more inclusive research. This study proposes an enhanced lip-reading system trained with a long-term recurrent convolutional network (LRCN) that considers eight different types of lip shapes. MediaPipe Face Mesh detects the lip landmarks, enabling the LRCN model to recognize Indonesian utterances. Experimental results demonstrate the effectiveness of the approach: the LRCN model with three convolutional layers (LRCN-3Conv) achieves 95.42% accuracy on word test data and 95.63% on phrases, outperforming the convolutional long short-term memory (Conv-LSTM) method. Furthermore, evaluation on the original MIRACL-VC1 dataset produced a best accuracy of 90.67% with LRCN-3Conv for the word-labeled class, which compares favorably with previous studies. The success is attributed to MediaPipe Face Mesh, which facilitates accurate detection of the lip region. By combining deep learning techniques with precise landmark detection, these findings promise improved communication accessibility for individuals facing auditory challenges.
format	Article
id	doaj-art-700b0f006fac4137897f123181f8dfaf
institution	Kabale University
issn	1687-9732
language	English
publishDate	2024-01-01
publisher	Wiley
record_format	Article
series	Applied Computational Intelligence and Soft Computing
spelling	doaj-art-700b0f006fac4137897f123181f8dfaf; 2025-08-20T03:34:00Z; eng; Wiley; Applied Computational Intelligence and Soft Computing; 1687-9732; 2024-01-01; 2024; 10.1155/2024/6479124; Aripin [0], Abas Setiawan [1]; [0] Department of Biomedical Engineering, [1] Department of Computer Science; http://dx.doi.org/10.1155/2024/6479124
title	Indonesian Lip-Reading Detection and Recognition Based on Lip Shape Using Face Mesh and Long-Term Recurrent Convolutional Network
url	http://dx.doi.org/10.1155/2024/6479124
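
The description above attributes the reported accuracy to locating the lip region from MediaPipe Face Mesh landmarks before the frames reach the LRCN model. The record does not say how the crop is computed, so the following is only a minimal sketch of one plausible step, not the authors' method. It assumes the lip landmarks are already available as normalized (x, y) pairs in the range [0, 1], which is how Face Mesh reports coordinates. The function name `lip_bounding_box` and the `margin` parameter are hypothetical, and no MediaPipe call is made.

```python
from typing import Sequence, Tuple


def lip_bounding_box(
    landmarks: Sequence[Tuple[float, float]],
    frame_width: int,
    frame_height: int,
    margin: float = 0.0,
) -> Tuple[int, int, int, int]:
    """Return a (left, top, right, bottom) pixel box around lip landmarks.

    `landmarks` holds normalized (x, y) pairs in [0, 1].
    `margin` (hypothetical parameter) pads the box by a fraction of its size.
    `right` and `bottom` are exclusive, so frame[top:bottom, left:right]
    would be the crop on an array-like frame.
    """
    if not landmarks:
        raise ValueError("at least one landmark is required")

    xs = [x for x, _ in landmarks]
    ys = [y for _, y in landmarks]
    min_x, max_x = min(xs), max(xs)
    min_y, max_y = min(ys), max(ys)

    # Padding is proportional to the extent of the lips in each direction.
    pad_x = (max_x - min_x) * margin
    pad_y = (max_y - min_y) * margin

    # Convert to pixels and clamp the box to the frame boundaries.
    left = max(0, int((min_x - pad_x) * frame_width))
    top = max(0, int((min_y - pad_y) * frame_height))
    right = min(frame_width, int((max_x + pad_x) * frame_width) + 1)
    bottom = min(frame_height, int((max_y + pad_y) * frame_height) + 1)
    return left, top, right, bottom


# Illustrative landmark values only; they are not taken from the dataset.
example_lips = [(0.25, 0.5), (0.75, 0.5), (0.5, 0.375), (0.5, 0.625)]
box = lip_bounding_box(example_lips, frame_width=640, frame_height=480)
print(box)
```

In a pipeline of the kind the description outlines, a crop like this would be taken from every frame of an utterance, and the resulting sequence of lip images would be passed to the recurrent convolutional model.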

Similar Items