Urdu Lip Reading Systems for Digits in Controlled and Uncontrolled Environment

Lip reading technology can benefit many domains: it can enhance communication for people with hearing impairments, assist in noisy environments, and improve security through silent password input. Despite advancements in lip reading for several languages, there has been limited su...

Bibliographic Details
Main Authors:	Amanullah Baloch, Mushtaq Ali, Lal Hussain, Touseef Sadiq, Badr S. Alkahtani
Format:	Article
Language:	English
Published:	IEEE 2025-01-01
Series:	IEEE Access
Subjects:	Lip reading system; Urdu dataset; deep learning model; data augmentation; lips extraction
Online Access:	https://ieeexplore.ieee.org/document/10845760/

collection	DOAJ
description	Lip reading technology can benefit many domains: it can enhance communication for people with hearing impairments, assist in noisy environments, and improve security through silent password input. Despite advancements in lip reading for several languages, there has been limited success in developing an effective model for Urdu lip reading, owing to the lack of an appropriate dataset and the challenges faced by earlier models, such as the unsuccessful adaptation of the LipNet model to Urdu. To address these issues, we introduce the ULRD dataset, employ diverse data augmentation techniques, and compare three DNN models: a Hybrid 2D-3D CNN-LSTM model, a LipNet-based 2D CNN-LSTM model, and a baseline 3D CNN-GRU model. Each model is evaluated in both controlled and uncontrolled environments, on both seen and unseen data. Results indicate that the LipNet-based 2D CNN-LSTM model achieves a high overall accuracy of 92.15% across all conditions, while the Hybrid model generalizes well, reaching an overall accuracy of 90.00% on unseen data owing to its enhanced spatiotemporal feature extraction capability. The precision (0.91), recall (0.91), and F1-score (0.91) of the LipNet-based 2D CNN-LSTM model are also higher than those of the competing models. The remaining findings highlight the effectiveness of different DNN architectures and the potential improvements offered by the ULRD dataset for Urdu lip reading research.
id	doaj-art-35d97a9819d148ceb0965f1f4921764b
institution	Kabale University
issn	2169-3536
spelling	doaj-art-35d97a9819d148ceb0965f1f4921764b; 2025-02-12T00:02:55Z; eng; IEEE; IEEE Access; ISSN 2169-3536; 2025-01-01; vol. 13, pp. 24906-24927; DOI 10.1109/ACCESS.2025.3531640; document 10845760
	Urdu Lip Reading Systems for Digits in Controlled and Uncontrolled Environment
	Amanullah Baloch: Department of Computer Science and Information Technology, Hazara University Mansehra, Mansehra, Pakistan
	Mushtaq Ali (https://orcid.org/0000-0002-3697-9498): Department of Computer Science and Information Technology, Hazara University Mansehra, Mansehra, Pakistan
	Lal Hussain (https://orcid.org/0000-0003-1103-4938): Department of Computer Science and IT, Neelum Campus, The University of Azad Jammu and Kashmir, Athmuqam, Pakistan
	Touseef Sadiq (https://orcid.org/0000-0001-6603-3639): Department of Information and Communication Technology, Centre for Artificial Intelligence Research (CAIR), University of Agder, Grimstad, Norway
	Badr S. Alkahtani: Department of Mathematics, King Saud University, Riyadh, Saudi Arabia

Similar Items