Urdu Lip Reading Systems for Digits in Controlled and Uncontrolled Environment

Lip reading technology can benefit many domains: it can improve communication for people with hearing impairments, assist in noisy environments, and strengthen security through silent password input. Despite advancements in lip reading for several languages, there has been limited su...

Bibliographic Details
Main Authors: Amanullah Baloch, Mushtaq Ali, Lal Hussain, Touseef Sadiq, Badr S. Alkahtani
Format: Article
Language: English
Published: IEEE 2025-01-01
Series: IEEE Access
Subjects: Lip reading system; Urdu dataset; deep learning model; data augmentation; lips extraction
Online Access: https://ieeexplore.ieee.org/document/10845760/
author Amanullah Baloch
Mushtaq Ali
Lal Hussain
Touseef Sadiq
Badr S. Alkahtani
collection DOAJ
description Lip reading technology can benefit many domains: it can improve communication for people with hearing impairments, assist in noisy environments, and strengthen security through silent password input. Despite advances in lip reading for several languages, there has been limited success in developing an effective model for Urdu lip reading, owing to the lack of an appropriate dataset and the difficulties encountered by earlier models, such as the unsuccessful adaptation of the LipNet model to Urdu. To address these issues, we introduce the ULRD dataset, apply diverse data augmentation techniques, and compare three DNN models: a Hybrid 2D-3D CNN-LSTM model, a LipNet-based 2D CNN-LSTM model, and a baseline 3D CNN-GRU model. Each model is evaluated in both controlled and uncontrolled environments, on both seen and unseen data. Results indicate that the LipNet-based 2D CNN-LSTM model achieves a high overall accuracy of 92.15% across all conditions, while the Hybrid model generalizes well, reaching an overall accuracy of 90.00% on unseen data owing to its enhanced spatiotemporal feature extraction capability. The precision (0.91), recall (0.91), and F1-score (0.91) of the LipNet-based 2D CNN-LSTM model are also higher than those of the competing models. The remaining findings highlight the effectiveness of different DNN architectures and the potential improvements that the ULRD dataset offers for Urdu lip reading research.
format Article
id doaj-art-35d97a9819d148ceb0965f1f4921764b
institution Kabale University
issn 2169-3536
language English
publishDate 2025-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj-art-35d97a9819d148ceb0965f1f4921764b
2025-02-12T00:02:55Z
eng
IEEE
IEEE Access
2169-3536
2025-01-01
Volume 13, pages 24906-24927
DOI: 10.1109/ACCESS.2025.3531640
10845760
Urdu Lip Reading Systems for Digits in Controlled and Uncontrolled Environment
Amanullah Baloch (0): Department of Computer Science and Information Technology, Hazara University Mansehra, Mansehra, Pakistan
Mushtaq Ali (1), https://orcid.org/0000-0002-3697-9498: Department of Computer Science and Information Technology, Hazara University Mansehra, Mansehra, Pakistan
Lal Hussain (2), https://orcid.org/0000-0003-1103-4938: Department of Computer Science and IT, Neelum Campus, The University of Azad Jammu and Kashmir, Athmuqam, Pakistan
Touseef Sadiq (3), https://orcid.org/0000-0001-6603-3639: Department of Information and Communication Technology, Centre for Artificial Intelligence Research (CAIR), University of Agder, Grimstad, Norway
Badr S. Alkahtani (4): Department of Mathematics, King Saud University, Riyadh, Saudi Arabia
title Urdu Lip Reading Systems for Digits in Controlled and Uncontrolled Environment
topic Lip reading system
Urdu dataset
deep learning model
data augmentation
lips extraction
url https://ieeexplore.ieee.org/document/10845760/