Sign Language Recognition—Dataset Cleaning for Robust Word Classification in a Landmark-Based Approach

Bibliographic Details
Main Authors: Pawel Antonowicz, David Kasperek, Michal Podpora
Format: Article
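Language: English
Published: IEEE 2025-01-01
Series: IEEE Access
Subjects: Sign language recognition; conversational system; deep learning; recurrent neural networks; LSTM; dataset cleaning
Online Access: https://ieeexplore.ieee.org/document/10981774/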
collection DOAJ
description Communication barriers between hard-of-hearing and hearing individuals can be mitigated through advancements in sign language recognition (SLR) systems. These SLR systems can also improve the user experience of hard-of-hearing people when interacting with conversational systems that could emerge in the near future. This work explores a landmark-based approach for word classification within an SLR system. The study investigates the impact of a novel data-cleaning methodology on model performance during training. Specifically, a data cleaning process focused on video trimming and sign placement correction is shown to significantly improve dataset quality, resulting in more accurate classification. This cleaner data not only facilitated a more stable training process for the RNN model but also effectively delayed the onset of overfitting compared to a model trained on the original data. The findings highlight the critical role of data quality, particularly when dealing with the limitations inherent to small datasets commonly encountered in SLR tasks. The contribution of this study lies in demonstrating how targeted data cleaning enhances model stability and performance in resource-limited SLR systems.
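The abstract names video trimming and sign placement correction but does not describe how either step is carried out. As a rough illustration only, the sketch below assumes each clip is already a NumPy array of shape (frames, landmarks, 2), trims near-static frames from both ends, and shifts each frame so a chosen anchor landmark sits at the origin. The function names, the motion threshold, and the choice of anchor landmark are hypothetical and are not taken from the paper.

```python
import numpy as np


def trim_idle_frames(seq, motion_threshold=0.01):
    """Drop leading/trailing frames with little landmark motion.

    seq: array of shape (T, K, 2) with K (x, y) landmarks per frame.
    motion_threshold is an assumed value, not one reported by the authors.
    """
    seq = np.asarray(seq, dtype=float)
    if len(seq) < 2:
        return seq
    # Mean landmark displacement between consecutive frames, shape (T-1,).
    motion = np.linalg.norm(np.diff(seq, axis=0), axis=-1).mean(axis=-1)
    active = np.flatnonzero(motion > motion_threshold)
    if active.size == 0:
        return seq  # nothing moves; leave the clip untouched
    # motion[i] is the step from frame i to i + 1, so keep both end frames.
    return seq[active[0]: active[-1] + 2]


def recenter(seq, anchor_index=0):
    """Shift every frame so the anchor landmark sits at the origin."""
    seq = np.asarray(seq, dtype=float)
    return seq - seq[:, anchor_index:anchor_index + 1, :]


# Toy clip: 10 frames, 2 landmarks, offset from the origin by 0.5.
clip = np.full((10, 2, 2), 0.5)
clip[3:7, 1, 0] += [0.1, 0.2, 0.3, 0.4]  # landmark 1 moves in frames 3..6
clip[7:, 1, 0] += 0.4                    # then holds its final position
trimmed = trim_idle_frames(clip)
centered = recenter(trimmed)
print(trimmed.shape)
```

On the toy clip, only the frames around the moving segment survive the trim, and the anchor landmark is at (0, 0) in every remaining frame. A real pipeline would need a threshold and anchor suited to its landmark extractor and coordinate scale.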
id doaj-art-19e8175ebff6453cbc93814347e7aa46
institution OA Journals
issn 2169-3536
spelling doaj-art-19e8175ebff6453cbc93814347e7aa46
2025-08-20T02:31:02Z
eng
IEEE
IEEE Access
2169-3536
2025-01-01
13
81877
81888
10.1109/ACCESS.2025.3566338
10981774
Sign Language Recognition—Dataset Cleaning for Robust Word Classification in a Landmark-Based Approach
Pawel Antonowicz (0) https://orcid.org/0000-0002-7405-8745
David Kasperek (1) https://orcid.org/0000-0001-5659-0933
Michal Podpora (2) https://orcid.org/0000-0002-1080-6767
Department of Computer Science, Opole University of Technology, Opole, Poland
Department of Computer Science, Opole University of Technology, Opole, Poland
Institute of Computer Science, University of Opole, Opole, Poland
https://ieeexplore.ieee.org/document/10981774/
Sign language recognition
conversational system
deep learning
recurrent neural networks
LSTM
dataset cleaning
topic Sign language recognition
conversational system
deep learning
recurrent neural networks
LSTM
dataset cleaning