Scene Text Detection and Recognition Using Maximally Stable Extremal Region

In recent years, scene text detection and recognition have become important research areas in computer vision and machine learning. Traditional text detection and recognition methods may struggle with detecting and recognizing text in images with low resolution, complex backgrounds, and varying fon...

Bibliographic Details
Main Authors: Golda Jeyasheeli P, Athinarayanan B, Manish T, Mohamad Umar M
Format: Article
Language: English
Published: Yayasan Pendidikan Riset dan Pengembangan Intelektual (YRPI) 2024-12-01
Series: Journal of Applied Engineering and Technological Science
Subjects:
Online Access: https://journal.yrpipku.com/index.php/jaets/article/view/5958
collection DOAJ
description In recent years, scene text detection and recognition have become important research areas in computer vision and machine learning. Traditional methods can struggle to detect and recognize text in images with low resolution, complex backgrounds, and varying font sizes. The proposed methodology addresses these challenges by combining multiple algorithms with deep learning techniques. In this paper, we propose a method for scene text detection based on Maximally Stable Extremal Regions (MSER) combined with the Stroke Width Transform (SWT), and for recognition using a Convolutional Recurrent Neural Network (CRNN). Our method consists of two stages: text detection and text recognition. To detect text, we use MSER and SWT to extract candidate text regions from the input image, and then eliminate non-text regions using image-to-image translation. To recognize text, a CRNN reads the text in the detected regions. The CRNN architecture consists of convolutional and recurrent layers, which capture both the spatial and the sequential (temporal) features of the text. The methodology is evaluated on several benchmark datasets and achieves an accuracy of 96%, a good result when compared with existing methods.
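As a rough illustration of the two stages named in the description, the sketch below filters candidate regions by stroke-width consistency (the intuition behind SWT) and then turns per-frame recognizer labels into a string. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the `max_cv` threshold, the sample data, and the use of greedy CTC-style decoding for the CRNN output are all assumptions that do not appear in the record.

```python
from statistics import mean, pstdev

BLANK = 0  # hypothetical index of the blank label (assumption, not from the paper)


def is_text_like(stroke_widths, max_cv=0.5):
    """Keep a candidate region if its stroke widths are roughly uniform.

    max_cv is an illustrative threshold on the coefficient of variation
    (standard deviation divided by mean); it is not a value from the paper.
    """
    if not stroke_widths:
        return False
    avg = mean(stroke_widths)
    if avg <= 0:
        return False
    return pstdev(stroke_widths) / avg <= max_cv


def ctc_greedy_decode(frame_labels, alphabet):
    """Collapse consecutive repeated labels, then drop blanks."""
    chars = []
    previous = BLANK
    for label in frame_labels:
        if label != previous and label != BLANK:
            # Label k maps to alphabet[k - 1] because 0 is reserved for blank.
            chars.append(alphabet[label - 1])
        previous = label
    return "".join(chars)


# Hypothetical candidates: stroke widths sampled inside each region.
candidates = {
    "sign": [4, 5, 4, 5, 4, 4],        # uniform strokes, likely text
    "foliage": [1, 9, 2, 14, 3, 20],   # erratic widths, likely clutter
}
kept = [name for name, widths in candidates.items() if is_text_like(widths)]

# Hypothetical per-frame argmax labels from a recognizer, 0 = blank.
alphabet = "abcdefghijklmnopqrstuvwxyz"
frames = [19, 19, 0, 20, 0, 15, 15, 0, 16]
word = ctc_greedy_decode(frames, alphabet)
```

In a full pipeline the stroke widths would come from an SWT pass over MSER candidates and the frame labels from the recurrent layers of the recognizer; both are replaced here by hand-written lists so the example stays self-contained.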
id doaj-art-6424d948a08d47e2a0aced4bbce1f3ad
institution OA Journals
issn 2715-6087
2715-6079
doi 10.37385/jaets.v6i1.5958
volume 6
issue 1
timestamp 2025-08-20T01:58:03Z
affiliation Department of Computer Science Engineering, Mepco Schlenk Engineering College, Sivakasi, Tamilnadu, India (all four authors)
topic MSER
SWT
Text Detection
Text Recognition
Deep Learning
CRNN