Acceleration of Urdu Optical Character Recognition on Zynq UltraScale+ MPSoC Using Deep Convolutional Neural Network

Bibliographic Details
Main Authors: Fauzia Yasir, Majida Kazmi
Format: Article
Language: English
Published: IEEE 2025-01-01
Series: IEEE Access
Subjects: Computer vision; convolution neural network; edge computing; FPGA; low power; energy efficiency
Online Access: https://ieeexplore.ieee.org/document/11098840/
collection DOAJ
description Deploying deep learning–based optical character recognition (OCR) systems for low-resource, complex-script languages like Urdu remains a major challenge due to high computational costs, lack of annotated datasets, and limited hardware support for real-time applications. Existing FPGA-based OCR implementations have primarily focused on simplified datasets such as MNIST digits, limiting their generalizability to scripts like Urdu that exhibit extensive intra-class variability, contextual shaping, and diacritics. This study presents a hardware-accelerated Urdu OCR framework using a custom-designed Convolutional Neural Network (CNN) optimized for deployment on the Xilinx Zynq UltraScale+ MPSoC (ZCU104). The proposed CNN is trained on a novel large-scale dataset of 336,000 labeled images spanning 48 Urdu characters across 230 font styles. Compared to MNIST-based FPGA implementations, our approach addresses significantly higher script complexity while achieving a classification accuracy of 96.73% (FP32) and 94.06% (INT8). Hardware-aware quantization and deployment using the Vitis AI toolchain enabled 75% model compression with minimal accuracy loss, achieving real-time inference of 0.189 ms per character and 4,886.95 FPS, while consuming only 1.32 W. Benchmarking against CPU and GPU platforms confirmed substantial improvements in speed and energy efficiency. This work establishes a high-performance, scalable, and energy-efficient FPGA-based OCR framework for Urdu and sets the foundation for extending such solutions to other cursive, low-resource languages like Arabic, Pashto, and Persian.
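The figures quoted in the description can be cross-checked with a little arithmetic. The sketch below is illustrative only and is not taken from the article: it assumes the reported 75% compression reflects the change from 32-bit to 8-bit parameters, and that 1.32 W is the average power drawn while sustaining the reported throughput. The function names are hypothetical.

```python
# Figures quoted in the description above.
FP32_BITS = 32            # bit width of the floating-point model
INT8_BITS = 8             # bit width after quantization
ACC_FP32 = 96.73          # classification accuracy, percent
ACC_INT8 = 94.06          # classification accuracy, percent
THROUGHPUT_FPS = 4886.95  # frames (characters) per second
POWER_W = 1.32            # reported power consumption, watts


def compression_from_bit_width(orig_bits: int, quant_bits: int) -> float:
    """Fraction of storage saved when every parameter shrinks from
    orig_bits to quant_bits (assumption: parameters dominate model size)."""
    return 1.0 - quant_bits / orig_bits


def energy_per_frame_mj(power_w: float, fps: float) -> float:
    """Energy per classified frame in millijoules (assumption: power_w is
    the average power while running at fps)."""
    return power_w / fps * 1000.0


def frame_period_ms(fps: float) -> float:
    """Average time between completed frames, in milliseconds."""
    return 1000.0 / fps


compression = compression_from_bit_width(FP32_BITS, INT8_BITS)
accuracy_drop = ACC_FP32 - ACC_INT8
energy_mj = energy_per_frame_mj(POWER_W, THROUGHPUT_FPS)
period_ms = frame_period_ms(THROUGHPUT_FPS)

print(f"compression from bit width: {compression:.0%}")
print(f"accuracy drop (FP32 -> INT8): {accuracy_drop:.2f} percentage points")
print(f"energy per frame: {energy_mj:.3f} mJ")
print(f"frame period implied by throughput: {period_ms:.4f} ms")
```

Under these assumptions, the 32-bit to 8-bit change alone accounts for the stated 75% compression, the quantization cost is 2.67 percentage points of accuracy, and each character costs roughly 0.27 mJ. The frame period implied by the throughput (about 0.205 ms) is not the reciprocal of the quoted 0.189 ms latency; the description does not say how each figure was measured, so the two may cover different parts of the pipeline.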
id doaj-art-846f68e910064324b788e5e739cc09b3
institution Kabale University
issn 2169-3536
last_indexed 2025-08-20T03:41:11Z
volume 13
pages 135538-135557
doi 10.1109/ACCESS.2025.3593294
article_number 11098840
author_orcid Fauzia Yasir https://orcid.org/0009-0004-2921-9046
Majida Kazmi https://orcid.org/0000-0002-2767-3139
author_affiliation Faculty of Electrical and Computer Engineering, NED University of Engineering and Technology, Karachi, Pakistan (both authors)
topic Computer vision
convolution neural network
edge computing
FPGA
low power
energy efficiency