Printed Persian Subword Recognition Using Wavelet Packet Descriptors
In this paper, we present a new approach to offline OCR (optical character recognition) for printed Persian subwords using wavelet packet transform. The proposed algorithm is used to extract font invariant and size invariant features from 87804 subwords of 4 fonts and 3 sizes. The feature vectors a...
Main Authors: | Samira Nasrollahi, Afshin Ebrahimi |
---|---|
Format: | Article |
Language: | English |
Published: | Wiley, 2013-01-01 |
Series: | Journal of Engineering |
Online Access: | http://dx.doi.org/10.1155/2013/465469 |
_version_ | 1832565348960829440 |
---|---|
author | Samira Nasrollahi, Afshin Ebrahimi |
author_facet | Samira Nasrollahi, Afshin Ebrahimi |
author_sort | Samira Nasrollahi |
collection | DOAJ |
description | This paper presents a new approach to offline optical character recognition (OCR) for printed Persian subwords using the wavelet packet transform. The proposed algorithm extracts font-invariant and size-invariant features from 87804 subwords in 4 fonts and 3 sizes. The feature vectors are compressed using PCA. The compressed feature vectors yield a pictorial dictionary in which each entry is the mean of a group consisting of the same subword in 4 fonts and 3 sizes. These features are combined with dot features to recognize printed Persian subwords. To evaluate the feature extraction, the algorithm was tested on a set of 2000 subwords from printed Persian text documents, where it achieved a recognition rate of 97.9% at the subword level. |
format | Article |
id | doaj-art-02eebe484f164c859fcbf87781ca2c82 |
institution | Kabale University |
issn | 2314-4904 2314-4912 |
language | English |
publishDate | 2013-01-01 |
publisher | Wiley |
record_format | Article |
series | Journal of Engineering |
spelling | doaj-art-02eebe484f164c859fcbf87781ca2c82; 2025-02-03T01:08:00Z; eng; Wiley; Journal of Engineering; 2314-4904; 2314-4912; 2013-01-01; 2013; 10.1155/2013/465469; 465469; Printed Persian Subword Recognition Using Wavelet Packet Descriptors; Samira Nasrollahi (0); Afshin Ebrahimi (1); Faculty of Electrical Engineering, Sahand University of Technology, Tabriz, Iran; Faculty of Electrical Engineering, Sahand University of Technology, Tabriz, Iran; This paper presents a new approach to offline optical character recognition (OCR) for printed Persian subwords using the wavelet packet transform. The proposed algorithm extracts font-invariant and size-invariant features from 87804 subwords in 4 fonts and 3 sizes. The feature vectors are compressed using PCA. The compressed feature vectors yield a pictorial dictionary in which each entry is the mean of a group consisting of the same subword in 4 fonts and 3 sizes. These features are combined with dot features to recognize printed Persian subwords. To evaluate the feature extraction, the algorithm was tested on a set of 2000 subwords from printed Persian text documents, where it achieved a recognition rate of 97.9% at the subword level.; http://dx.doi.org/10.1155/2013/465469 |
spellingShingle | Samira Nasrollahi Afshin Ebrahimi Printed Persian Subword Recognition Using Wavelet Packet Descriptors Journal of Engineering |
title | Printed Persian Subword Recognition Using Wavelet Packet Descriptors |
title_sort | printed persian subword recognition using wavelet packet descriptors |
url | http://dx.doi.org/10.1155/2013/465469 |
work_keys_str_mv | AT samiranasrollahi printedpersiansubwordrecognitionusingwaveletpacketdescriptors AT afshinebrahimi printedpersiansubwordrecognitionusingwaveletpacketdescriptors |
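
The description field above outlines a dictionary in which each entry is the mean feature vector of one subword across its font and size variants. A minimal sketch of that idea follows, assuming the wavelet packet and PCA steps have already produced fixed-length feature vectors. The function names, the toy vectors, and the Euclidean nearest-entry rule are illustrative assumptions, not details taken from this record, and the dot features are not modeled.

```python
from collections import defaultdict
from math import dist


def build_dictionary(samples):
    """Average the feature vectors of each subword label.

    samples: iterable of (label, feature_vector) pairs, where every
    vector has the same length (hypothetical, already-compressed features).
    """
    groups = defaultdict(list)
    for label, vector in samples:
        groups[label].append(vector)
    # Each dictionary entry is the component-wise mean of its group.
    return {
        label: [sum(column) / len(vectors) for column in zip(*vectors)]
        for label, vectors in groups.items()
    }


def recognize(dictionary, vector):
    """Return the label whose mean entry is closest (Euclidean) to vector."""
    return min(dictionary, key=lambda label: dist(dictionary[label], vector))


# Toy data: two subword classes, each seen in two font/size variants.
samples = [
    ("subword_a", [0.0, 1.0]),
    ("subword_a", [0.2, 0.8]),
    ("subword_b", [1.0, 0.0]),
    ("subword_b", [0.8, 0.2]),
]
dictionary = build_dictionary(samples)
predicted = recognize(dictionary, [0.1, 0.7])
```

Storing one mean per subword keeps the dictionary small, since lookup cost grows with the number of distinct subwords rather than with the number of font and size variants.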