VIETNAMESE TEXT EXTRACTION FROM BOOK COVERS

Automatic information extraction from images reduces cost and human intervention and speeds up processing. Converting printed book covers to readable text for later automated processing would be useful for a wide range of users, such as librarians, bookshop keepers, and individual users. In this paper,...

Bibliographic Details
Main Authors: Phan Thị Thanh Nga, Nguyễn Thị Huyền Trang, Nguyễn Văn Phúc, Thái Duy Quý, Võ Phương Bình
Format: Article
Language: English
Published: Dalat University 2017-06-01
Series: Tạp chí Khoa học Đại học Đà Lạt
Online Access: http://tckh.dlu.edu.vn/index.php/tckhdhdl/article/view/234
author Phan Thị Thanh Nga
Nguyễn Thị Huyền Trang
Nguyễn Văn Phúc
Thái Duy Quý
Võ Phương Bình
collection DOAJ
description Automatic information extraction from images reduces cost and human intervention and speeds up processing. Converting printed book covers to readable text for later automated processing would be useful for a wide range of users, such as librarians, bookshop keepers, and individual users. In this paper, we present a novel method for extracting Vietnamese text from images of scanned book covers. The proposed system accepts a snapshot of a book cover, filters the input image to enhance its quality, locates the regions containing text, and then uses an optical character recognizer (OCR) to extract the text. The last step filters the extracted text against a dictionary to produce the final text. Experiments with the proposed system on our dataset delivered encouraging results.
id doaj-art-741e55c330cc4724924028a4902461a3
institution Kabale University
issn 0866-787X
spelling doaj-art-741e55c330cc4724924028a4902461a3; 2025-02-02T20:09:39Z; eng; Dalat University; Tạp chí Khoa học Đại học Đà Lạt; ISSN 0866-787X; 2017-06-01; vol. 7, no. 2, pp. 142-152; DOI 10.37569/DalatUniversity.7.2.234(2017); 129
VIETNAMESE TEXT EXTRACTION FROM BOOK COVERS
Phan Thị Thanh Nga (Faculty of Information Technology, Dalat University)
Nguyễn Thị Huyền Trang (Faculty of Information Technology, Dalat University)
Nguyễn Văn Phúc (Devsoft Company)
Thái Duy Quý (The Research Management and International Cooperation Department, Dalat University)
Võ Phương Bình (Faculty of Information Technology, Dalat University)
Keywords: book cover; OCR (optical character recognition); text information extraction; Vietnamese text detection
title VIETNAMESE TEXT EXTRACTION FROM BOOK COVERS
topic book cover
OCR (optical character recognition)
text information extraction
Vietnamese text detection
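
The last step named in the description, filtering the OCR output against a dictionary, might look roughly like the sketch below. The record does not say how the authors match words, so this substitutes fuzzy matching from Python's standard difflib module. The word list, the function names, and the cutoff value are all hypothetical and chosen only for illustration.

```python
import difflib
import unicodedata

# Hypothetical mini-dictionary; a real system would load a full Vietnamese word list.
DICTIONARY = {"sách", "giáo", "khoa", "toán", "học", "văn", "tiếng", "việt"}


def normalize(token: str) -> str:
    """Lowercase and NFC-normalize so composed and decomposed diacritics compare equal."""
    return unicodedata.normalize("NFC", token).lower()


def filter_tokens(tokens, dictionary=DICTIONARY, cutoff=0.6):
    """Keep dictionary words, replace near misses with the closest entry, drop the rest.

    This is an assumed stand-in for the paper's dictionary filter, not its actual method.
    """
    vocab = sorted(normalize(word) for word in dictionary)
    result = []
    for token in tokens:
        word = normalize(token)
        if word in vocab:
            result.append(word)
            continue
        # difflib ranks candidates by SequenceMatcher similarity ratio.
        match = difflib.get_close_matches(word, vocab, n=1, cutoff=cutoff)
        if match:
            result.append(match[0])
    return result
```

With this word list, an OCR token that lost its diacritic (such as "hoc") would be mapped to the nearest dictionary entry, while noise tokens with no close match are discarded. A higher cutoff trades recall for fewer false corrections.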