Keywords spotting in Russian handwritten documents based on strokes segmentation

Keyword spotting in handwritten documents is defined as follows: a user enters text that needs to be found in a corpus of handwritten documents. Solving this task can significantly simplify work with archived data. We propose a two-stage algorithm for this problem. The first stage involves cla...

Bibliographic Details
Main Authors: D. D. Feoktistov, L. M. Mestetskiy
Format: Article
Language: English
Published: Copernicus Publications 2024-12-01
Series: The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences
Online Access: https://isprs-archives.copernicus.org/articles/XLVIII-2-W5-2024/49/2024/isprs-archives-XLVIII-2-W5-2024-49-2024.pdf
author D. D. Feoktistov
L. M. Mestetskiy
collection DOAJ
description Keyword spotting in handwritten documents is defined as follows: a user enters text that needs to be found in a corpus of handwritten documents. Solving this task can significantly simplify work with archived data. We propose a two-stage algorithm for this problem. The first stage classifies the strokes, which are the main elements of handwriting. For this, we propose a similarity measure based on a Fourier descriptor for elements of the stroke representation. The second stage matches the query against the text, using an algorithm based on optimal string alignment distance. To demonstrate the results and tune the parameters of the algorithm, we use images of works completed during the "Total Dictation" exam.
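The record does not give the details of the second-stage matcher, so the following is only a minimal sketch of the textbook optimal string alignment (restricted Damerau-Levenshtein) distance that the description names, not the authors' implementation. The function names `osa_distance` and `best_match`, the fixed-width sliding window, and the use of plain character strings in place of classified stroke labels are all assumptions made for illustration.

```python
def osa_distance(a, b):
    """Textbook optimal string alignment distance between sequences a and b.

    Allowed edits: insertion, deletion, substitution, and transposition of
    two adjacent symbols (each substring may be edited at most once).
    """
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all i leading symbols of a
    for j in range(cols):
        d[0][j] = j  # insert all j leading symbols of b
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # match or substitution
            )
            # Adjacent transposition, e.g. "ab" -> "ba".
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[rows - 1][cols - 1]


def best_match(query, text):
    """Hypothetical spotting helper: slide a window of len(query) over text
    and return (start, distance) of the closest window (earliest on ties)."""
    width = len(query)
    last_start = max(1, len(text) - width + 1)
    candidates = (
        (osa_distance(query, text[s:s + width]), s) for s in range(last_start)
    )
    dist, start = min(candidates)
    return start, dist


if __name__ == "__main__":
    print(osa_distance("ab", "ba"))
    print(best_match("abc", "xxacbxx"))
```

In a stroke-based setting, the query and the document line would presumably be sequences of stroke class labels rather than characters; the dynamic program is unchanged because it only compares symbols for equality.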
format Article
id doaj-art-173d21efbec947e38f65b36aeb983f78
institution OA Journals
issn 1682-1750
2194-9034
language English
publishDate 2024-12-01
publisher Copernicus Publications
record_format Article
series The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences
spelling doaj-art-173d21efbec947e38f65b36aeb983f78
2025-08-20T02:36:35Z
eng
Copernicus Publications
The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences
1682-1750
2194-9034
2024-12-01
XLVIII-2-W5-2024
49
54
10.5194/isprs-archives-XLVIII-2-W5-2024-49-2024
Keywords spotting in Russian handwritten documents based on strokes segmentation
D. D. Feoktistov [0]
L. M. Mestetskiy [1]
L. M. Mestetskiy [2]
[0] Lomonosov Moscow State University, Faculty of Computational Mathematics and Cybernetics, 119991 Moscow, Russia
[1] Lomonosov Moscow State University, Faculty of Computational Mathematics and Cybernetics, 119991 Moscow, Russia
[2] National Research University Higher School of Economics (HSE University), 109028 Moscow, Russia
title Keywords spotting in Russian handwritten documents based on strokes segmentation
url https://isprs-archives.copernicus.org/articles/XLVIII-2-W5-2024/49/2024/isprs-archives-XLVIII-2-W5-2024-49-2024.pdf