Printed document layout analysis and optical character recognition system based on deep learning
Abstract This paper proposes a layout analysis and text recognition system for printed documents based on deep learning. Initially, scanned documents or image files are processed using a layout analysis algorithm based on YOLOv4 and YOLOv8 deep learning to identify the positions of titles, text para...
| Main Authors: | Dong-Lin Li, Shih-Kai Lee, Yin-Ting Liu |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Nature Portfolio, 2025-07-01 |
| Series: | Scientific Reports |
| Subjects: | OCR; Layout analysis; CNN; YOLO; Deep learning |
| Online Access: | https://doi.org/10.1038/s41598-025-07439-y |

| Field | Value |
|---|---|
| author | Dong-Lin Li; Shih-Kai Lee; Yin-Ting Liu |
| collection | DOAJ |
| description | This paper proposes a deep-learning-based layout analysis and text recognition system for printed documents. First, scanned documents or image files are processed by a layout analysis algorithm built on the YOLOv4 and YOLOv8 object detectors, which locates the titles, text paragraphs, tables, and images in the document. Each category then undergoes its own character segmentation processing, and the content is recognized by a text recognition algorithm based on Convolutional Neural Networks (CNN). Finally, the recognized text is integrated and output in editable formats, such as JSON or Microsoft formats. The proposed method enables convenient, fast, and highly accurate OCR processing on a local computer. |
| format | Article |
| id | doaj-art-abfed52ebf9c4ae6a2b0e64ab4341edd |
| institution | Kabale University |
| issn | 2045-2322 |
| language | English |
| publishDate | 2025-07-01 |
| publisher | Nature Portfolio |
| record_format | Article |
| series | Scientific Reports |
| spelling | doaj-art-abfed52ebf9c4ae6a2b0e64ab4341edd; 2025-08-20T03:45:30Z; eng; Nature Portfolio; Scientific Reports; 2045-2322; 2025-07-01; 15; 1; 1; 15; 10.1038/s41598-025-07439-y; Printed document layout analysis and optical character recognition system based on deep learning; Dong-Lin Li (0), Shih-Kai Lee (1), Yin-Ting Liu (2); Department of electrical engineering, National Taiwan Ocean University (all three authors); https://doi.org/10.1038/s41598-025-07439-y; OCR; Layout analysis; CNN; YOLO; Deep learning |
| title | Printed document layout analysis and optical character recognition system based on deep learning |
| topic | OCR; Layout analysis; CNN; YOLO; Deep learning |
| url | https://doi.org/10.1038/s41598-025-07439-y |
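
The abstract describes a four-stage flow: layout detection, per-category processing, text recognition, and export to an editable format such as JSON. The following is a minimal sketch of how those stages might be wired together, not the authors' implementation. The names `Region`, `run_pipeline`, `stub_detector`, and `stub_recognizer` are hypothetical, the box convention `(x, y, width, height)` is an assumption, and the detector and recognizer are plain placeholder callables standing in for the YOLO and CNN models. The top-to-bottom reading order is likewise an assumption.

```python
import json
from dataclasses import asdict, dataclass
from typing import Any, Callable, List, Tuple

# Assumed box convention: (x, y, width, height) in pixels.
Box = Tuple[int, int, int, int]


@dataclass
class Region:
    label: str      # "title", "text", "table", or "image"
    box: Box
    text: str = ""  # recognized text; left empty for image regions


def run_pipeline(
    page: Any,
    detect_layout: Callable[[Any], List[Tuple[str, Box]]],
    recognize_text: Callable[[Any, Box], str],
) -> str:
    """Detect regions, recognize their text, and return a JSON string."""
    regions = []
    for label, box in detect_layout(page):
        # Image regions carry no text, so the recognizer is skipped for them.
        text = "" if label == "image" else recognize_text(page, box)
        regions.append(Region(label, box, text))
    # Assumed reading order: top to bottom, then left to right.
    regions.sort(key=lambda r: (r.box[1], r.box[0]))
    return json.dumps({"regions": [asdict(r) for r in regions]}, indent=2)


# Hypothetical stand-ins for the trained models, for illustration only.
def stub_detector(page: Any) -> List[Tuple[str, Box]]:
    return [
        ("text", (40, 120, 500, 300)),
        ("image", (40, 450, 200, 150)),
        ("title", (40, 30, 500, 60)),
    ]


def stub_recognizer(page: Any, box: Box) -> str:
    return f"text at y={box[1]}"


if __name__ == "__main__":
    print(run_pipeline(None, stub_detector, stub_recognizer))
```

Passing the detector and recognizer as callables keeps the sketch independent of any particular model library; real models would be wrapped in functions with the same signatures.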