Convolutional Swin Encoder

This paper focuses on developing a deep learning architecture capable of identifying writers' attributes from their handwriting. It introduces Convolutional Swin Encoder (CSE), a novel architecture combining Visual Geometry Group Network (VGGNet) and Swin Transformer blocks. CSE is designed to...

Bibliographic Details
Main Authors: Aditya Majithia, Arthur Paul Pedersen, Michael Grossberg
Format: Article
Language: English
Published: LibraryPress@UF 2025-05-01
Series: Proceedings of the International Florida Artificial Intelligence Research Society Conference
Subjects: Authorship Attribution; Handwriting Analysis; Swin Transformers; multiple task learning
Online Access: https://journals.flvc.org/FLAIRS/article/view/138949
author Aditya Majithia
Arthur Paul Pedersen
Michael Grossberg
collection DOAJ
description This paper focuses on developing a deep learning architecture capable of identifying writers' attributes from their handwriting. It introduces Convolutional Swin Encoder (CSE), a novel architecture combining Visual Geometry Group Network (VGGNet) and Swin Transformer blocks. CSE is designed to handle multi-label classification using images of individual handwritten words. As a unified encoder, it can predict writers' attributes such as identity, gender, age, and handedness. Using a word-level segmentation approach, CSE achieves competitive performance compared to page-level methods, which typically rely on separate classifiers instead of a unified one.
format Article
id doaj-art-42b68ab9b87841db8424ae76b45e627c
institution OA Journals
issn 2334-0754
2334-0762
language English
publishDate 2025-05-01
publisher LibraryPress@UF
record_format Article
series Proceedings of the International Florida Artificial Intelligence Research Society Conference
doi 10.32473/flairs.38.1.138949
volume 38
issue 1
orcid Arthur Paul Pedersen: https://orcid.org/0000-0002-2164-6404
affiliation Aditya Majithia: City College of New York
Arthur Paul Pedersen: The City University of New York (CUNY)
Michael Grossberg: The City University of New York (CUNY)
title Convolutional Swin Encoder
topic Authorship Attribution
Handwriting Analysis
Swin Transformers
multiple task learning
url https://journals.flvc.org/FLAIRS/article/view/138949