ILViT: An Inception-Linear Attention-Based Lightweight Vision Transformer for Microscopic Cell Classification

Microscopic cell classification is a fundamental challenge in both clinical diagnosis and biological research. However, existing methods still struggle with the complexity and morphological diversity of cellular images, leading to limited accuracy or high computational costs. To overcome these constraints, we propose an efficient classification method that balances strong feature representation with a lightweight design. Specifically, an Inception-Linear Attention-based Lightweight Vision Transformer (ILViT) model is developed for microscopic cell classification. The ILViT integrates two innovative modules: Dynamic Inception Convolution (DIC) and Contrastive Omni-Kolmogorov Attention (COKA). DIC combines dynamic and Inception-style convolutions to replace large kernels with fewer parameters. COKA integrates Omni-Dimensional Dynamic Convolution (ODC), linear attention, and a Kolmogorov-Arnold Network (KAN) structure to enhance feature learning and model interpretability. With only 1.91 GFLOPs and 8.98 million parameters, ILViT achieves high efficiency. Extensive experiments on four public datasets are conducted to validate the effectiveness of the proposed method. It achieves an accuracy of 97.185% on the BioMediTech dataset for classifying retinal pigment epithelial cells, 97.436% on the ICPR-HEp-2 dataset for diagnosing autoimmune disorders via HEp-2 cell classification, 90.528% on the Hematological Malignancy Bone Marrow Cytology Expert Annotation dataset for categorizing bone marrow cells, and 99.758% on a white blood cell dataset for distinguishing leukocyte subtypes. These results show that ILViT outperforms state-of-the-art models in both accuracy and efficiency, demonstrating strong generalizability and practical potential for cell image classification.
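As background for the linear attention mechanism named in the title: instead of the softmax attention softmax(QK^T)V, which is quadratic in the number of tokens, a positive feature map phi(.) is applied to queries and keys so attention can be computed as phi(Q)(phi(K)^T V), which scales linearly with token count. The sketch below illustrates that generic idea only; it is not the authors' ILViT implementation, and the module name, head count, and embedding size are illustrative assumptions.

```python
# Minimal sketch of generic linear attention (not the authors' ILViT code).
import torch
import torch.nn as nn
import torch.nn.functional as F

class LinearAttention(nn.Module):
    def __init__(self, dim: int, num_heads: int = 4):
        super().__init__()
        assert dim % num_heads == 0
        self.num_heads = num_heads
        self.head_dim = dim // num_heads
        self.qkv = nn.Linear(dim, dim * 3, bias=False)
        self.proj = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, tokens, dim)
        B, N, C = x.shape
        qkv = self.qkv(x).reshape(B, N, 3, self.num_heads, self.head_dim)
        q, k, v = qkv.permute(2, 0, 3, 1, 4)          # each: (B, heads, N, head_dim)
        q = F.elu(q) + 1.0                            # positive kernel feature map phi
        k = F.elu(k) + 1.0
        kv = torch.einsum("bhnd,bhne->bhde", k, v)    # sum over tokens: phi(K)^T V
        z = 1.0 / (torch.einsum("bhnd,bhd->bhn", q, k.sum(dim=2)) + 1e-6)  # normalizer
        out = torch.einsum("bhnd,bhde,bhn->bhne", q, kv, z)
        out = out.transpose(1, 2).reshape(B, N, C)    # merge heads back to (B, tokens, dim)
        return self.proj(out)

# Example: 196 tokens (14x14 patches) with 192-dim embeddings (assumed sizes).
tokens = torch.randn(2, 196, 192)
print(LinearAttention(dim=192, num_heads=4)(tokens).shape)  # torch.Size([2, 196, 192])
```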

Bibliographic Details
Main Authors: Zhangda Liu, Panpan Wu, Ziping Zhao (College of Computer and Information Engineering, Tianjin Normal University, Tianjin 300387, China); Hengyong Yu (Department of Electrical and Computer Engineering, University of Massachusetts Lowell, Lowell, MA 01854, USA)
Format: Article
Language: English
Published: MDPI AG 2025-07-01
Series: Journal of Imaging
ISSN: 2313-433X
DOI: 10.3390/jimaging11070219
Subjects: cell classification; linear attention; inception architecture
Online Access:https://www.mdpi.com/2313-433X/11/7/219