ILViT: An Inception-Linear Attention-Based Lightweight Vision Transformer for Microscopic Cell Classification
Microscopic cell classification is a fundamental challenge in both clinical diagnosis and biological research. However, existing methods still struggle with the complexity and morphological diversity of cellular images, leading to limited accuracy or high computational costs. To overcome these constraints, we propose an efficient classification method that balances strong feature representation with a lightweight design. Specifically, an Inception-Linear Attention-based Lightweight Vision Transformer (ILViT) model is developed for microscopic cell classification. The ILViT integrates two innovative modules: Dynamic Inception Convolution (DIC) and Contrastive Omni-Kolmogorov Attention (COKA). DIC combines dynamic and Inception-style convolutions to replace large kernels with fewer parameters. COKA integrates Omni-Dimensional Dynamic Convolution (ODC), linear attention, and a Kolmogorov-Arnold Network (KAN) structure to enhance feature learning and model interpretability. With only 1.91 GFLOPs and 8.98 million parameters, ILViT achieves high efficiency. Extensive experiments on four public datasets are conducted to validate the effectiveness of the proposed method. It achieves an accuracy of 97.185% on the BioMediTech dataset for classifying retinal pigment epithelial cells, 97.436% on the ICPR-HEp-2 dataset for diagnosing autoimmune disorders via HEp-2 cell classification, 90.528% on the Hematological Malignancy Bone Marrow Cytology Expert Annotation dataset for categorizing bone marrow cells, and 99.758% on a white blood cell dataset for distinguishing leukocyte subtypes. These results show that ILViT outperforms state-of-the-art models in both accuracy and efficiency, demonstrating strong generalizability and practical potential for cell image classification.
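The abstract names two building blocks: an Inception-style convolution that avoids large kernels, and a linear attention mechanism inside a lightweight ViT. The sketch below illustrates the generic ideas behind those names only (a channel-split depthwise Inception-style convolution and a kernelized attention whose cost is linear in the number of tokens). The module names, kernel sizes, and head counts are assumptions made for illustration; this is not the authors' ILViT implementation.

```python
# Illustrative sketch only: generic Inception-style depthwise convolution and
# linear attention, NOT the ILViT code. All hyperparameters are assumptions.
import torch
import torch.nn as nn


class InceptionStyleDWConv(nn.Module):
    """Split channels into parallel depthwise branches (identity, square, 1xk, kx1)."""

    def __init__(self, dim: int, square_k: int = 3, band_k: int = 11):
        super().__init__()
        g = dim // 4  # channels per branch (assumes dim divisible by 4)
        self.g = g
        self.dw_square = nn.Conv2d(g, g, square_k, padding=square_k // 2, groups=g)
        self.dw_band_w = nn.Conv2d(g, g, (1, band_k), padding=(0, band_k // 2), groups=g)
        self.dw_band_h = nn.Conv2d(g, g, (band_k, 1), padding=(band_k // 2, 0), groups=g)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Each branch sees only a quarter of the channels, so two band kernels
        # approximate a large receptive field with far fewer parameters.
        x_id, x_sq, x_w, x_h = torch.split(x, self.g, dim=1)
        return torch.cat(
            [x_id, self.dw_square(x_sq), self.dw_band_w(x_w), self.dw_band_h(x_h)], dim=1
        )


class LinearAttention(nn.Module):
    """Kernelized attention: normalize queries and keys separately, O(N) in tokens."""

    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.heads = heads
        self.qkv = nn.Linear(dim, dim * 3, bias=False)
        self.proj = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (B, N, C)
        B, N, C = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        q = q.view(B, N, self.heads, C // self.heads).transpose(1, 2)
        k = k.view(B, N, self.heads, C // self.heads).transpose(1, 2)
        v = v.view(B, N, self.heads, C // self.heads).transpose(1, 2)
        q = q.softmax(dim=-1)  # feature map over channels for queries
        k = k.softmax(dim=-2)  # normalize keys over the token dimension
        context = k.transpose(-2, -1) @ v  # (B, h, d, d): aggregate values first
        out = q @ context                  # (B, h, N, d): no N x N attention matrix
        out = out.transpose(1, 2).reshape(B, N, C)
        return self.proj(out)


if __name__ == "__main__":
    feat = torch.randn(2, 64, 32, 32)
    print(InceptionStyleDWConv(64)(feat).shape)  # torch.Size([2, 64, 32, 32])
    tokens = torch.randn(2, 196, 64)
    print(LinearAttention(64)(tokens).shape)     # torch.Size([2, 196, 64])
```

The key design point of linear attention, and the reason it suits a lightweight model, is that values are aggregated into a small d x d context before queries attend to it, so compute grows with the number of tokens N rather than N squared.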
| Main Authors: | Zhangda Liu, Panpan Wu, Ziping Zhao, Hengyong Yu |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-07-01 |
| Series: | Journal of Imaging |
| Subjects: | cell classification; linear attention; inception architecture |
| Online Access: | https://www.mdpi.com/2313-433X/11/7/219 |
| _version_ | 1849406519327588352 |
|---|---|
| author | Zhangda Liu, Panpan Wu, Ziping Zhao, Hengyong Yu |
| author_facet | Zhangda Liu, Panpan Wu, Ziping Zhao, Hengyong Yu |
| author_sort | Zhangda Liu |
| collection | DOAJ |
| description | Microscopic cell classification is a fundamental challenge in both clinical diagnosis and biological research. However, existing methods still struggle with the complexity and morphological diversity of cellular images, leading to limited accuracy or high computational costs. To overcome these constraints, we propose an efficient classification method that balances strong feature representation with a lightweight design. Specifically, an Inception-Linear Attention-based Lightweight Vision Transformer (ILViT) model is developed for microscopic cell classification. The ILViT integrates two innovative modules: Dynamic Inception Convolution (DIC) and Contrastive Omni-Kolmogorov Attention (COKA). DIC combines dynamic and Inception-style convolutions to replace large kernels with fewer parameters. COKA integrates Omni-Dimensional Dynamic Convolution (ODC), linear attention, and a Kolmogorov-Arnold Network (KAN) structure to enhance feature learning and model interpretability. With only 1.91 GFLOPs and 8.98 million parameters, ILViT achieves high efficiency. Extensive experiments on four public datasets are conducted to validate the effectiveness of the proposed method. It achieves an accuracy of 97.185% on the BioMediTech dataset for classifying retinal pigment epithelial cells, 97.436% on the ICPR-HEp-2 dataset for diagnosing autoimmune disorders via HEp-2 cell classification, 90.528% on the Hematological Malignancy Bone Marrow Cytology Expert Annotation dataset for categorizing bone marrow cells, and 99.758% on a white blood cell dataset for distinguishing leukocyte subtypes. These results show that ILViT outperforms state-of-the-art models in both accuracy and efficiency, demonstrating strong generalizability and practical potential for cell image classification. |
| format | Article |
| id | doaj-art-b8be16c146c4449e92bffc08443e7676 |
| institution | Kabale University |
| issn | 2313-433X |
| language | English |
| publishDate | 2025-07-01 |
| publisher | MDPI AG |
| record_format | Article |
| series | Journal of Imaging |
| doi | 10.3390/jimaging11070219 |
| author affiliations | Zhangda Liu, Panpan Wu, Ziping Zhao: College of Computer and Information Engineering, Tianjin Normal University, Tianjin 300387, China; Hengyong Yu: Department of Electrical and Computer Engineering, University of Massachusetts Lowell, Lowell, MA 01854, USA |
| title | ILViT: An Inception-Linear Attention-Based Lightweight Vision Transformer for Microscopic Cell Classification |
| topic | cell classification linear attention inception architecture |
| url | https://www.mdpi.com/2313-433X/11/7/219 |