A Distillation Approach to Transformer-Based Medical Image Classification with Limited Data

Bibliographic Details
Main Authors: Aynur Sevinc, Murat Ucan, Buket Kaya
Format: Article
Language: English
Published: MDPI AG, 2025-04-01
Series: Diagnostics
Subjects:
Online Access: https://www.mdpi.com/2075-4418/15/7/929
_version_ 1849738685892788224
author Aynur Sevinc
Murat Ucan
Buket Kaya
author_facet Aynur Sevinc
Murat Ucan
Buket Kaya
author_sort Aynur Sevinc
collection DOAJ
description <b>Background/Objectives</b>: Although transformer-based deep learning architectures are preferred in many hybrid models due to their flexibility, they generally perform poorly on image classification tasks with small datasets. Distillation techniques offer an important performance improvement when transformer architectures are trained on limited data, yet their impact on classification accuracy in transformer-based models has not been extensively investigated. <b>Methods</b>: This study investigates the impact of distillation techniques on the classification performance of transformer-based deep learning architectures trained on limited data. We use two transformer-based models without distillation, ViTx32 and ViTx16, and two with distillation, DeiT and BeiT. A four-class dataset of brain MRI images is used for training and testing. <b>Results</b>: Our experiments show that the distilled DeiT and BeiT architectures achieve performance gains of 2.2% and 1%, respectively, over ViTx16. A more detailed analysis shows that distillation improves the detection of non-patient individuals by about 4%. Our study also includes a detailed analysis of the training time of each architecture. <b>Conclusions</b>: The experimental results show that using distillation techniques in transformer-based deep learning models can significantly improve classification accuracy when working with limited data. Based on these findings, we recommend transformer-based models with distillation, especially in medical applications and other domains where flexible models must be developed from limited data.
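The distillation the abstract refers to (as popularized by Hinton et al. and applied token-wise in DeiT) blends the usual hard-label cross-entropy with a divergence between temperature-softened teacher and student distributions. The sketch below is illustrative only, not the authors' implementation; the temperature `T` and mixing weight `alpha` are assumed hyperparameters, and real DeiT/BeiT training adds a dedicated distillation token and other details.

```python
import numpy as np

def softmax(z, T=1.0):
    # Numerically stable softmax with optional temperature T.
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Soft-target knowledge-distillation loss (illustrative sketch).

    Combines cross-entropy on the ground-truth labels with the KL
    divergence between temperature-softened teacher and student
    distributions, scaled by T^2 to keep gradient magnitudes comparable.
    """
    # Hard loss: cross-entropy of the student against the true labels.
    p = softmax(student_logits)
    hard = -np.log(p[np.arange(len(labels)), labels]).mean()
    # Soft loss: KL(teacher_T || student_T), scaled by T^2.
    pt = softmax(teacher_logits, T)
    ps = softmax(student_logits, T)
    soft = (pt * (np.log(pt) - np.log(ps))).sum(axis=1).mean() * T * T
    return alpha * hard + (1.0 - alpha) * soft
```

With `alpha = 1.0` this reduces to plain supervised training (the ViT baselines in the study); lowering `alpha` shifts weight toward matching the teacher, which is where the reported gains on small datasets come from.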
format Article
id doaj-art-19e45aa0c292495fbca183b9bf7b6607
institution DOAJ
issn 2075-4418
language English
publishDate 2025-04-01
publisher MDPI AG
record_format Article
series Diagnostics
spelling doaj-art-19e45aa0c292495fbca183b9bf7b6607
2025-08-20T03:06:29Z
eng
MDPI AG
Diagnostics, ISSN 2075-4418, 2025-04-01, Vol. 15, Iss. 7, p. 929
DOI: 10.3390/diagnostics15070929
A Distillation Approach to Transformer-Based Medical Image Classification with Limited Data
Aynur Sevinc (Department of Computer Technologies, Silvan Vocational School, Dicle University, Diyarbakir 21640, Turkey)
Murat Ucan (Department of Computer Technologies, Vocational School of Technical Sciences, Dicle University, Diyarbakir 21200, Turkey)
Buket Kaya (Department of Electronics and Automation, Firat University, Elazig 23119, Turkey)
<b>Background/Objectives</b>: Although transformer-based deep learning architectures are preferred in many hybrid models due to their flexibility, they generally perform poorly on image classification tasks with small datasets. Distillation techniques offer an important performance improvement when transformer architectures are trained on limited data, yet their impact on classification accuracy in transformer-based models has not been extensively investigated. <b>Methods</b>: This study investigates the impact of distillation techniques on the classification performance of transformer-based deep learning architectures trained on limited data. We use two transformer-based models without distillation, ViTx32 and ViTx16, and two with distillation, DeiT and BeiT. A four-class dataset of brain MRI images is used for training and testing. <b>Results</b>: Our experiments show that the distilled DeiT and BeiT architectures achieve performance gains of 2.2% and 1%, respectively, over ViTx16. A more detailed analysis shows that distillation improves the detection of non-patient individuals by about 4%. Our study also includes a detailed analysis of the training time of each architecture. <b>Conclusions</b>: The experimental results show that using distillation techniques in transformer-based deep learning models can significantly improve classification accuracy when working with limited data. Based on these findings, we recommend transformer-based models with distillation, especially in medical applications and other domains where flexible models must be developed from limited data.
https://www.mdpi.com/2075-4418/15/7/929
Keywords: BeiT, classification, DeiT, distillation, transformers
spellingShingle Aynur Sevinc
Murat Ucan
Buket Kaya
A Distillation Approach to Transformer-Based Medical Image Classification with Limited Data
Diagnostics
BeiT
classification
DeiT
distillation
transformers
title A Distillation Approach to Transformer-Based Medical Image Classification with Limited Data
title_full A Distillation Approach to Transformer-Based Medical Image Classification with Limited Data
title_fullStr A Distillation Approach to Transformer-Based Medical Image Classification with Limited Data
title_full_unstemmed A Distillation Approach to Transformer-Based Medical Image Classification with Limited Data
title_short A Distillation Approach to Transformer-Based Medical Image Classification with Limited Data
title_sort distillation approach to transformer based medical image classification with limited data
topic BeiT
classification
DeiT
distillation
transformers
url https://www.mdpi.com/2075-4418/15/7/929
work_keys_str_mv AT aynursevinc adistillationapproachtotransformerbasedmedicalimageclassificationwithlimiteddata
AT muratucan adistillationapproachtotransformerbasedmedicalimageclassificationwithlimiteddata
AT buketkaya adistillationapproachtotransformerbasedmedicalimageclassificationwithlimiteddata
AT aynursevinc distillationapproachtotransformerbasedmedicalimageclassificationwithlimiteddata
AT muratucan distillationapproachtotransformerbasedmedicalimageclassificationwithlimiteddata
AT buketkaya distillationapproachtotransformerbasedmedicalimageclassificationwithlimiteddata