A Distillation Approach to Transformer-Based Medical Image Classification with Limited Data
**Background/Objectives**: Although transformer-based deep learning architectures are preferred in many hybrid architectures due to their flexibility, they generally perform poorly on image classification tasks with small datasets. An important way to improve performance when transformer architectures work with limited data is the use of distillation techniques. The impact of distillation techniques on classification accuracy in transformer-based models has not yet been extensively investigated. **Methods**: This study investigates the impact of distillation techniques on the classification performance of transformer-based deep learning architectures trained on limited data. We use the transformer-based models ViTx32 and ViTx16 without distillation, and DeiT and BeiT with distillation. A four-class dataset of brain MRI images is used for training and testing. **Results**: Our experiments show that the DeiT and BeiT architectures with distillation achieve performance gains of 2.2% and 1%, respectively, compared to ViTx16. A more detailed analysis shows that the distillation techniques improve the detection of non-patient individuals by about 4%. Our study also includes a detailed analysis of the training times for each architecture. **Conclusions**: The results of the experiments show that using distillation techniques in transformer-based deep learning models can significantly improve classification accuracy when working with limited data. Based on these findings, we recommend the use of transformer-based models with distillation, especially in medical applications and other areas where flexible models are developed with limited data.
| Main Authors: | Aynur Sevinc, Murat Ucan, Buket Kaya |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-04-01 |
| Series: | Diagnostics |
| Subjects: | BeiT; classification; DeiT; distillation; transformers |
| ISSN: | 2075-4418 |
| DOI: | 10.3390/diagnostics15070929 |
| Author Affiliations: | Aynur Sevinc: Department of Computer Technologies, Silvan Vocational School, Dicle University, Diyarbakir 21640, Turkey. Murat Ucan: Department of Computer Technologies, Vocational School of Technical Sciences, Dicle University, Diyarbakir 21200, Turkey. Buket Kaya: Department of Electronics and Automation, Firat University, Elazig 23119, Turkey |
| Online Access: | https://www.mdpi.com/2075-4418/15/7/929 |
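The abstract compares two undistilled backbones (ViTx32, ViTx16) against two distilled ones (DeiT, BeiT). As a minimal sketch of what setting up that comparison can look like, the code below instantiates the four backbone families with a four-class head using the Hugging Face `transformers` library. This is not the authors' code: the library choice, the checkpoint names, and the class labels (a common four-class brain-MRI split of glioma, meningioma, pituitary, and no tumor) are assumptions for illustration only.

```python
# Minimal sketch (assumptions, not the authors' code): instantiating the four
# backbones compared in the paper with a 4-class head, via Hugging Face
# `transformers`. Checkpoint names are illustrative choices.
from transformers import (
    BeitForImageClassification,
    DeiTForImageClassificationWithTeacher,
    ViTForImageClassification,
)

NUM_CLASSES = 4  # assumed labels: glioma, meningioma, pituitary, no tumor

# ViT without distillation, patch sizes 32 and 16 ("ViTx32"/"ViTx16" in the paper).
# The ImageNet-21k checkpoints ship without a fine-tuned head, so a fresh
# 4-class head is initialized automatically.
vit32 = ViTForImageClassification.from_pretrained(
    "google/vit-base-patch32-224-in21k",
    num_labels=NUM_CLASSES,
)
vit16 = ViTForImageClassification.from_pretrained(
    "google/vit-base-patch16-224-in21k",
    num_labels=NUM_CLASSES,
)

# DeiT: carries an extra distillation token trained against a CNN teacher;
# at inference the model averages its classification and distillation heads.
deit = DeiTForImageClassificationWithTeacher.from_pretrained(
    "facebook/deit-base-distilled-patch16-224",
    num_labels=NUM_CLASSES,
    ignore_mismatched_sizes=True,  # replace the 1000-class ImageNet head
)

# BeiT: pre-trained by masked image modeling over discrete visual tokens.
beit = BeitForImageClassification.from_pretrained(
    "microsoft/beit-base-patch16-224",
    num_labels=NUM_CLASSES,
    ignore_mismatched_sizes=True,  # replace the 1000-class ImageNet head
)
```

At fine-tuning time the distilled models can be trained with the same cross-entropy objective as the plain ViTs; the distillation benefit the paper measures enters through how the backbones were pre-trained (DeiT against a convolutional teacher, BeiT via masked-image-modeling on visual tokens).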