Boosting Skin Cancer Classification: A Multi-Scale Attention and Ensemble Approach with Vision Transformers

Bibliographic Details
Main Authors: Guang Yang, Suhuai Luo, Peter Greer
Format: Article
Language: English
Published: MDPI AG 2025-04-01
Series: Sensors
Online Access: https://www.mdpi.com/1424-8220/25/8/2479
Description
Summary: Skin cancer is a significant global health concern, with melanoma being the most dangerous form, responsible for the majority of skin cancer-related deaths. Early detection of skin cancer is critical, as it can drastically improve survival rates. While deep learning models have achieved impressive results in skin cancer classification, there remain challenges in accurately distinguishing between benign and malignant lesions. In this study, we introduce a novel multi-scale attention-based performance booster inspired by the Vision Transformer (ViT) architecture, which enhances the accuracy of both ViT and convolutional neural network (CNN) models. By leveraging attention maps to identify discriminative regions within skin lesion images, our method improves the models’ focus on diagnostically relevant areas. Additionally, we employ ensemble learning techniques to combine the outputs of several deep learning models using majority voting. Our skin cancer classifier, consisting of ViT and EfficientNet models, achieved a classification accuracy of 95.05% on the ISIC2018 dataset, outperforming individual models. The results demonstrate the effectiveness of integrating attention-based multi-scale learning and ensemble methods in skin cancer classification.
ISSN: 1424-8220
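
Illustration: The summary states that the outputs of several deep learning models are combined by majority voting. The following is a minimal sketch of that general technique, not the authors' implementation. The function name `majority_vote`, the tie-breaking rule (the label from the earliest-listed model wins among tied labels), the three-model setup, and the example labels are all assumptions made for illustration; the record does not specify them.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-model class labels by majority (plurality) vote.

    predictions: list of per-model lists, one predicted label per sample.
    Tie-breaking is an assumption: among labels with the same top count,
    the one predicted by the earliest-listed model is chosen.
    """
    if not predictions:
        raise ValueError("at least one model's predictions are required")
    n_samples = len(predictions[0])
    if any(len(p) != n_samples for p in predictions):
        raise ValueError("all models must predict the same number of samples")

    combined = []
    # zip(*predictions) yields, for each sample, the tuple of labels
    # predicted by every model in the order the models were listed.
    for sample_votes in zip(*predictions):
        counts = Counter(sample_votes)
        top = max(counts.values())
        winner = next(label for label in sample_votes if counts[label] == top)
        combined.append(winner)
    return combined


# Hypothetical predictions from three models on four images.
vit_preds = ["melanoma", "nevus", "nevus", "melanoma"]
effnet_a_preds = ["melanoma", "nevus", "melanoma", "nevus"]
effnet_b_preds = ["nevus", "nevus", "melanoma", "melanoma"]

ensemble_preds = majority_vote([vit_preds, effnet_a_preds, effnet_b_preds])
print(ensemble_preds)
```

With an odd number of models and two candidate labels per sample, as in this example, ties cannot occur. With more classes or an even number of models, the tie-breaking rule matters and should be chosen deliberately, for instance by listing the strongest individual model first.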