Automated Skin Cancer Report Generation via a Knowledge-Distilled Vision-Language Model

Bibliographic Details
Main Authors: Lawhori Chakrabarti, Boyu Zhang, Hengyi Tian, Aleksandar Vakanski, Min Xian
Format: Article
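Language: English
Published: IEEE 2025-01-01
Series: IEEE Access
Subjects: Melanoma; dermoscopy image; explainable AI; medical report generation; skin cancer; vision-language models
Online Access: https://ieeexplore.ieee.org/document/11091320/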
author Lawhori Chakrabarti
Boyu Zhang
Hengyi Tian
Aleksandar Vakanski
Min Xian
collection DOAJ
description The capacity of artificial intelligence (AI) to analyze dermoscopic images promises a major advance in skin cancer diagnostics, offering high accuracy and a non-invasive image acquisition process. However, this potential, which has attracted widespread research interest, is critically undermined by a lack of transparency and interpretability. The automated generation of articulate and comprehensive diagnostic reports can bridge this gap by not only illuminating the AI’s diagnostic rationale but also substantially reducing the demanding workload of medical professionals. This study presents a multimodal vision-language model (VLM) trained using a two-stage knowledge distillation (KD) framework to generate structured medical reports from dermoscopic images, with descriptive features based on the 7-point melanoma checklist. The reports are organized into three clinically relevant sections (Findings, Impression, and Differential Diagnosis) aligned with dermatological standards. Experimental evaluation demonstrates the system’s ability to produce accurate and interpretable reports. Human feedback from a medical professional, assessing clinical relevance, completeness, and interpretability, supports the utility of the generated reports, while computational metrics validate their accuracy and alignment with reference pseudo-reports: a SacreBLEU score of 55.59, a ROUGE-1 score of 0.5438, a ROUGE-L score of 0.3828, and a BERTScore F1 of 0.9025. These findings underscore the model’s ability to generalize effectively to unseen data, enabled by its multimodal design, clinical alignment, and explainability.
format Article
id doaj-art-7f48d933c2734f968efca1eca4994bf7
institution Kabale University
issn 2169-3536
language English
publishDate 2025-01-01
publisher IEEE
series IEEE Access
spelling doaj-art-7f48d933c2734f968efca1eca4994bf7
Record timestamp: 2025-08-20T03:41:01Z
Language code: eng
Volume: 13
Pages: 136320-136335
DOI: 10.1109/ACCESS.2025.3591750
IEEE Xplore document: 11091320
Lawhori Chakrabarti, Department of Computer Science, University of Idaho, Moscow, ID, USA
Boyu Zhang (https://orcid.org/0000-0002-9401-6163), Department of Computer Science, University of Idaho, Moscow, ID, USA
Hengyi Tian, Department of Computer Science, University of Idaho, Moscow, ID, USA
Aleksandar Vakanski (https://orcid.org/0000-0003-3365-1291), Department of Nuclear Engineering and Industrial Management, University of Idaho, Idaho Falls, ID, USA
Min Xian (https://orcid.org/0000-0001-6098-4441), Department of Computer Science, University of Idaho, Idaho Falls, ID, USA
title Automated Skin Cancer Report Generation via a Knowledge-Distilled Vision-Language Model
topic Melanoma
dermoscopy image
explainable AI
medical report generation
skin cancer
vision-language models
url https://ieeexplore.ieee.org/document/11091320/
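
The description above reports ROUGE-1 and ROUGE-L scores measured against reference pseudo-reports. As a rough illustration of what those two metrics measure, the following is a minimal sketch, not the authors' evaluation code: the record does not say which tooling was used, and the whitespace tokenizer, the absence of stemming, and all function names here are assumptions made for the example.

```python
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (assumed; real toolkits tokenize differently)."""
    return text.lower().split()


def f1(overlap, cand_len, ref_len):
    """Harmonic mean of precision (overlap/cand_len) and recall (overlap/ref_len)."""
    if overlap == 0 or cand_len == 0 or ref_len == 0:
        return 0.0
    precision = overlap / cand_len
    recall = overlap / ref_len
    return 2 * precision * recall / (precision + recall)


def rouge_1(candidate, reference):
    """ROUGE-1 F1: clipped unigram overlap, ignoring word order."""
    cand, ref = tokenize(candidate), tokenize(reference)
    overlap = sum((Counter(cand) & Counter(ref)).values())
    return f1(overlap, len(cand), len(ref))


def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate, reference):
    """ROUGE-L F1: based on the longest common subsequence, so word order matters."""
    cand, ref = tokenize(candidate), tokenize(reference)
    return f1(lcs_length(cand, ref), len(cand), len(ref))


# Hypothetical report fragments, invented for this example only.
generated = "irregular streaks and blue-whitish veil present"
reference = "blue-whitish veil and irregular streaks present"
print(rouge_1(generated, reference))  # 1.0: same words
print(rouge_l(generated, reference))  # 0.5: different order
```

Because ROUGE-L rewards only in-order matches, it can never exceed ROUGE-1 under this formulation, which is consistent with the ordering of the two scores reported in the description (0.3828 versus 0.5438).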