Automated Skin Cancer Report Generation via a Knowledge-Distilled Vision-Language Model
Artificial Intelligence (AI)’s capacity to analyze dermoscopic images promises a groundbreaking leap in skin cancer diagnostics, offering exceptional accuracy and an effortlessly non-invasive image acquisition process. However, this immense potential, which has ignited widespread research...
| Main Authors: | Lawhori Chakrabarti, Boyu Zhang, Hengyi Tian, Aleksandar Vakanski, Min Xian |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | IEEE, 2025-01-01 |
| Series: | IEEE Access |
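| Subjects: | Melanoma; dermoscopy image; explainable AI; medical report generation; skin cancer; vision-language models |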
| Online Access: | https://ieeexplore.ieee.org/document/11091320/ |
| _version_ | 1849391586035630080 |
|---|---|
| author | Lawhori Chakrabarti; Boyu Zhang; Hengyi Tian; Aleksandar Vakanski; Min Xian |
| author_facet | Lawhori Chakrabarti; Boyu Zhang; Hengyi Tian; Aleksandar Vakanski; Min Xian |
| author_sort | Lawhori Chakrabarti |
| collection | DOAJ |
| description | Artificial Intelligence (AI)’s capacity to analyze dermoscopic images promises a groundbreaking leap in skin cancer diagnostics, offering exceptional accuracy and an effortlessly non-invasive image acquisition process. However, this immense potential, which has ignited widespread research enthusiasm, is critically undermined by the lack of transparency and interpretability. The automated generation of articulate and comprehensive diagnostic reports will bridge this critical gap by not only illuminating the AI’s diagnostic rationale but also substantially reducing the demanding workload of medical professionals. This study presents a multimodal vision-language model (VLM) trained using a two-stage knowledge distillation (KD) framework to generate structured medical reports from dermoscopic images, with descriptive features based on the 7-point melanoma checklist. The reports are organized into clinically relevant sections (Findings, Impression, and Differential Diagnosis) aligned with dermatological standards. Experimental evaluation demonstrates the system’s ability to produce accurate and interpretable reports. Human feedback from a medical professional, assessing clinical relevance, completeness, and interpretability, supports the utility of the generated reports, while computational metrics validate their accuracy and alignment with reference pseudo-reports, achieving a SacreBLEU score of 55.59, a ROUGE-1 score of 0.5438, a ROUGE-L score of 0.3828, and a BERTScore F1 of 0.9025. These findings underscore the model’s ability to generalize effectively to unseen data, enabled by its multimodal design, clinical alignment, and explainability. |
| format | Article |
| id | doaj-art-7f48d933c2734f968efca1eca4994bf7 |
| institution | Kabale University |
| issn | 2169-3536 |
| language | English |
| publishDate | 2025-01-01 |
| publisher | IEEE |
| record_format | Article |
| series | IEEE Access |
| spelling | doaj-art-7f48d933c2734f968efca1eca4994bf7; 2025-08-20T03:41:01Z; eng; IEEE; IEEE Access; ISSN 2169-3536; 2025-01-01; vol. 13, pp. 136320-136335; DOI 10.1109/ACCESS.2025.3591750; document 11091320; Automated Skin Cancer Report Generation via a Knowledge-Distilled Vision-Language Model; Lawhori Chakrabarti (Department of Computer Science, University of Idaho, Moscow, ID, USA); Boyu Zhang (https://orcid.org/0000-0002-9401-6163; Department of Computer Science, University of Idaho, Moscow, ID, USA); Hengyi Tian (Department of Computer Science, University of Idaho, Moscow, ID, USA); Aleksandar Vakanski (https://orcid.org/0000-0003-3365-1291; Department of Nuclear Engineering and Industrial Management, University of Idaho, Idaho Falls, ID, USA); Min Xian (https://orcid.org/0000-0001-6098-4441; Department of Computer Science, University of Idaho, Idaho Falls, ID, USA); https://ieeexplore.ieee.org/document/11091320/; Melanoma; dermoscopy image; explainable AI; medical report generation; skin cancer; vision-language models |
| spellingShingle | Lawhori Chakrabarti; Boyu Zhang; Hengyi Tian; Aleksandar Vakanski; Min Xian; Automated Skin Cancer Report Generation via a Knowledge-Distilled Vision-Language Model; IEEE Access; Melanoma; dermoscopy image; explainable AI; medical report generation; skin cancer; vision-language models |
| title | Automated Skin Cancer Report Generation via a Knowledge-Distilled Vision-Language Model |
| title_full | Automated Skin Cancer Report Generation via a Knowledge-Distilled Vision-Language Model |
| title_fullStr | Automated Skin Cancer Report Generation via a Knowledge-Distilled Vision-Language Model |
| title_full_unstemmed | Automated Skin Cancer Report Generation via a Knowledge-Distilled Vision-Language Model |
| title_short | Automated Skin Cancer Report Generation via a Knowledge-Distilled Vision-Language Model |
| title_sort | automated skin cancer report generation via a knowledge distilled vision language model |
| topic | Melanoma; dermoscopy image; explainable AI; medical report generation; skin cancer; vision-language models |
| url | https://ieeexplore.ieee.org/document/11091320/ |
| work_keys_str_mv | AT lawhorichakrabarti automatedskincancerreportgenerationviaaknowledgedistilledvisionlanguagemodel AT boyuzhang automatedskincancerreportgenerationviaaknowledgedistilledvisionlanguagemodel AT hengyitian automatedskincancerreportgenerationviaaknowledgedistilledvisionlanguagemodel AT aleksandarvakanski automatedskincancerreportgenerationviaaknowledgedistilledvisionlanguagemodel AT minxian automatedskincancerreportgenerationviaaknowledgedistilledvisionlanguagemodel |
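
The description above reports ROUGE-1 and ROUGE-L scores for the generated reports against reference pseudo-reports. As a rough illustration of what those two metrics measure, the following is a minimal sketch, assuming lowercase whitespace tokenization with no stemming. It is not the authors' evaluation code, which is not shown in this record; the function names and the example sentences are hypothetical.

```python
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization, no stemming.
    return text.lower().split()


def f1(precision, recall):
    # Harmonic mean of precision and recall; 0.0 when both are zero.
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def rouge_1_f1(candidate, reference):
    # Unigram overlap with clipped counts (multiset intersection).
    cand = Counter(tokenize(candidate))
    ref = Counter(tokenize(reference))
    if not cand or not ref:
        return 0.0
    overlap = sum((cand & ref).values())
    return f1(overlap / sum(cand.values()), overlap / sum(ref.values()))


def lcs_length(a, b):
    # Longest common subsequence length via row-by-row dynamic programming.
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(candidate, reference):
    # F-measure based on the longest common subsequence of tokens.
    cand, ref = tokenize(candidate), tokenize(reference)
    if not cand or not ref:
        return 0.0
    lcs = lcs_length(cand, ref)
    return f1(lcs / len(cand), lcs / len(ref))


if __name__ == "__main__":
    # Hypothetical sentences: same words, different order.
    generated = "irregular streaks and atypical pigment network"
    reference = "atypical pigment network and irregular streaks"
    print(rouge_1_f1(generated, reference))
    print(rouge_l_f1(generated, reference))
```

Because ROUGE-L rewards only in-order matches, a report that uses the same vocabulary as the reference but orders it differently scores lower on ROUGE-L than on ROUGE-1, which is consistent with the gap between the two reported values.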