External Validation of AI Models for Skin Diseases: A Systematic Review
| Main Authors: | , , , , , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | IEEE, 2025-01-01 |
| Series: | IEEE Access |
| Subjects: | |
| Online Access: | https://ieeexplore.ieee.org/document/11062591/ |
| Summary: | The increasing integration of Artificial Intelligence (AI) into dermatology holds significant potential to enhance the diagnosis and prognosis of skin diseases, which represent a considerable public health burden. While numerous AI models for skin conditions have been developed, their successful translation into real-world clinical practice critically depends on rigorous external validation to ensure generalizability and reliability beyond their development environment. Despite the growing recognition of external validation’s importance in medical AI, a comprehensive synthesis specifically addressing this process for AI models in dermatology has been lacking. This systematic literature review aimed to identify and synthesize studies focused exclusively on the external validation of AI models for skin diseases, in order to assess their clinical applicability and methodological robustness and to identify key challenges. Following the PRISMA framework, a systematic search was conducted across five major digital libraries (ACM Digital Library, IEEE Xplore, Web of Science, PubMed, and Scopus). Study quality was assessed using a customized checklist, and data extraction focused on validation practices, model characteristics, targeted diseases, performance assessment, and identified limitations. The systematic study selection led to a detailed analysis of 41 studies. We observed a clear predominance of retrospective external validation studies focused on diagnostic models, while prognostic models and prospective validation designs were notably scarce. Although some models demonstrated promising performance in external validation, several critical limitations persist. These include the narrow representation of skin disease categories, lack of subgroup fairness analyses, small and imbalanced sample sizes, and insufficient standardization of data collection and evaluation metrics. These findings underscore the need for more rigorous and transparent validation frameworks to support the reliable integration of AI into dermatological practice. |
|---|---|
| ISSN: | 2169-3536 |