Machine learning approaches for EGFR mutation status prediction in NSCLC: an updated systematic review

Background: With the rapid advances in artificial intelligence, particularly convolutional neural networks, researchers now exploit CT, PET/CT and other imaging modalities to predict epidermal growth factor receptor (EGFR) mutation status in non-small-cell lung cancer (NSCLC) non-invasively, rapidly and...

Bibliographic Details
Main Authors:	Liu Haixian, Pang Shu, Li Zhao, Lu Chunfeng, Li Lun
Format:	Article
Language:	English
Published:	Frontiers Media S.A. 2025-07-01
Series:	Frontiers in Oncology
Subjects:	artificial intelligence; Non-small cell lung cancer (NSCLC); EGFR mutation; deep learning; medical imaging
Online Access:	https://www.frontiersin.org/articles/10.3389/fonc.2025.1576461/full

_version_	1849428000621199360
author	Liu Haixian, Pang Shu, Li Zhao, Lu Chunfeng, Li Lun
author_facet	Liu Haixian, Pang Shu, Li Zhao, Lu Chunfeng, Li Lun
author_sort	Liu Haixian
collection	DOAJ
description	Background: With the rapid advances in artificial intelligence, particularly convolutional neural networks, researchers now exploit CT, PET/CT and other imaging modalities to predict epidermal growth factor receptor (EGFR) mutation status in non-small-cell lung cancer (NSCLC) non-invasively, rapidly and repeatably. End-to-end deep-learning models simultaneously perform feature extraction and classification, capturing not only traditional radiomic signatures such as tumour density and texture but also peri-tumoural micro-environmental cues, thereby offering a higher theoretical performance ceiling than hand-crafted radiomics coupled with classical machine learning. Nevertheless, the need for large, well-annotated datasets, the domain shifts introduced by heterogeneous scanning protocols and preprocessing pipelines, and the “black-box” nature of neural networks all hinder clinical adoption. To address fragmented evidence and scarce external validation, we conducted a systematic review to appraise the true performance of deep-learning and radiomics models for EGFR prediction and to identify barriers to clinical translation, thereby establishing a baseline for forthcoming multicentre prospective studies. Methods: Following PRISMA 2020, we searched PubMed, Web of Science and IEEE Xplore for studies published between 2018 and 2024. Fifty-nine original articles met the inclusion criteria. QUADAS-2 was applied to the eight studies that developed models using real-world clinical data, and details of external validation strategies and performance metrics were extracted systematically. Results: The pooled internal area under the curve (AUC) was 0.78 for radiomics–machine-learning models and 0.84 for deep-learning models. Only 17 studies (29%) reported independent external validation, where the mean AUC fell to 0.77, indicating a marked domain-shift effect. QUADAS-2 showed that 31% of studies had high risk of bias in at least one domain, most frequently in Index Test and Patient Selection. Conclusion: Although deep-learning models achieved the best internal performance, their reliance on single-centre data, the paucity of external validation and limited code availability preclude their use as stand-alone clinical decision tools. Future work should involve multicentre prospective designs, federated learning, decision-curve analysis and open sharing of models and data to verify generalisability and facilitate clinical integration.
format	Article
id	doaj-art-c1a37ff849ee41b4b3b5cd2e0c41caee
institution	Kabale University
issn	2234-943X
language	English
publishDate	2025-07-01
publisher	Frontiers Media S.A.
record_format	Article
series	Frontiers in Oncology
spelling	doaj-art-c1a37ff849ee41b4b3b5cd2e0c41caee; 2025-08-20T03:28:50Z; eng; Frontiers Media S.A.; Frontiers in Oncology; ISSN 2234-943X; 2025-07-01; vol. 15; DOI 10.3389/fonc.2025.1576461; article 1576461. Author affiliations: Liu Haixian (Respiratory and Critical Care Medicine Center, Weifang People’s Hospital, Weifang, China; The First Affiliated Hospital, Shandong Second Medical University, Weifang, China); Pang Shu (The First Affiliated Hospital, Shandong Second Medical University, Weifang, China; Precision Pathology Diagnosis Center, Weifang People’s Hospital, Weifang, China); Li Zhao (Respiratory and Critical Care Medicine Center, Weifang People’s Hospital, Weifang, China; The First Affiliated Hospital, Shandong Second Medical University, Weifang, China); Lu Chunfeng (The First Affiliated Hospital, Shandong Second Medical University, Weifang, China; Critical care medicine, Weifang People’s Hospital, Weifang, China); Li Lun (College of Mechanical Engineering and Automation, Weifang University, Weifang, China).
title	Machine learning approaches for EGFR mutation status prediction in NSCLC: an updated systematic review
topic	artificial intelligence; Non-small cell lung cancer (NSCLC); EGFR mutation; deep learning; medical imaging
url	https://www.frontiersin.org/articles/10.3389/fonc.2025.1576461/full


Similar Items