Transfer learning drives automatic HER2 scoring on HE-stained WSIs for breast cancer: a multi-cohort study

Abstract. Background: Streamlining the clinical procedure of human epidermal growth factor receptor 2 (HER2) examination is challenging. Previous studies neglected the intra-class variability within both HER2-positive and -negative groups and lacked multi-cohort validation. To address this deficiency,...

Bibliographic Details
Main Authors:	Xiaoping Li, Zhiquan Lin, Chaoran Qiu, Yiwen Zhang, Chuqian Lei, Shaofei Shen, Weibin Zhang, Chan Lai, Weiwen Li, Hui Huang, Tian Qiu
Format:	Article
Language:	English
Published:	BMC 2025-04-01
Series:	Breast Cancer Research
Subjects:	Breast cancer; HER2 scoring; Whole slide images; Multi-cohort; Transfer learning
Online Access:	https://doi.org/10.1186/s13058-025-02008-7

_version_	1849311083250057216
author	Xiaoping Li, Zhiquan Lin, Chaoran Qiu, Yiwen Zhang, Chuqian Lei, Shaofei Shen, Weibin Zhang, Chan Lai, Weiwen Li, Hui Huang, Tian Qiu
author_sort	Xiaoping Li
collection	DOAJ
description	Abstract. Background: Streamlining the clinical procedure of human epidermal growth factor receptor 2 (HER2) examination is challenging. Previous studies neglected the intra-class variability within both HER2-positive and -negative groups and lacked multi-cohort validation. To address this deficiency, this study collected data from multiple cohorts to develop a robust model for HER2 scoring using only hematoxylin and eosin (H&E)-stained whole slide images (WSIs). Methods: A total of 578 WSIs were collected from five cohorts, including three public and two private datasets. Each WSI underwent adaptive scale cropping. The transfer-learning-based probabilistic aggregation (TL-PA) model was compared with multi-instance learning (MIL)-based models; all models were trained on Cohort A and validated on Cohorts B–D. The model demonstrating superior performance was further evaluated in the neoadjuvant therapy (NAT) cohort. Scoring performance was assessed using the area under the receiver operating characteristic curve (AUC). Correlations between the model scores and specific grades (HER2 levels, pathological complete response (pCR) status, residual cancer burden (RCB) grades) were evaluated using Spearman rank correlation and Dunn's test. Patch analysis was performed with manually defined features. Results: For HER2 scoring, TL-PA significantly outperformed the MIL-based models, achieving robust AUCs in four validation cohorts (Cohort A: 0.75, Cohort B: 0.75, Cohort C: 0.77, Cohort D: 0.77). Correlation analysis confirmed a moderate association between model scores and manual reader-defined HER2-IHC status (Spearman coefficient = 0.37, P = 0.001) as well as RCB grades (Spearman coefficient = 0.45, P = 0.0006). In Cohort NAT, with non-pCR as the positive control, the AUC was 0.77. Patch analysis revealed a core-to-peritumoral decrease in probability as malignancy spread outward from the lesion's core. Conclusion: TL-PA shows robust generalization for HER2 scoring with minimal data; however, it still inadequately captures intra-class variability. This indicates that future deep-learning endeavors should incorporate more detailed annotations to better align the model's focus with the reasoning of pathologists.
format	Article
id	doaj-art-3d73f1d1057a4109bcdeaf0c8364cf0d
institution	Kabale University
issn	1465-542X
language	English
publishDate	2025-04-01
publisher	BMC
record_format	Article
series	Breast Cancer Research
spelling	doaj-art-3d73f1d1057a4109bcdeaf0c8364cf0d; 2025-08-20T03:53:32Z; eng; BMC; Breast Cancer Research; 1465-542X; 2025-04-01; volume 27, issue 1, pages 1-12; 10.1186/s13058-025-02008-7; Transfer learning drives automatic HER2 scoring on HE-stained WSIs for breast cancer: a multi-cohort study; Xiaoping Li (Breast Department, Jiangmen Central Hospital); Zhiquan Lin (Hangzhou Dianzi University); Chaoran Qiu (Breast Department, Jiangmen Central Hospital); Yiwen Zhang (Breast Department, Jiangmen Central Hospital); Chuqian Lei (Breast Department, Jiangmen Central Hospital); Shaofei Shen (Shanxi Key Lab for Modernization of TCVM, College of Life Science, Shanxi Agricultural University); Weibin Zhang (Department of Pathology, Jiangmen Central Hospital); Chan Lai (Radiology Department, Jiangmen Central Hospital); Weiwen Li (Breast Department, Jiangmen Central Hospital); Hui Huang (Department of Breast Surgery, Jiangmen Maternity and Child Health Care Hospital); Tian Qiu (Wuyi University); https://doi.org/10.1186/s13058-025-02008-7; Breast cancer; HER2 scoring; Whole slide images; Multi-cohort; Transfer learning
title	Transfer learning drives automatic HER2 scoring on HE-stained WSIs for breast cancer: a multi-cohort study
title_sort	transfer learning drives automatic her2 scoring on he stained wsis for breast cancer a multi cohort study
topic	Breast cancer; HER2 scoring; Whole slide images; Multi-cohort; Transfer learning
url	https://doi.org/10.1186/s13058-025-02008-7

Similar Items