Transfer learning drives automatic HER2 scoring on HE-stained WSIs for breast cancer: a multi-cohort study

Abstract Background Streamlining the clinical procedure of human epidermal growth factor receptor 2 (HER2) examination is challenging. Previous studies neglected the intra-class variability within both HER2-positive and -negative groups and lacked multi-cohort validation. To address this deficiency,...

Bibliographic Details
Main Authors: Xiaoping Li, Zhiquan Lin, Chaoran Qiu, Yiwen Zhang, Chuqian Lei, Shaofei Shen, Weibin Zhang, Chan Lai, Weiwen Li, Hui Huang, Tian Qiu
Format: Article
Language: English
Published: BMC 2025-04-01
Series: Breast Cancer Research
Subjects: Breast cancer, HER2 scoring, Whole slide images, Multi-cohort, Transfer learning
Online Access: https://doi.org/10.1186/s13058-025-02008-7
author Xiaoping Li
Zhiquan Lin
Chaoran Qiu
Yiwen Zhang
Chuqian Lei
Shaofei Shen
Weibin Zhang
Chan Lai
Weiwen Li
Hui Huang
Tian Qiu
collection DOAJ
description Abstract
Background: Streamlining the clinical procedure for human epidermal growth factor receptor 2 (HER2) examination is challenging. Previous studies neglected the intra-class variability within both the HER2-positive and HER2-negative groups and lacked multi-cohort validation. To address these gaps, this study collected data from multiple cohorts to develop a robust model for HER2 scoring using only hematoxylin and eosin (H&E)-stained whole slide images (WSIs).
Methods: A total of 578 WSIs were collected from five cohorts, comprising three public and two private datasets. Each WSI underwent adaptive scale cropping. The transfer-learning-based probabilistic aggregation (TL-PA) model was compared with multi-instance learning (MIL)-based models; all models were trained on Cohort A and validated on Cohorts B–D. The better-performing model was further evaluated in the neoadjuvant therapy (NAT) cohort. Scoring performance was assessed using the area under the receiver operating characteristic curve (AUC). Correlations between the model scores and specific grades (HER2 levels, pathological complete response (pCR) status, residual cancer burden (RCB) grades) were evaluated using Spearman rank correlation and Dunn's test. Patch analysis was performed with manually defined features.
Results: For HER2 scoring, TL-PA significantly outperformed the MIL-based models, achieving robust AUCs in four validation cohorts (Cohort A: 0.75, Cohort B: 0.75, Cohort C: 0.77, Cohort D: 0.77). Correlation analysis confirmed a moderate association between model scores and manual reader-defined HER2-IHC status (Spearman coefficient = 0.37, P = 0.001) as well as RCB grades (Spearman coefficient = 0.45, P = 0.0006). In Cohort NAT, with non-pCR as the positive class, the AUC was 0.77. Patch analysis revealed a pattern of decreasing probability from the core to the peritumoral region, as malignancy spread outward from the lesion's core.
Conclusion: TL-PA shows robust generalization for HER2 scoring with minimal data; however, it still inadequately captures intra-class variability. This indicates that future deep-learning work should incorporate more detailed annotations to better align the model's focus with the reasoning of pathologists.
format Article
id doaj-art-3d73f1d1057a4109bcdeaf0c8364cf0d
institution Kabale University
issn 1465-542X
language English
publishDate 2025-04-01
publisher BMC
record_format Article
series Breast Cancer Research
spelling doaj-art-3d73f1d1057a4109bcdeaf0c8364cf0d
2025-08-20T03:53:32Z
eng
BMC
Breast Cancer Research
1465-542X
2025-04-01
Volume 27, issue 1, pages 1-12
10.1186/s13058-025-02008-7
Transfer learning drives automatic HER2 scoring on HE-stained WSIs for breast cancer: a multi-cohort study
Xiaoping Li (Breast Department, Jiangmen Central Hospital)
Zhiquan Lin (Hangzhou Dianzi University)
Chaoran Qiu (Breast Department, Jiangmen Central Hospital)
Yiwen Zhang (Breast Department, Jiangmen Central Hospital)
Chuqian Lei (Breast Department, Jiangmen Central Hospital)
Shaofei Shen (Shanxi Key Lab for Modernization of TCVM, College of Life Science, Shanxi Agricultural University)
Weibin Zhang (Department of Pathology, Jiangmen Central Hospital)
Chan Lai (Radiology Department, Jiangmen Central Hospital)
Weiwen Li (Breast Department, Jiangmen Central Hospital)
Hui Huang (Department of Breast Surgery, Jiangmen Maternity and Child Health Care Hospital)
Tian Qiu (Wuyi University)
https://doi.org/10.1186/s13058-025-02008-7
Breast cancer
HER2 scoring
Whole slide images
Multi-cohort
Transfer learning
title Transfer learning drives automatic HER2 scoring on HE-stained WSIs for breast cancer: a multi-cohort study
topic Breast cancer
HER2 scoring
Whole slide images
Multi-cohort
Transfer learning
url https://doi.org/10.1186/s13058-025-02008-7
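Illustrative note: the abstract in this record evaluates slide-level scores with the AUC and with Spearman rank correlation. The sketch below shows how those two statistics could be computed once patch-level probabilities have been aggregated into one score per slide. It is not the authors' TL-PA implementation: the record does not say how patch probabilities are aggregated, so the mean used in `slide_score` is an assumption, all function names are hypothetical, and the toy data are invented for illustration. Dunn's test is omitted.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        tied_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = tied_rank
        start = end + 1
    return ranks


def roc_auc(labels, scores):
    """AUC via the rank-sum (Mann-Whitney U) identity; labels are 0 or 1."""
    ranks = average_ranks(scores)
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    pos_rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def spearman_rho(x, y):
    """Spearman coefficient: Pearson correlation of the tie-averaged ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def slide_score(patch_probs):
    """Assumed aggregation (not from the record): mean patch probability."""
    return mean(patch_probs)


# Toy data, invented purely for illustration (not from the study).
patches = [[0.9, 0.8, 0.7], [0.2, 0.3, 0.1], [0.6, 0.7, 0.5], [0.4, 0.2, 0.3]]
her2_positive = [1, 0, 1, 0]   # binary HER2 status per slide
ihc_level = [3, 0, 2, 1]       # ordinal HER2-IHC level per slide

scores = [slide_score(p) for p in patches]
print("AUC:", roc_auc(her2_positive, scores))
print("Spearman:", spearman_rho(scores, ihc_level))
```

The rank-based AUC treats tied scores as half-correct, which matches the usual trapezoidal ROC definition; a real analysis would more likely rely on an established statistics library and would also report P values.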