HER2-IHC-40x: A high-resolution histopathology dataset for HER2 IHC scoring in breast cancer

Bibliographic Details
Main Authors: Md Serajun Nabi, Mohammad Faizal Ahmad Fauzi, Zaka Ur Rehman, Hezerul Bin Abdul Karim, Phaik-Leng Cheah, Seow-Fan Chiew, Lai-Meng Looi
Format: Article
Language: English
Published: Elsevier 2025-10-01
Series: Data in Brief
Online Access: http://www.sciencedirect.com/science/article/pii/S2352340925006468
Description
Summary: HER2-IHC-40x and HER2-IHC-40x-WSI are high-resolution collections of whole slide images (WSIs) and extracted patches for HER2 immunohistochemistry (IHC) scoring in breast cancer pathology. A total of 107 WSIs were scanned at 40× magnification, and expert pathologists annotated Regions of Interest (ROIs) on them. Patches of 1024 × 1024 pixels were extracted from the ROIs and classified into four HER2 scores (0, 1+, 2+, 3+), yielding structured data for computational pathology analysis. Two splitting strategies were used. In the WSI-based split, named HER2-IHC-40x, the slides were split first and the patches were then extracted. In the patch-based split, named HER2-IHC-40x-WSI, the patches were extracted first and then split. A color-histogram-based filter was applied to remove non-tumour regions and artifacts, improving data quality. The datasets are suited to deep learning applications, including HER2 classification and explainable AI. They are freely available on Zenodo, with preprocessing scripts provided via GitHub to support reproducibility in digital pathology research.
ISSN:2352-3409
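
The two splitting strategies described in the summary can be contrasted with a minimal Python sketch. This is an illustration only, not the authors' preprocessing scripts: the patch records, the `wsi_id` field, the function names, and the 20% test fraction are all hypothetical. It shows why the order of splitting and patch extraction matters: a WSI-level split keeps every slide entirely on one side, while a patch-level split can place patches from the same slide in both train and test sets.

```python
import random


def split_by_wsi(patches, test_fraction=0.2, seed=0):
    """WSI-level split: assign whole slides to train or test, then gather their patches."""
    slide_ids = sorted({p["wsi_id"] for p in patches})
    random.Random(seed).shuffle(slide_ids)
    n_test = max(1, round(len(slide_ids) * test_fraction))
    test_slides = set(slide_ids[:n_test])
    train = [p for p in patches if p["wsi_id"] not in test_slides]
    test = [p for p in patches if p["wsi_id"] in test_slides]
    return train, test


def split_by_patch(patches, test_fraction=0.2, seed=0):
    """Patch-level split: shuffle all patches together, ignoring which slide they came from."""
    shuffled = list(patches)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def shared_slides(train, test):
    """Slides that contribute patches to both subsets."""
    return {p["wsi_id"] for p in train} & {p["wsi_id"] for p in test}


# Hypothetical toy data: 10 slides with 20 patches each (not the real dataset).
scores = ["0", "1+", "2+", "3+"]
patches = [
    {"wsi_id": f"slide_{s:03d}", "patch_idx": i, "score": scores[s % 4]}
    for s in range(10)
    for i in range(20)
]

wsi_train, wsi_test = split_by_wsi(patches)
patch_train, patch_test = split_by_patch(patches)

print("slides shared after WSI-level split:", len(shared_slides(wsi_train, wsi_test)))
print("slides shared after patch-level split:", len(shared_slides(patch_train, patch_test)))
```

A WSI-level split is generally the more conservative choice for estimating performance on unseen slides, since no slide appears on both sides of the split.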