Automated histopathological detection and classification of lung cancer with an image pre-processing pipeline and spatial attention with deep neural networks
| Main Authors: | , , , , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Taylor & Francis Group, 2024-12-01 |
| Series: | Cogent Engineering |
| Subjects: | |
| Online Access: | https://www.tandfonline.com/doi/10.1080/23311916.2024.2357182 |
| Summary: | Lung cancer is a major cancer worldwide, particularly in India. Histopathological examination of a tumorous tissue biopsy is the gold-standard method used clinically to identify the type, sub-type, and stage of cancer. Two of the most prevalent forms of lung cancer, adenocarcinoma and squamous cell carcinoma, account for nearly 80% of all lung cancer cases, which makes distinguishing the two subtypes highly important. This study proposes a data pre-processing pipeline for H&E-stained lung biopsy images along with a customized EfficientNetB3-based convolutional neural network (CNN) employing spatial attention, trained on a public three-class lung cancer histopathological image dataset. The pre-processing pipeline, applied before training, validation, and testing, enhances features of the histopathological images and removes biases due to stain variations, increasing model robustness. Using a pre-trained CNN helps the deep learning model generalize better with the pre-trained weights, while the attention mechanism [...] On three-fold validation, the classifier achieved accuracies of 0.9943 ± 0.0012 and 0.9947 ± 0.0018 and combined F1-scores of 0.9942 ± 0.0042 and 0.9833 ± 0.0216 on the validation and testing data, respectively. The model's high performance and computational efficiency could enable easy deployment without necessitating an infrastructure overhaul. |
|---|---|
| ISSN: | 2331-1916 |
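
The summary reports each metric as a mean ± spread over three folds. A minimal sketch of how such a figure can be computed is given below. It assumes the spread is the sample standard deviation, which the record does not state. The function name `summarize_folds` and the per-fold values are hypothetical placeholders, not results from the article.

```python
from statistics import mean, stdev


def summarize_folds(scores):
    """Return (mean, sample standard deviation) of per-fold scores.

    Assumption: the spread is the sample standard deviation (n - 1 in the
    denominator); the source record does not say which estimator was used.
    """
    if len(scores) < 2:
        raise ValueError("need at least two folds to estimate a spread")
    return mean(scores), stdev(scores)


# Hypothetical per-fold accuracies for illustration only (NOT from the article).
fold_accuracies = [0.90, 0.92, 0.94]

avg, spread = summarize_folds(fold_accuracies)
print(f"{avg:.4f} ± {spread:.4f}")
```

With only three folds the spread estimate is coarse, so a small reported deviation should be read as an indication of stability across folds and not as a tight confidence bound.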