Comparing 2D and 3D Feature Extraction Methods for Lung Adenocarcinoma Prediction Using CT Scans: A Cross-Cohort Study
Lung cancer stands as the most prevalent and deadliest type of cancer, with adenocarcinoma being the most common subtype. Computed Tomography (CT) is widely used for detecting tumours and their phenotype characteristics, for an early and accurate diagnosis that impacts patient outcomes. Machine lear...
| Main Authors: | Margarida Gouveia, Tânia Mendes, Eduardo M. Rodrigues, Hélder P. Oliveira, Tania Pereira |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-01-01 |
| Series: | Applied Sciences |
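| Subjects: | adenocarcinoma; computed tomography scans; deep learning; eXtreme gradient boosting; lung cancer subtype; machine learning |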
| Online Access: | https://www.mdpi.com/2076-3417/15/3/1148 |
| _version_ | 1850068365101498368 |
|---|---|
| author | Margarida Gouveia; Tânia Mendes; Eduardo M. Rodrigues; Hélder P. Oliveira; Tania Pereira |
| author_facet | Margarida Gouveia; Tânia Mendes; Eduardo M. Rodrigues; Hélder P. Oliveira; Tania Pereira |
| author_sort | Margarida Gouveia |
| collection | DOAJ |
| description | Lung cancer stands as the most prevalent and deadliest type of cancer, with adenocarcinoma being the most common subtype. Computed Tomography (CT) is widely used for detecting tumours and their phenotype characteristics, for an early and accurate diagnosis that impacts patient outcomes. Machine learning algorithms have already shown the potential to recognize patterns in CT scans to classify the cancer subtype. In this work, two distinct pipelines were employed to perform binary classification between adenocarcinoma and non-adenocarcinoma. Firstly, radiomic features were classified by Random Forest and eXtreme Gradient Boosting classifiers. Next, a deep learning approach, based on a Residual Neural Network and a Transformer-based architecture, was utilised. Both 2D and 3D CT data were initially explored, with the Lung-PET-CT-Dx dataset being employed for training and the NSCLC-Radiomics and NSCLC-Radiogenomics datasets used for external evaluation. Overall, the 3D models outperformed the 2D ones, with the best result being achieved by the Hybrid Vision Transformer, with an AUC of 0.869 and a balanced accuracy of 0.816 on the internal test set. However, a lack of generalization capability was observed across all models, with the performances decreasing on the external test sets, a limitation that should be studied and addressed in future work. |
| format | Article |
| id | doaj-art-26ed2e91aa9241f58a90375bcc63238a |
| institution | DOAJ |
| issn | 2076-3417 |
| language | English |
| publishDate | 2025-01-01 |
| publisher | MDPI AG |
| record_format | Article |
| series | Applied Sciences |
| spelling | doaj-art-26ed2e91aa9241f58a90375bcc63238a; 2025-08-20T02:48:06Z; eng; MDPI AG; Applied Sciences; 2076-3417; 2025-01-01; 15; 3; 1148; 10.3390/app15031148; Comparing 2D and 3D Feature Extraction Methods for Lung Adenocarcinoma Prediction Using CT Scans: A Cross-Cohort Study; Margarida Gouveia (0); Tânia Mendes (1); Eduardo M. Rodrigues (2); Hélder P. Oliveira (3); Tania Pereira (4); Institute for Systems and Computer Engineering, Technology and Science (INESC TEC), 4200-465 Porto, Portugal; Institute for Systems and Computer Engineering, Technology and Science (INESC TEC), 4200-465 Porto, Portugal; Institute for Systems and Computer Engineering, Technology and Science (INESC TEC), 4200-465 Porto, Portugal; Institute for Systems and Computer Engineering, Technology and Science (INESC TEC), 4200-465 Porto, Portugal; Institute for Systems and Computer Engineering, Technology and Science (INESC TEC), 4200-465 Porto, Portugal; Lung cancer stands as the most prevalent and deadliest type of cancer, with adenocarcinoma being the most common subtype. Computed Tomography (CT) is widely used for detecting tumours and their phenotype characteristics, for an early and accurate diagnosis that impacts patient outcomes. Machine learning algorithms have already shown the potential to recognize patterns in CT scans to classify the cancer subtype. In this work, two distinct pipelines were employed to perform binary classification between adenocarcinoma and non-adenocarcinoma. Firstly, radiomic features were classified by Random Forest and eXtreme Gradient Boosting classifiers. Next, a deep learning approach, based on a Residual Neural Network and a Transformer-based architecture, was utilised. Both 2D and 3D CT data were initially explored, with the Lung-PET-CT-Dx dataset being employed for training and the NSCLC-Radiomics and NSCLC-Radiogenomics datasets used for external evaluation. Overall, the 3D models outperformed the 2D ones, with the best result being achieved by the Hybrid Vision Transformer, with an AUC of 0.869 and a balanced accuracy of 0.816 on the internal test set. However, a lack of generalization capability was observed across all models, with the performances decreasing on the external test sets, a limitation that should be studied and addressed in future work.; https://www.mdpi.com/2076-3417/15/3/1148; adenocarcinoma; computed tomography scans; deep learning; eXtreme gradient boosting; lung cancer subtype; machine learning |
| spellingShingle | Margarida Gouveia; Tânia Mendes; Eduardo M. Rodrigues; Hélder P. Oliveira; Tania Pereira; Comparing 2D and 3D Feature Extraction Methods for Lung Adenocarcinoma Prediction Using CT Scans: A Cross-Cohort Study; Applied Sciences; adenocarcinoma; computed tomography scans; deep learning; eXtreme gradient boosting; lung cancer subtype; machine learning |
| title | Comparing 2D and 3D Feature Extraction Methods for Lung Adenocarcinoma Prediction Using CT Scans: A Cross-Cohort Study |
| title_full | Comparing 2D and 3D Feature Extraction Methods for Lung Adenocarcinoma Prediction Using CT Scans: A Cross-Cohort Study |
| title_fullStr | Comparing 2D and 3D Feature Extraction Methods for Lung Adenocarcinoma Prediction Using CT Scans: A Cross-Cohort Study |
| title_full_unstemmed | Comparing 2D and 3D Feature Extraction Methods for Lung Adenocarcinoma Prediction Using CT Scans: A Cross-Cohort Study |
| title_short | Comparing 2D and 3D Feature Extraction Methods for Lung Adenocarcinoma Prediction Using CT Scans: A Cross-Cohort Study |
| title_sort | comparing 2d and 3d feature extraction methods for lung adenocarcinoma prediction using ct scans a cross cohort study |
| topic | adenocarcinoma; computed tomography scans; deep learning; eXtreme gradient boosting; lung cancer subtype; machine learning |
| url | https://www.mdpi.com/2076-3417/15/3/1148 |
| work_keys_str_mv | AT margaridagouveia comparing2dand3dfeatureextractionmethodsforlungadenocarcinomapredictionusingctscansacrosscohortstudy AT taniamendes comparing2dand3dfeatureextractionmethodsforlungadenocarcinomapredictionusingctscansacrosscohortstudy AT eduardomrodrigues comparing2dand3dfeatureextractionmethodsforlungadenocarcinomapredictionusingctscansacrosscohortstudy AT helderpoliveira comparing2dand3dfeatureextractionmethodsforlungadenocarcinomapredictionusingctscansacrosscohortstudy AT taniapereira comparing2dand3dfeatureextractionmethodsforlungadenocarcinomapredictionusingctscansacrosscohortstudy |
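
The description field reports results as AUC and balanced accuracy for a binary task (adenocarcinoma vs. non-adenocarcinoma). As a reading aid, the sketch below shows how these two metrics are commonly computed. It is not the authors' code: the function names, the toy labels and scores, and the 0.5 decision threshold are all illustrative assumptions, and none of the numbers come from the study.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels.

    Label 1 is assumed to mean adenocarcinoma, 0 non-adenocarcinoma.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    pos = sum(1 for t in y_true if t == 1)
    neg = len(y_true) - pos
    return 0.5 * (tp / pos + tn / neg)


def roc_auc(y_true, scores):
    """AUC via pairwise comparison.

    Equals the fraction of (positive, negative) pairs in which the
    positive case receives the higher score; ties count as half.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy, made-up data: 3 positive and 5 negative cases.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1, 0.05]
# Hypothetical 0.5 threshold to turn scores into class predictions.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print(round(balanced_accuracy(y_true, y_pred), 3))
print(round(roc_auc(y_true, scores), 3))
```

Balanced accuracy weights both classes equally, which matters when one subtype dominates a cohort; AUC is threshold-free, so the two numbers can move differently when a model is applied to an external dataset.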