Genetic Links Between Common Lung Diseases and Lung Cancer Progression: Bioinformatics and Machine Learning Insights

Lung cancer (LC) is one of the most frequently diagnosed cancers and remains the leading cause of cancer-related mortality worldwide, representing a significant global health challenge. While numerous common lung diseases (CLDs) are implicated in LC development, the underlying causes of LC originati...

Bibliographic Details
Main Authors: Md Ali Hossain, Tania Akter Asa, Md. Zulfiker Mahmud, AKM Azad, Mohammad Zahidur Rahman, Mohammad Ali Moni, Ahmed Moustafa
Format: Article
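Language: English
Published: Ital Publication 2025-04-01
Series: Emerging Science Journal
Subjects: lung cancer; commonly lung disorders; survival curve; cox ph model; classification algorithms; ppi network; molecular pathways
Online Access: https://ijournalse.org/index.php/ESJ/article/view/2727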
author Md Ali Hossain
Tania Akter Asa
Md. Zulfiker Mahmud
AKM Azad
Mohammad Zahidur Rahman
Mohammad Ali Moni
Ahmed Moustafa
collection DOAJ
description Lung cancer (LC) is one of the most frequently diagnosed cancers and remains the leading cause of cancer-related mortality worldwide, representing a significant global health challenge. While numerous common lung diseases (CLDs) are implicated in LC development, the underlying causes of LC originating from CLDs remain inadequately elucidated. A thorough exploration of LC’s progression from CLDs is essential; our approach integrated bioinformatics and machine learning, using data from the GEO and TCGA databases. We began by identifying differentially expressed genes (DEGs) in LC and CLDs, and our gene-disease network revealed for the first time shared DEGs: LC shares significant genes with TB (36), asthma (10), pneumonia (17), COPD (18), and idiopathic pulmonary fibrosis (IPF) (78), providing insights into potential connections of LC with CLDs. This analysis not only broadened our understanding of their associations but also identified significant pathways and hub proteins (SPTBN1, KCNA4, SCN7A, KCNQ3, GRIA1, and SDC1) through a protein-protein interaction (PPI) network. Furthermore, RNA-seq and clinical data for the shared DEGs of LC and CLDs were obtained from cBioPortal to assess their impact on LC patient survival. Integrated mRNA-seq and clinical data were analyzed with univariate and multivariate Cox proportional hazards models to elucidate the influence of significant genes on survival. We also developed and deployed a predictive model leveraging the identified hub genes, which demonstrated high accuracy in predicting LC progression. The identified biomarkers and pathways hold promise for further translational research and as potential therapeutic targets, advancing understanding of LC development from CLDs. Additionally, co-expression networks among common genes were explored using Weighted Gene Co-expression Network Analysis (WGCNA). Finally, the hub genes were validated using the Human Protein Atlas (HPA) database and evaluated through various classification algorithms to ascertain their predictive power and diagnostic potential. DOI: 10.28991/ESJ-2025-09-02-021
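The shared-DEG counts reported in the description (for example, 18 genes shared between LC and COPD) amount to intersecting the LC gene list with each disease's gene list. The following is a minimal sketch of that one step, assuming each DEG list is already available as a set of gene symbols. The function name `shared_degs` and the placeholder symbols `GENE_A` to `GENE_F` are hypothetical illustrations, not data or code from the paper.

```python
from typing import Dict, List, Set


def shared_degs(cancer_degs: Set[str],
                disease_degs: Dict[str, Set[str]]) -> Dict[str, List[str]]:
    """For each disease, return the sorted genes that are DEGs in both
    the cancer set and that disease's set (a plain set intersection)."""
    return {
        disease: sorted(cancer_degs & genes)
        for disease, genes in disease_degs.items()
    }


# Toy input only: placeholder symbols, not results from the study.
lc_degs = {"GENE_A", "GENE_B", "GENE_C", "GENE_D"}
cld_degs = {
    "COPD": {"GENE_A", "GENE_E"},
    "IPF": {"GENE_C", "GENE_D", "GENE_F"},
    "asthma": {"GENE_F"},
}

shared = shared_degs(lc_degs, cld_degs)
# Number of shared genes per disease, analogous to the counts in the abstract.
counts = {disease: len(genes) for disease, genes in shared.items()}
```

In a real analysis the input sets would come from a differential-expression step with its own significance thresholds, which this sketch does not attempt to reproduce.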
format Article
id doaj-art-ec7a677671e34d5699edfcce4b03ef40
institution DOAJ
issn 2610-9182
language English
publishDate 2025-04-01
publisher Ital Publication
record_format Article
series Emerging Science Journal
spelling doaj-art-ec7a677671e34d5699edfcce4b03ef40
2025-08-20T03:09:19Z
eng
Ital Publication
Emerging Science Journal
2610-9182
2025-04-01
Volume 9, Issue 2, pp. 916-937
10.28991/ESJ-2025-09-02-021
818
Genetic Links Between Common Lung Diseases and Lung Cancer Progression: Bioinformatics and Machine Learning Insights
Md Ali Hossain: 1) Department of Computer Science & Engineering, Jahangirnagar University, Savar, Dhaka 1342, Bangladesh. 2) Health Informatics Lab, Department of Computer Science & Engineering, Daffodil International University, Dhaka 1216, Bangladesh.
Tania Akter Asa: 2) Health Informatics Lab, Department of Computer Science & Engineering, Daffodil International University, Dhaka 1216, Bangladesh. 3) Department of Computer Science and Engineering, Jagannath University, Dhaka 1100, Bangladesh.
Md. Zulfiker Mahmud: Department of Computer Science and Engineering, Jagannath University, Dhaka 1100,
AKM Azad: Department of Mathematics & Statistics, Imam Mohammad Ibn Saud Islamic University (IMSIU), Riyadh 13318,
Mohammad Zahidur Rahman: Department of Computer Science & Engineering, Jahangirnagar University, Savar, Dhaka 1342,
Mohammad Ali Moni: 5) Artificial Intelligence and Cyber Futures Institute, Charles Sturt University, Bathurst 2795, Australia. 6) Rural Health Research Institute, Charles Sturt University, Orange 2800, Australia.
Ahmed Moustafa: 7) Department of Human Anatomy and Physiology, Faculty of Health Sciences, University of Johannesburg, Doornfontein, 2094, South Africa. 8) Centre for Data Analytics and School of Psychology, Bond University, Gold Coast, Queensland, 4229, Australia.
https://ijournalse.org/index.php/ESJ/article/view/2727
lung cancer
commonly lung disorders
survival curve
cox ph model
classification algorithms
ppi network
molecular pathways.
title Genetic Links Between Common Lung Diseases and Lung Cancer Progression: Bioinformatics and Machine Learning Insights
topic lung cancer
commonly lung disorders
survival curve
cox ph model
classification algorithms
ppi network
molecular pathways.
url https://ijournalse.org/index.php/ESJ/article/view/2727