Integrating single-cell RNA sequencing, WGCNA, and machine learning to identify key biomarkers in hepatocellular carcinoma

Bibliographic Details
Main Authors: Gang Wang, Jiaxing Zhang, Yirong Li, Yuyu Zhang, Weiwei Dong, Hengquan Wu, Jinglan Wang, Peiqing Liao, Ziqiang Yuan, Tao Liu, Wenting He
Format: Article
Language: English
Published: Nature Portfolio 2025-04-01
Series: Scientific Reports
Subjects: Machine learning; Molecular docking; WGCNA; Biomarker; Hepatocellular carcinoma
Online Access: https://doi.org/10.1038/s41598-025-95493-x
author Gang Wang
Jiaxing Zhang
Yirong Li
Yuyu Zhang
Weiwei Dong
Hengquan Wu
Jinglan Wang
Peiqing Liao
Ziqiang Yuan
Tao Liu
Wenting He
author_sort Gang Wang
collection DOAJ
description Abstract Microarray and single-cell RNA-sequencing (scRNA-seq) datasets of hepatocellular carcinoma (HCC) were downloaded from the Gene Expression Omnibus (GEO) database. Differential expression analysis and weighted gene co-expression network analysis (WGCNA) were used to identify HCC-related biomarkers. Analysis of the scRNA-seq data identified several marker genes expressed in tumor cells. Three machine-learning algorithms were used to identify shared diagnostic genes. Logistic regression analysis was then conducted to re-evaluate these genes and identify essential biomarkers, which were used to develop a diagnostic prediction model. Additionally, AutoDockTools was used for molecular docking to investigate the association between the most sensitive drug and the core proteins. Intersecting the WGCNA results, the marker genes from the scRNA-seq data, and the up-regulated differentially expressed genes (DEGs) yielded 44 genes. The three machine-learning algorithms narrowed these to CDKN3, PPIA, PRC1, GMNN, and CENPW as hub biomarkers. GMNN and PRC1 were further selected by logistic regression analysis to build a nomogram. Molecular docking showed that the drug NPK76-II-72-1 had good binding ability with the GMNN and PRC1 proteins. The results highlight CDKN3, PPIA, PRC1, GMNN, and CENPW as potential detection biomarkers for HCC patients. Our research offers novel insights into the diagnosis and treatment of HCC.
format Article
id doaj-art-e8425ab353874c49852a4614ae02ad9e
institution DOAJ
issn 2045-2322
language English
publishDate 2025-04-01
publisher Nature Portfolio
record_format Article
series Scientific Reports
spelling doaj-art-e8425ab353874c49852a4614ae02ad9e
2025-08-20T03:07:41Z
eng
Nature Portfolio
Scientific Reports
2045-2322
2025-04-01
Volume 15, issue 1, pages 1-17
10.1038/s41598-025-95493-x
Integrating single-cell RNA sequencing, WGCNA, and machine learning to identify key biomarkers in hepatocellular carcinoma
Gang Wang (School of Basic Medical Sciences, Lanzhou University)
Jiaxing Zhang (The Second Hospital & Clinical Medical School, Lanzhou University)
Yirong Li (School of Basic Medical Sciences, Lanzhou University)
Yuyu Zhang (The Second Hospital & Clinical Medical School, Lanzhou University)
Weiwei Dong (The Second Hospital & Clinical Medical School, Lanzhou University)
Hengquan Wu (The Second Hospital & Clinical Medical School, Lanzhou University)
Jinglan Wang (School of Basic Medical Sciences, Lanzhou University)
Peiqing Liao (The Second Hospital & Clinical Medical School, Lanzhou University)
Ziqiang Yuan (School of Basic Medical Sciences, Lanzhou University)
Tao Liu (School of Basic Medical Sciences, Lanzhou University)
Wenting He (School of Basic Medical Sciences, Lanzhou University)
https://doi.org/10.1038/s41598-025-95493-x
Machine learning
Molecular docking
WGCNA
Biomarker
Hepatocellular carcinoma
title Integrating single-cell RNA sequencing, WGCNA, and machine learning to identify key biomarkers in hepatocellular carcinoma
topic Machine learning
Molecular docking
WGCNA
Biomarker
Hepatocellular carcinoma
url https://doi.org/10.1038/s41598-025-95493-x