Integrating bioinformatics and machine learning to identify glomerular injury genes and predict drug targets in diabetic nephropathy

Abstract Diabetes mellitus (DM) is a chronic metabolic disorder that poses significant challenges to public health. Among its various complications, diabetic nephropathy (DN) emerges as a critical microvascular complication associated with high mortality rates. Despite the development of diverse the...

Bibliographic Details
Main Authors: Li Zhang, ZhenPeng Sun, Yao Yuan, Jie Sheng
Format: Article
Language: English
Published: Nature Portfolio 2025-05-01
Series: Scientific Reports
Subjects: Diabetic nephropathy; Biomarkers; Bioinformatics; GEO; Glomerular injury; Drug prediction
Online Access: https://doi.org/10.1038/s41598-025-01628-5
_version_ 1849744700243705856
author Li Zhang
ZhenPeng Sun
Yao Yuan
Jie Sheng
collection DOAJ
description Abstract: Diabetes mellitus (DM) is a chronic metabolic disorder that poses significant challenges to public health. Among its complications, diabetic nephropathy (DN) is a critical microvascular complication associated with high mortality. Despite therapeutic strategies targeting metabolic improvement, hemodynamic regulation, and fibrosis mitigation, the precise mechanisms responsible for glomerular injury in DN are not yet fully elucidated. To explore these mechanisms, public DN datasets (GSE30528, GSE104948, and GSE96804) were obtained from the GEO database. We merged the GSE30528 and GSE104948 datasets and identified differentially expressed genes (DEGs) between DN and control groups using R software. Weighted gene co-expression network analysis (WGCNA) was then used to identify DN-associated genes in key modules. We used Venny software to find the genes shared between the DEGs and the key module genes, and subjected these shared genes to Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses. Using least absolute shrinkage and selection operator (LASSO), support vector machine (SVM), and random forest (RF) methods, we isolated five significant genes: FN1, C1orf21, CD36, CD48, and SRPX2. These genes were further validated using a logistic model and 10-fold cross-validation. The external dataset GSE96804 was used to validate the identified biomarkers, and receiver operating characteristic (ROC) curve analysis assessed their diagnostic efficacy for DN. Additionally, GSE104948 was used to compare biomarker expression levels between DN and five other kidney diseases, highlighting their specificity for DN. The biomarkers also enabled the identification and validation of two molecular subtypes characterized by distinct immune profiles. The Nephroseq v5 database corroborated the correlation between the biomarkers and clinical data.
The DSigDB database was used to predict protein-drug interactions, and molecular docking supported the therapeutic potential of these drug targets. Finally, a diabetic mouse model (BKS-db) was constructed, and RT-qPCR experiments validated the reliability of the identified biomarkers. The study identified five biomarkers with robust diagnostic predictive power for DN. Subtype classification based on these biomarkers revealed distinct enrichment pathways and immune cell infiltration profiles, underscoring the close relationship between these genes and immune functions in DN. Drug prediction and molecular docking analyses indicated strong binding affinities of the candidate drugs for the target proteins. Differential expression analysis between DN and five other kidney diseases indicated that all biomarkers except C1orf21 were highly expressed in DN. Because the mouse model lacks the C1orf21 gene, RT-qPCR was limited to the other four genes and confirmed upregulated expression of FN1, CD36, CD48, and SRPX2. These five biomarkers have potential diagnostic and therapeutic value for DN: they offer insights into the regulatory mechanisms underlying glomerular injury and provide a theoretical foundation for developing diagnostic biomarkers and therapeutic targets for DN-associated glomerular injury.
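Two steps named in the abstract, the overlap of gene lists and the ROC assessment of a single marker, can be illustrated with a minimal sketch. It is not the authors' code (the abstract states that R was used), and it assumes that the five genes are the overlap of the three selection methods, which the abstract does not state explicitly. The function names `shared_genes` and `roc_auc`, the placeholder genes `GENE_A`, `GENE_B`, `GENE_C`, and all expression values are hypothetical.

```python
from typing import Iterable, Sequence, Set


def shared_genes(*gene_sets: Iterable[str]) -> Set[str]:
    """Return the genes present in every input list (the Venn-diagram overlap)."""
    sets = [set(genes) for genes in gene_sets]
    if not sets:
        return set()
    return set.intersection(*sets)


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve, computed as the Mann-Whitney U statistic.

    This equals the probability that a randomly chosen case (label 1)
    scores higher than a randomly chosen control (label 0); ties count 0.5.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Hypothetical selections: only the five shared genes come from the abstract.
lasso_hits = {"FN1", "C1orf21", "CD36", "CD48", "SRPX2", "GENE_A"}
svm_hits = {"FN1", "C1orf21", "CD36", "CD48", "SRPX2", "GENE_B"}
rf_hits = {"FN1", "C1orf21", "CD36", "CD48", "SRPX2", "GENE_A", "GENE_C"}
consensus = shared_genes(lasso_hits, svm_hits, rf_hits)

# Made-up expression values for one marker; 1 = DN sample, 0 = control.
expression = [8.1, 7.9, 7.4, 6.0, 6.2, 7.4]
status = [1, 1, 1, 0, 0, 0]
auc = roc_auc(expression, status)
```

An AUC near 1 means the marker separates DN samples from controls almost perfectly, and 0.5 means it does no better than chance. The pairwise comparison is quadratic in sample size, which is acceptable for cohorts of the size typically deposited in GEO.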
format Article
id doaj-art-90b072d4f2a64361b54bced71c1af0bd
institution DOAJ
issn 2045-2322
language English
publishDate 2025-05-01
publisher Nature Portfolio
record_format Article
series Scientific Reports
spelling doaj-art-90b072d4f2a64361b54bced71c1af0bd 2025-08-20T03:10:13Z eng Nature Portfolio Scientific Reports 2045-2322 2025-05-01 15 1 1 21 10.1038/s41598-025-01628-5
Li Zhang (0): Department of Epidemiology and Statistics, College of Public Health, Zhengzhou University
ZhenPeng Sun (1): Department of Urology, Xi’an Daxing Hospital
Yao Yuan (2): Department of Pharmacology, College of Pharmacy, Army Medical University
Jie Sheng (3): School of Basic Medical Sciences, Chongqing Medical University
title Integrating bioinformatics and machine learning to identify glomerular injury genes and predict drug targets in diabetic nephropathy
topic Diabetic nephropathy
Biomarkers
Bioinformatics
GEO
Glomerular injury
Drug prediction
url https://doi.org/10.1038/s41598-025-01628-5