An integrated machine learning framework for developing and validating diagnostic models and drug predictions based on ulcerative colitis genes

Ulcerative colitis (UC) is a chronic inflammatory bowel disease that causes inflammation in the intestines and triggers autoimmune responses. This study aims to identify immune-related biomarkers for UC and explore potential therapeutic targets. First, we downloaded the exp...

Bibliographic Details
Main Authors: Na An, Zhongwen Lu, Yang Li, Bing Yang, Shaozhen Ji, Xu Dong, Zhaoliang Ding
Format: Article
Language: English
Published: Frontiers Media S.A. 2025-06-01
Series: Frontiers in Medicine
Subjects: ulcerative colitis; machine learning; immunity; molecular docking; dynamics; single cell
Online Access: https://www.frontiersin.org/articles/10.3389/fmed.2025.1571529/full
_version_ 1849331536702210048
author Na An
Zhongwen Lu
Yang Li
Bing Yang
Shaozhen Ji
Xu Dong
Zhaoliang Ding
collection DOAJ
description Ulcerative colitis (UC) is a chronic inflammatory bowel disease that causes inflammation in the intestines and triggers autoimmune responses. This study aims to identify immune-related biomarkers for UC and explore potential therapeutic targets. First, we downloaded the expression profiles of datasets GSE87466, GSE87473, and GSE92415 from the GEO database. Next, we identified differentially expressed genes (DEGs) associated with UC. Using the WGCNA algorithm, we screened key module genes in UC, and we retrieved immune-related genes (IRGs) from the ImmPort database. We identified immune-related differentially expressed genes by intersecting the WGCNA module genes, the DEGs, and the IRGs. To build a diagnostic model for UC, we applied 113 combinations of 12 machine learning algorithms, with 10-fold cross-validation on the training set and external validation on the test set. The single-cell results presented the cell atlas of UC and indicated that the key genes were significantly associated with macrophages, epithelial cells, and fibroblasts. Quantitative polymerase chain reaction (q-PCR) was used to verify the expression levels of the core biomarkers selected by machine learning. We conducted enrichment analysis using Gene Ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG), and gene set enrichment analysis (GSEA), which revealed biological processes and signaling pathways associated with UC. Immune cell infiltration analysis based on CIBERSORT was also performed. We also screened potential drugs from the DSigDB drug database and evaluated these candidates with molecular docking and molecular dynamics simulations. The results suggested that compounds such as thalidomide and troglitazone are promising candidates for new UC drug development. Our findings provide insights into the pathogenesis of UC, its clinical treatment, and potential drug development.
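The intersection step described in the abstract (WGCNA module genes, DEGs, and ImmPort IRGs) can be illustrated with a minimal Python sketch. This is not the authors' code: the function name intersect_gene_sets and the gene identifiers (GENE_A and so on) are hypothetical placeholders, and the sets below are not results from the study.

```python
# Minimal sketch of a three-way gene-set intersection.
# All identifiers below are hypothetical placeholders, not study data.
deg_genes = {"GENE_A", "GENE_B", "GENE_C", "GENE_D"}      # differentially expressed genes
wgcna_module_genes = {"GENE_B", "GENE_C", "GENE_E"}       # key-module genes from WGCNA
immport_irgs = {"GENE_C", "GENE_B", "GENE_F"}             # immune-related genes (ImmPort)


def intersect_gene_sets(*gene_sets):
    """Return, in sorted order, the genes present in every supplied set."""
    if not gene_sets:
        return []
    common = set(gene_sets[0]).intersection(*gene_sets[1:])
    return sorted(common)


immune_related_degs = intersect_gene_sets(deg_genes, wgcna_module_genes, immport_irgs)
print(immune_related_degs)
```

With real inputs, each set would presumably be read from the corresponding analysis output; only genes supported by all three sources survive the intersection.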
format Article
id doaj-art-4832ff5c265341ba803a8e03f6fbb39e
institution Kabale University
issn 2296-858X
language English
publishDate 2025-06-01
publisher Frontiers Media S.A.
record_format Article
series Frontiers in Medicine
spelling doaj-art-4832ff5c265341ba803a8e03f6fbb39e
2025-08-20T03:46:33Z
eng
Frontiers Media S.A.
Frontiers in Medicine
2296-858X
2025-06-01
12
10.3389/fmed.2025.1571529
1571529
An integrated machine learning framework for developing and validating diagnostic models and drug predictions based on ulcerative colitis genes
Na An (0): Shandong University of Traditional Chinese Medicine, Jinan, China
Zhongwen Lu (1): The Third Affiliated Hospital, Beijing University of Chinese Medicine, Beijing, China
Yang Li (2): Zibo City Fourth People’s Hospital, Zibo, China
Bing Yang (3): Rizhao Hospital of Traditional Chinese Medicine, Rizhao, China
Shaozhen Ji (4): Zibo City Fourth People’s Hospital, Zibo, China
Xu Dong (5): Shandong University of Traditional Chinese Medicine, Jinan, China
Zhaoliang Ding (6): Rizhao Hospital of Traditional Chinese Medicine, Rizhao, China
https://www.frontiersin.org/articles/10.3389/fmed.2025.1571529/full
ulcerative colitis
machine learning
immunity
molecular docking
dynamics
single cell
title An integrated machine learning framework for developing and validating diagnostic models and drug predictions based on ulcerative colitis genes
topic ulcerative colitis
machine learning
immunity
molecular docking
dynamics
single cell
url https://www.frontiersin.org/articles/10.3389/fmed.2025.1571529/full