Identifying novel risk factors for aneurysmal subarachnoid haemorrhage using machine learning

Abstract Aneurysmal subarachnoid haemorrhage (aSAH) is a type of stroke with high mortality and morbidity. This study aimed to identify novel aSAH risk factors by combining machine learning (ML) and traditional statistical methods. Using the UK Biobank, we identified aSAH cases via hospital-based IC...

Bibliographic Details
Main Authors: Jos P. Kanning, Junfeng Wang, Shahab Abtahi, Mirjam I. Geerlings, Ynte M. Ruigrok
Format: Article
Language: English
Published: Nature Portfolio 2025-03-01
Series: Scientific Reports
Online Access: https://doi.org/10.1038/s41598-025-88826-3
author Jos P. Kanning
Junfeng Wang
Shahab Abtahi
Mirjam I. Geerlings
Ynte M. Ruigrok
author_sort Jos P. Kanning
collection DOAJ
description Abstract Aneurysmal subarachnoid haemorrhage (aSAH) is a type of stroke with high mortality and morbidity. This study aimed to identify novel aSAH risk factors by combining machine learning (ML) and traditional statistical methods. Using the UK Biobank, we identified aSAH cases via hospital-based ICD codes and analysed 618 baseline variables covering demographics, lifestyle, medical history, and physical measurements. The CatBoost ML algorithm and Shapley Additive Explanations (SHAP) identified the top 25 variables most influential in predicting aSAH. Logistic regression further described these variables while adjusting for established aSAH risk factors. Among 501,847 participants, 893 aSAH cases were identified. ML identified 214 variables with non-zero SHAP values. Logistic regression of the top 25 variables revealed four potential novel aSAH risk factors. Increased aSAH risk was associated with mean sphered cell volume (OR 1.02, 95% CI 1.00–1.03) and tea intake (OR 1.03, 95% CI 1.01–1.05). Decreased aSAH risk was associated with peak expiratory flow (OR 0.80, 95% CI 0.66–0.96) and haematocrit percentage (OR 0.97, 95% CI 0.95–1.00). Future research should validate these findings and explore the potential non-linear relationships and interactions indicated by the ML models.
format Article
id doaj-art-1b4f8a33dd324159af2f9e1870c50f45
institution DOAJ
issn 2045-2322
language English
publishDate 2025-03-01
publisher Nature Portfolio
record_format Article
series Scientific Reports
spelling doaj-art-1b4f8a33dd324159af2f9e1870c50f45
2025-08-20T02:41:33Z
eng
Nature Portfolio
Scientific Reports
2045-2322
2025-03-01
15119
10.1038/s41598-025-88826-3
Identifying novel risk factors for aneurysmal subarachnoid haemorrhage using machine learning
Jos P. Kanning (Department of Neurology and Neurosurgery, UMC Utrecht Brain Center, University Medical Center Utrecht)
Junfeng Wang (Division of Pharmacoepidemiology and Clinical Pharmacology, Utrecht Institute for Pharmaceutical Sciences, Utrecht University)
Shahab Abtahi (Division of Pharmacoepidemiology and Clinical Pharmacology, Utrecht Institute for Pharmaceutical Sciences, Utrecht University)
Mirjam I. Geerlings (Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University)
Ynte M. Ruigrok (Department of Neurology and Neurosurgery, UMC Utrecht Brain Center, University Medical Center Utrecht)
Abstract Aneurysmal subarachnoid haemorrhage (aSAH) is a type of stroke with high mortality and morbidity. This study aimed to identify novel aSAH risk factors by combining machine learning (ML) and traditional statistical methods. Using the UK Biobank, we identified aSAH cases via hospital-based ICD codes and analysed 618 baseline variables covering demographics, lifestyle, medical history, and physical measurements. The CatBoost ML algorithm and Shapley Additive Explanations (SHAP) identified the top 25 variables most influential in predicting aSAH. Logistic regression further described these variables while adjusting for established aSAH risk factors. Among 501,847 participants, 893 aSAH cases were identified. ML identified 214 variables with non-zero SHAP values. Logistic regression of the top 25 variables revealed four potential novel aSAH risk factors. Increased aSAH risk was associated with mean sphered cell volume (OR 1.02, 95% CI 1.00–1.03) and tea intake (OR 1.03, 95% CI 1.01–1.05). Decreased aSAH risk was associated with peak expiratory flow (OR 0.80, 95% CI 0.66–0.96) and haematocrit percentage (OR 0.97, 95% CI 0.95–1.00). Future research should validate these findings and explore the potential non-linear relationships and interactions indicated by the ML models.
https://doi.org/10.1038/s41598-025-88826-3
title Identifying novel risk factors for aneurysmal subarachnoid haemorrhage using machine learning
url https://doi.org/10.1038/s41598-025-88826-3