Prediction of new-onset migraine using clinical-genotypic data from the HUNT Study: a machine learning analysis
| Main Authors: | , , , , , , , , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | BMC, 2025-04-01 |
| Series: | The Journal of Headache and Pain |
| Online Access: | https://doi.org/10.1186/s10194-025-02014-2 |
| Summary: | Abstract. Background: Migraine is associated with a range of symptoms and comorbid disorders and has a strong genetic basis, but the currently identified risk loci explain only a small portion of the heritability, a gap often termed the “missing heritability”. We aimed to investigate whether machine learning could exploit the combination of genetic data and general clinical features to identify individuals at risk of new-onset migraine. Method: This study was a population-based cohort study of adults from the second and third Trøndelag Health Study (HUNT2 and HUNT3). Migraine was captured with a validated questionnaire based on modified criteria of the International Classification of Headache Disorders (ICHD), and participants underwent genome-wide genotyping. The primary outcome was new-onset migraine, defined as a change in disease status from headache-free in HUNT2 to migraine in HUNT3. The migraine risk variants identified in the largest GWAS meta-analysis of migraine were used to identify genetic input features for the models. The general clinical features included demographics, selected comorbidities, medication and stimulant use, and non-headache symptoms as predictive factors. Several standard machine learning architectures were constructed, trained, optimized and scored with the area under the receiver operating characteristic curve (AUC). The best model during training and validation was used on unseen test sets. Different methods for model explainability were employed. Results: A total of 12,995 individuals were included in the predictive modelling (491 new-onset cases). A total of 108 genetic variants and 67 general clinical variables were included in the models. The top-performing decision-tree classifier achieved a test set AUC of 0.56 when using only genotypic data, 0.68 when using only clinical data and 0.72 when using both genetic and clinical data. Combining the genotype-only and clinical-data-only models resulted in lower predictive performance, with an AUC of 0.67. The most important features were age, marital status and work situation, as well as several genetic variants. Conclusion: The combination of genotype and routine demographic and non-headache clinical data correctly predicts new-onset migraine in approximately 2 out of 3 cases, supporting that there are important genotypic-phenotypic interactions involved in new-onset migraine. |
| ISSN: | 1129-2377 |
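
The summary reports model performance as AUC. As a minimal sketch of what that metric measures, the Python below computes AUC as the probability that a randomly chosen case receives a higher predicted score than a randomly chosen non-case. The function name `roc_auc` and the toy labels and scores are assumptions made for illustration; they are not data, code, or results from the HUNT Study.

```python
def roc_auc(labels, scores):
    """Rank-based AUC: fraction of (case, non-case) pairs in which the
    case gets the higher score; tied scores count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one case and one non-case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (invented for this example, not study data):
# 1 = new-onset migraine, 0 = remained headache-free.
labels = [0, 0, 1, 0, 1, 0, 1, 0]
scores = [0.10, 0.40, 0.35, 0.20, 0.80, 0.70, 0.60, 0.05]

auc = roc_auc(labels, scores)
print(f"AUC = {auc:.2f}")
```

On this scale 0.5 corresponds to random ranking and 1.0 to perfect separation. Because AUC depends only on how cases rank against non-cases, it is a common choice when the outcome is rare, as here with 491 cases among 12,995 participants (about 3.8%).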