Population modeling with machine learning can enhance measures of mental health - Open-data replication
Efforts to predict trait phenotypes based on functional MRI data from large cohorts have been hampered by low prediction accuracy and/or small effect sizes. Although these findings are highly replicable, the small effect sizes are somewhat surprising given the presumed brain basis of phenotypic trai...
| Main Authors: | Ty Easley, Ruiqi Chen, Kayla Hannon, Rosie Dutt, Janine Bijsterbosch |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Elsevier, 2023-06-01 |
| Series: | NeuroImage: Reports |
| Subjects: | Replication; Prediction; Neuroticism; Intelligence; Data pollution; Resting state fMRI |
| Online Access: | http://www.sciencedirect.com/science/article/pii/S2666956023000089 |
| _version_ | 1849703436668370944 |
|---|---|
| author | Ty Easley Ruiqi Chen Kayla Hannon Rosie Dutt Janine Bijsterbosch |
| collection | DOAJ |
| description | Efforts to predict trait phenotypes based on functional MRI data from large cohorts have been hampered by low prediction accuracy and/or small effect sizes. Although these findings are highly replicable, the small effect sizes are somewhat surprising given the presumed brain basis of phenotypic traits such as neuroticism and fluid intelligence. We aim to replicate previous work and additionally test multiple data manipulations that may improve prediction accuracy by addressing data pollution challenges. Specifically, we added additional fMRI features, averaged the target phenotype across multiple measurements to obtain more accurate estimates of the underlying trait, balanced the target phenotype's distribution through undersampling of majority scores, and identified data-driven subtypes to investigate the impact of between-participant heterogeneity. Our results replicated prior results from Dadi et al. (2021) in a larger sample. Each data manipulation further led to small but consistent improvements in prediction accuracy, which were largely additive when combining multiple data manipulations. Combining data manipulations (i.e., extended fMRI features, averaged target phenotype, balanced target phenotype distribution) led to a three-fold increase in prediction accuracy for fluid intelligence compared to prior work. These findings highlight the benefit of several relatively easy and low-cost data manipulations, which may positively impact future work. |
| format | Article |
| id | doaj-art-06550f2f7ac04d56ac62eafa7d5459c5 |
| institution | DOAJ |
| issn | 2666-9560 |
| language | English |
| publishDate | 2023-06-01 |
| publisher | Elsevier |
| record_format | Article |
| series | NeuroImage: Reports |
| spelling | doaj-art-06550f2f7ac04d56ac62eafa7d5459c5; 2025-08-20T03:17:18Z; eng; Elsevier; NeuroImage: Reports; ISSN 2666-9560; 2023-06-01; volume 3, issue 2, article 100163; DOI 10.1016/j.ynirp.2023.100163. Authors and affiliations: Ty Easley (Department of Radiology, Washington University School of Medicine, Saint Louis, Missouri, 63110, USA); Ruiqi Chen (Division of Biology and Biomedical Sciences, Washington University in St. Louis, Saint Louis, Missouri, 63110, USA); Kayla Hannon (Department of Radiology, Washington University School of Medicine, Saint Louis, Missouri, 63110, USA); Rosie Dutt (Department of Radiology, Washington University School of Medicine, Saint Louis, Missouri, 63110, USA); Janine Bijsterbosch (Department of Radiology, Washington University School of Medicine, Saint Louis, Missouri, 63110, USA; corresponding author). |
| title | Population modeling with machine learning can enhance measures of mental health - Open-data replication |
| topic | Replication Prediction Neuroticism Intelligence Data pollution Resting state fMRI |
| url | http://www.sciencedirect.com/science/article/pii/S2666956023000089 |
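
Two of the data manipulations named in the description, averaging the target phenotype across repeated measurements and undersampling majority scores, can be illustrated with a minimal Python sketch. This record does not describe the authors' actual procedure, so everything below is an assumption made for illustration: the function names `average_repeated_measures` and `undersample_majority`, the use of `None` for a missing visit, and the rule of capping each distinct score at `cap` participants are all hypothetical.

```python
import random
from collections import defaultdict
from statistics import mean


def average_repeated_measures(visits):
    """Average each participant's available scores.

    `visits` holds one row per participant; None marks a missing visit
    (an assumed encoding). A row with no scores at all raises
    statistics.StatisticsError.
    """
    return [mean([s for s in row if s is not None]) for row in visits]


def undersample_majority(scores, cap, seed=0):
    """Return sorted participant indices, keeping at most `cap` per distinct score.

    Capping every score value at a fixed count is an assumed balancing
    rule, not necessarily the one used in the paper.
    """
    rng = random.Random(seed)
    by_score = defaultdict(list)
    for index, score in enumerate(scores):
        by_score[score].append(index)
    keep = []
    for indices in by_score.values():
        # Randomly thin only the over-represented scores.
        keep.extend(rng.sample(indices, cap) if len(indices) > cap else indices)
    return sorted(keep)


# Toy data: three participants with up to two visits each.
averaged = average_repeated_measures([[5.0, 7.0], [6.0, None], [4.0, 4.0]])

# Toy data: score 6 is the majority value and is thinned to two participants.
scores = [6, 6, 6, 6, 4, 4, 7]
kept = undersample_majority(scores, cap=2)
balanced_scores = [scores[i] for i in kept]
```

In this sketch the indices returned by `undersample_majority` would be used to subset both the phenotype scores and the matching fMRI feature rows, so that features and targets stay aligned.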