Enhancing neuromolecular imaging classification in low-data regimes with generative machine learning: A case study in HDAC PET/MR imaging of alcohol use disorder

Bibliographic Details
Main Authors: Tyler N. Meyer, Olga Andreeva, Roger D. Weiss, Wei Ding, Iris Shen, Changning Wang, Ping Chen, Tewodros Mulugeta Dagnew
Format: Article
Language: English
Published: Elsevier 2025-12-01
Series: Neuroscience Informatics
Online Access: http://www.sciencedirect.com/science/article/pii/S2772528625000408
Description:

Introduction: Positron Emission Tomography (PET) is a vital modality for investigating brain-related disorders. However, data scarcity, especially for novel molecular targets such as neuroepigenetic enzymes, combined with difficult-to-recruit patient populations, limits the development of machine learning (ML) models. Our primary objective is to enhance single-subject classification of neuromolecular imaging data and facilitate biomarker discovery. We demonstrate our approach using histone deacetylase (HDAC) PET/MR imaging in Alcohol Use Disorder (AUD).

Methods: We propose the Catalysis Training pipeline, a framework that augments real imaging data with high-quality synthetic data generated by a Wasserstein Conditional Generative Adversarial Network (WCGAN). Using [11C]Martinostat PET/MR imaging, we extracted 1-D standardized uptake value ratio (SUVR) tabular features representing HDAC enzyme expression density across eight cingulate subregions. These features were used to train and test ML classifiers, including Support Vector Machine (SVM), XGBoost, and Random Forest, under leave-one-out cross-validation.

Results: Integrating synthetic data into the training process significantly improved classification accuracy: by 26 percentage points for XGBoost and Random Forest (from 59% to 85%) and by 18 percentage points for SVM (from 70% to 88%). Synthetic samples improved model generalizability. Key hemispheric and subregional cingulate HDAC patterns were also identified as potential biomarkers.

Conclusion: Our results demonstrate that generative AI can help overcome data scarcity in low-data-regime neuroimaging applications. Catalysis Training provides a scalable strategy to enhance ML-driven biomarker discovery and disease classification, especially for rare or difficult-to-study disorders like AUD. Clinically, cingulate HDAC expression measured by [11C]Martinostat PET/MR shows promise as an objective biomarker for AUD, complementing DSM-based diagnosis and informing novel treatment strategies.
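The evaluation scheme described in the Methods (real data augmented with synthetic samples, classifiers scored under leave-one-out cross-validation) can be illustrated with a minimal sketch. This is not the authors' implementation. Everything here is an assumption made for illustration: the toy data, the function names, the nearest-centroid classifier standing in for SVM/XGBoost/Random Forest, and the Gaussian-jitter generator standing in for a trained WCGAN. The sketch also assumes that synthetic samples are generated from the training fold only, so the held-out subject is always a real sample that the generator never saw; the abstract does not state how this was handled.

```python
import random
from statistics import mean


def make_toy_data(n_per_class=10, n_features=8, seed=0):
    """Toy stand-in for eight regional SUVR features per subject (not real data)."""
    rng = random.Random(seed)
    X, y = [], []
    for label, center in ((0, 1.0), (1, 1.3)):
        for _ in range(n_per_class):
            X.append([rng.gauss(center, 0.15) for _ in range(n_features)])
            y.append(label)
    return X, y


def synthesize(X_train, y_train, n_per_class, rng):
    """Placeholder generator: jitters real training rows with Gaussian noise.

    A trained conditional GAN would be called here instead; this stand-in
    only keeps the sketch self-contained.
    """
    X_syn, y_syn = [], []
    for label in sorted(set(y_train)):
        rows = [x for x, t in zip(X_train, y_train) if t == label]
        for _ in range(n_per_class):
            base = rng.choice(rows)
            X_syn.append([v + rng.gauss(0.0, 0.05) for v in base])
            y_syn.append(label)
    return X_syn, y_syn


def fit_centroids(X, y):
    """Nearest-centroid 'training': one mean feature vector per class."""
    centroids = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [mean(col) for col in zip(*rows)]
    return centroids


def predict(centroids, x):
    """Return the label whose centroid is closest to x (squared Euclidean)."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def loocv_accuracy(X, y, n_synthetic_per_class=0, seed=0):
    """Leave-one-out accuracy; synthetic rows are added to the training fold only."""
    rng = random.Random(seed)
    correct = 0
    for i in range(len(X)):
        X_train = X[:i] + X[i + 1:]
        y_train = y[:i] + y[i + 1:]
        if n_synthetic_per_class > 0:
            X_syn, y_syn = synthesize(X_train, y_train, n_synthetic_per_class, rng)
            X_train = X_train + X_syn
            y_train = y_train + y_syn
        centroids = fit_centroids(X_train, y_train)
        correct += int(predict(centroids, X[i]) == y[i])  # test sample is always real
    return correct / len(X)


if __name__ == "__main__":
    X, y = make_toy_data()
    print("real only:        ", loocv_accuracy(X, y))
    print("real + synthetic: ", loocv_accuracy(X, y, n_synthetic_per_class=20))
```

The numbers printed by this toy example say nothing about the reported results; the point is only the fold structure, in which augmentation happens inside each training fold and the evaluation sample is never synthetic.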
DOI: 10.1016/j.neuri.2025.100225
ISSN: 2772-5286
Volume/Issue: 5(4), article 100225
Author affiliations:
Tyler N. Meyer: Athinoula A. Martinos Center for Biomedical Imaging, Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
Olga Andreeva: Department of Engineering, University of Massachusetts Boston, Boston, MA, USA
Roger D. Weiss: Department of Psychiatry, Harvard Medical School, Boston, MA, USA; Division of Alcohol, Drugs, and Addiction, McLean Hospital, Belmont, MA, USA
Wei Ding: Department of Computer Science, University of Massachusetts Boston, Boston, MA, USA
Iris Shen: Athinoula A. Martinos Center for Biomedical Imaging, Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
Changning Wang: Athinoula A. Martinos Center for Biomedical Imaging, Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
Ping Chen: Department of Engineering, University of Massachusetts Boston, Boston, MA, USA
Tewodros Mulugeta Dagnew (corresponding author): Athinoula A. Martinos Center for Biomedical Imaging, Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA