Autoencoder imputation of missing heterogeneous data for Alzheimer's disease classification

Abstract Missing Alzheimer's disease (AD) data is prevalent and poses significant challenges for AD diagnosis. Previous studies have explored various data imputation approaches on AD data, but the systematic evaluation of deep learning algorithms for imputing heterogeneous and comprehensive AD...

Bibliographic Details
Main Authors:	Namitha Thalekkara Haridas, Jose M. Sanchez‐Bornot, Paula L. McClean, KongFatt Wong‐Lin, Alzheimer's Disease Neuroimaging Initiative (ADNI)
Format:	Article
Language:	English
Published:	Wiley 2024-12-01
Series:	Healthcare Technology Letters
Subjects:	data mining; data reduction; decision support systems; feature extraction; feature selection; learning (artificial intelligence)
Online Access:	https://doi.org/10.1049/htl2.12091

collection	DOAJ
description	Abstract: Missing Alzheimer's disease (AD) data is prevalent and poses significant challenges for AD diagnosis. Previous studies have explored various data imputation approaches on AD data, but systematic evaluation of deep learning algorithms for imputing heterogeneous and comprehensive AD data is limited. This study investigates the efficacy of denoising autoencoder‐based imputation of missing key features in heterogeneous data comprising tau‐PET, MRI, cognitive and functional assessments, genotype, sociodemographic data, and medical history. The authors focused on extreme missingness (≥40%, missing at random) in key features that depend on AD progression, identified as the history of a mother having AD, APoE ε4 alleles, and clinical dementia rating. Along with features selected using traditional feature selection methods, latent features extracted from the denoising autoencoder are incorporated for subsequent classification. Using random forest classification with 10‐fold cross‐validation, the authors found robust AD predictive performance on imputed datasets (accuracy: 79%–85%; precision: 71%–85%) across missingness levels, and high recall values at 40% missingness. Further, the dataset obtained using feature selection methods, including the autoencoder, demonstrated a higher classification score than the original complete dataset. These results highlight the effectiveness and robustness of the autoencoder in imputing crucial information for reliable AD prediction in AI‐based clinical decision support systems.
id	doaj-art-5e4e2b74107d489cb0ecbfa1117bdec3
institution	OA Journals
issn	2053-3713
spelling	doaj-art-5e4e2b74107d489cb0ecbfa1117bdec3 | 2025-08-20T02:32:11Z | eng | Wiley | Healthcare Technology Letters | ISSN 2053-3713 | 2024-12-01 | vol. 11, no. 6, pp. 452–460 | 10.1049/htl2.12091 | Autoencoder imputation of missing heterogeneous data for Alzheimer's disease classification | Authors: Namitha Thalekkara Haridas (0), Jose M. Sanchez‐Bornot (1), Paula L. McClean (2), KongFatt Wong‐Lin (3), Alzheimer's Disease Neuroimaging Initiative (ADNI) | Affiliations: (0, 1, 3) Intelligent Systems Research Centre, School of Computing, Engineering and Intelligent Systems, Ulster University, Magee campus, Derry∼Londonderry, Northern Ireland, UK; (2) Personalised Medicine Centre, School of Medicine, Ulster University, Magee campus, Derry∼Londonderry, Northern Ireland, UK | https://doi.org/10.1049/htl2.12091
title	Autoencoder imputation of missing heterogeneous data for Alzheimer's disease classification

Similar Items