A machine learning PROGRAM to identify COVID-19 and other diseases from hematology data

Bibliographic Details
Main Authors: Patrick A Gladding, Zina Ayar, Kevin Smith, Prashant Patel, Julia Pearce, Shalini Puwakdandawa, Dianne Tarrant, Jon Atkinson, Elizabeth McChlery, Merit Hanna, Nick Gow, Hasan Bhally, Kerry Read, Prageeth Jayathissa, Jonathan Wallace, Sam Norton, Nick Kasabov, Cristian S Calude, Deborah Steel, Colin Mckenzie
Format: Article
Language: English
Published: Taylor & Francis Group 2021-08-01
Series: Future Science OA
Online Access: https://www.future-science.com/doi/10.2144/fsoa-2020-0207
collection DOAJ
description Aim: We propose a method for screening full blood count metadata for evidence of communicable and noncommunicable diseases using machine learning (ML). Materials & methods: High-dimensional hematology metadata were extracted over an 11-month period from Sysmex hematology analyzers for 43,761 patients. Predictive models for age, sex and individuality were developed to demonstrate the personalized nature of hematology data. Numeric and raw flow cytometry data were used in both supervised and unsupervised ML to predict the presence of pneumonia, urinary tract infection and COVID-19. Heart failure was used as an additional objective to demonstrate method generalizability. Results: Chronological age was predicted by a deep neural network with R²: 0.59, mean absolute error: 12; sex with area under the receiver operating characteristic curve (AUROC): 0.83, phi: 0.47; individuality with 99.7% accuracy, phi: 0.97; pneumonia with AUROC: 0.74, sensitivity 58%, specificity 79%, 95% CI: 0.73–0.75, p < 0.0001; urinary tract infection with AUROC: 0.68, sensitivity 52%, specificity 79%, 95% CI: 0.67–0.68, p < 0.0001; COVID-19 with AUROC: 0.8, sensitivity 82%, specificity 75%, 95% CI: 0.79–0.8, p = 0.0006; and heart failure with AUROC: 0.78, sensitivity 72%, specificity 72%, 95% CI: 0.77–0.78, p < 0.0001. Conclusion: ML applied to hematology data could predict communicable and noncommunicable diseases, both at local and global levels.
issn 2056-5623
volume 7
issue 7
doi 10.2144/fsoa-2020-0207
affiliation 1 Department of Cardiology, Waitematā District Health Board, Auckland, New Zealand (Patrick A Gladding)
2 Clinical Information Services, Waitematā District Health Board, Auckland, New Zealand (Zina Ayar)
3 Clinical laboratory, Waitematā District Health Board, Auckland, New Zealand (Kevin Smith, Prashant Patel, Julia Pearce, Shalini Puwakdandawa, Dianne Tarrant, Jon Atkinson, Elizabeth McChlery)
4 Department of Hematology, Waitematā District Health Board, Auckland, New Zealand (Merit Hanna)
5 Department of Infectious diseases, Waitematā District Health Board, Auckland, New Zealand (Nick Gow, Hasan Bhally, Kerry Read)
6 Institute for Innovation & Improvement (i3), Waitematā District Health Board, Auckland, New Zealand (Prageeth Jayathissa, Jonathan Wallace)
7 Nanix Ltd, Dunedin, New Zealand (Sam Norton)
8 Knowledge Engineering & Discovery Research Institute (KEDRI), Auckland University of Technology, Auckland, New Zealand (Nick Kasabov)
9 School of Computer Science, University of Auckland, Auckland, New Zealand (Cristian S Calude)
10 Sysmex New Zealand Ltd, Auckland, New Zealand (Deborah Steel, Colin Mckenzie)
topic biological age
COVID-19
full blood count
heart failure
hematology
machine learning