Data Preprocessing Techniques for AI and Machine Learning Readiness: Scoping Review of Wearable Sensor Data in Cancer Care

Background: Wearable sensors are increasingly being explored in health care, including in cancer care, for their potential in continuously monitoring patients. Despite their growing adoption, significant challenges remain in the quality and consistency of data collected from we...

Bibliographic Details
Main Authors: Bengie L Ortiz, Vibhuti Gupta, Rajnish Kumar, Aditya Jalin, Xiao Cao, Charles Ziegenbein, Ashutosh Singhal, Muneesh Tewari, Sung Won Choi
Format: Article
Language: English
Published: JMIR Publications 2024-09-01
Series: JMIR mHealth and uHealth
Online Access: https://mhealth.jmir.org/2024/1/e59587
author Bengie L Ortiz
Vibhuti Gupta
Rajnish Kumar
Aditya Jalin
Xiao Cao
Charles Ziegenbein
Ashutosh Singhal
Muneesh Tewari
Sung Won Choi
collection DOAJ
description Background: Wearable sensors are increasingly being explored in health care, including in cancer care, for their potential in continuously monitoring patients. Despite their growing adoption, significant challenges remain in the quality and consistency of data collected from wearable sensors. Moreover, preprocessing pipelines to clean, transform, normalize, and standardize raw data have not yet been fully optimized.
Objective: This study aims to conduct a scoping review of preprocessing techniques used on raw wearable sensor data in cancer care, specifically focusing on methods implemented to ensure their readiness for artificial intelligence and machine learning (AI/ML) applications. We sought to understand the current landscape of approaches for handling issues such as noise, missing values, normalization or standardization, and transformation, as well as techniques for extracting meaningful features from raw sensor outputs and converting them into usable formats for subsequent AI/ML analysis.
Methods: We systematically searched IEEE Xplore, PubMed, Embase, and Scopus to identify potentially relevant studies for this review. The eligibility criteria included (1) mobile health and wearable sensor studies in cancer, (2) written and published in English, (3) published between January 2018 and December 2023, (4) full text available rather than abstracts, and (5) original studies published in peer-reviewed journals or conferences.
Results: The initial search yielded 2147 articles, of which 20 (0.93%) met the inclusion criteria. Three major categories of preprocessing techniques were identified: data transformation (used in 12/20, 60% of the selected studies), data normalization and standardization (used in 8/20, 40% of the selected studies), and data cleaning (used in 8/20, 40% of the selected studies). Transformation methods aimed to convert raw data into more informative formats for analysis, such as by segmenting sensor streams or extracting statistical features. Normalization and standardization techniques typically rescaled features to a common range to improve comparability and model convergence. Cleaning methods focused on enhancing data reliability by handling artifacts such as missing values, outliers, and inconsistencies.
Conclusions: While wearable sensors are gaining traction in cancer care, realizing their full potential hinges on the ability to reliably translate raw outputs into high-quality data suitable for AI/ML applications. This review found that researchers are using various preprocessing techniques to address this challenge, but there remains a lack of standardized best practices. Our findings suggest a pressing need to develop and adopt uniform data quality and preprocessing workflows for wearable sensor data that can support the breadth of cancer research and varied patient populations. Given the diverse preprocessing techniques identified in the literature, there is an urgent need for a framework that can guide researchers and clinicians in preparing wearable sensor data for AI/ML applications. Drawing on this scoping review and our own research, we propose a general framework for preprocessing wearable sensor data, designed to be adaptable across disease settings beyond cancer care.
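The three preprocessing categories named in the abstract (cleaning, normalization or standardization, and transformation) can be illustrated with a minimal Python sketch. This is not the framework proposed by the authors or the method of any reviewed study. The function names, the heart-rate samples, the 30 to 220 beats-per-minute plausibility range, and the forward-fill strategy are all assumptions chosen for illustration.

```python
from statistics import mean, pstdev


def clean(samples, lower, upper):
    """Data cleaning: treat missing (None) or out-of-range readings as gaps,
    then fill each gap with the last valid value (forward fill).
    Leading gaps are filled with the first valid value."""
    valid = [x if x is not None and lower <= x <= upper else None for x in samples]
    first = next((x for x in valid if x is not None), None)
    if first is None:
        raise ValueError("no valid samples to fill from")
    filled, last = [], first
    for x in valid:
        if x is not None:
            last = x
        filled.append(last)
    return filled


def min_max(values):
    """Normalization: rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def z_score(values):
    """Standardization: shift to zero mean and scale to unit (population) variance."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0] * len(values)
    return [(v - mu) / sigma for v in values]


def window_features(values, size, step):
    """Transformation: segment the stream into fixed-size windows and
    extract simple statistical features from each window."""
    features = []
    for start in range(0, len(values) - size + 1, step):
        window = values[start:start + size]
        features.append({
            "mean": mean(window),
            "std": pstdev(window),
            "min": min(window),
            "max": max(window),
        })
    return features


# Hypothetical heart-rate samples (beats per minute); None marks a dropped reading
# and 250 stands in for a sensor artifact. The 30-220 range is an assumed threshold.
raw = [72, 74, None, 250, 76, 80, 78, None]
cleaned = clean(raw, lower=30, upper=220)
scaled = min_max(cleaned)
standardized = z_score(cleaned)
features = window_features(cleaned, size=4, step=4)
```

The order shown here (clean, then scale, then extract windowed features) is only one plausible arrangement. In practice, scaling parameters are usually estimated on training data only, and the choice of gap-filling method depends on the sensor and the clinical question.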
format Article
id doaj-art-56a1efa01426449f8e3641471e32ff82
institution OA Journals
issn 2291-5222
language English
publishDate 2024-09-01
publisher JMIR Publications
record_format Article
series JMIR mHealth and uHealth
spelling doaj-art-56a1efa01426449f8e3641471e32ff82 2025-08-20T01:56:14Z eng JMIR Publications JMIR mHealth and uHealth 2291-5222 2024-09-01 12 e59587 10.2196/59587 Data Preprocessing Techniques for AI and Machine Learning Readiness: Scoping Review of Wearable Sensor Data in Cancer Care
Bengie L Ortiz https://orcid.org/0000-0002-8484-5902
Vibhuti Gupta https://orcid.org/0000-0002-6221-4712
Rajnish Kumar https://orcid.org/0009-0002-2611-7330
Aditya Jalin https://orcid.org/0009-0007-5535-4200
Xiao Cao https://orcid.org/0009-0004-8227-5098
Charles Ziegenbein https://orcid.org/0009-0002-0583-2837
Ashutosh Singhal https://orcid.org/0000-0002-9172-1916
Muneesh Tewari https://orcid.org/0000-0002-7781-3152
Sung Won Choi https://orcid.org/0000-0002-6321-3834
https://mhealth.jmir.org/2024/1/e59587
title Data Preprocessing Techniques for AI and Machine Learning Readiness: Scoping Review of Wearable Sensor Data in Cancer Care
url https://mhealth.jmir.org/2024/1/e59587