The Impact of Domain Shift on Predicting Perceived Sleep Quality from Wearables

Bibliographic Details
Main Authors: Nouran Abdalazim, Leonardo Alchieri, Lidia Alecci, Pietro Barbiero, Silvia Santini
Format: Article
Language: English
Published: MDPI AG 2025-06-01
Series: Sensors
Online Access: https://www.mdpi.com/1424-8220/25/13/4012
Description
Summary: Machine learning models for personal informatics systems are typically trained offline on records of a specific population of users, resulting in population models. These models may suffer performance degradation in real-world settings due to domain shift, i.e., differences in data distributions across users and contexts. Domain adaptation techniques can address this issue by, e.g., personalizing models with user-specific data. In this paper, we quantify the impact of domain shift on the performance of both population and personalized models in a specific scenario: sleep quality recognition. To this end, we also collect and make available to the research community the new BiheartS dataset. Our analysis shows that domain shift causes the accuracy of population models to decrease by up to 18.54 percentage points when used on new data. Personalized models, in contrast, show robust performance across datasets. However, crafting personalized models typically requires new data or user-provided labels, limiting their applicability in real settings. To mitigate the limitations of both population and personalized models, we propose a novel unsupervised domain adaptation approach: the cluster-based population model (CBPM). CBPM achieves accuracy improvements of up to 13.45 percentage points over the population model without requiring user-specific records or labels.
ISSN: 1424-8220