Semi-local Time sensitive Anonymization of Clinical Data

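Abstract: This work proposes a method for anonymizing time-continuous data that preserves the relation between the time and value dimensions. The approach protects against linking and distribution attacks by providing k-anonymity and t-closeness. Distribution Clustering generates distributions from given sets according to the similarity of the curves; these distributions serve as a replacement for the population distribution. Before the data is anonymized, it is split along the time axis using Windowed Fréchet Splitting to reduce the duration and the information loss. The proposed approach employs bucketization using the Fréchet distance, with an implicit maximum cost, an implied t for closeness, and multiple redistribution phases. The information loss, the median relative error, and the achieved t for closeness are low, and the introduction of semi-local decisions reduced the runtime.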
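
The abstract names the Fréchet distance as the curve similarity measure behind the splitting and bucketization steps. As a rough illustration only, the sketch below computes the standard discrete Fréchet distance between two curves given as lists of (time, value) points. It is not the authors' implementation: the function name `discrete_frechet`, the Euclidean point distance, and the absence of any scaling between the time and value axes are all assumptions made for this example.

```python
from math import hypot


def discrete_frechet(p, q):
    """Discrete Fréchet distance between two curves.

    p and q are non-empty sequences of (time, value) pairs.
    Assumption: points are compared with the plain Euclidean distance,
    so time and value are treated as if they shared one unit.
    """
    if not p or not q:
        raise ValueError("both curves need at least one point")
    n, m = len(p), len(q)
    # ca[i][j] holds the coupling distance of the prefixes p[:i+1] and q[:j+1].
    ca = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = hypot(p[i][0] - q[j][0], p[i][1] - q[j][1])
            if i == 0 and j == 0:
                ca[i][j] = d
            elif i == 0:
                # Only q can advance along the first row.
                ca[i][j] = max(ca[0][j - 1], d)
            elif j == 0:
                # Only p can advance along the first column.
                ca[i][j] = max(ca[i - 1][0], d)
            else:
                # Best of the three possible previous steps, then this pair.
                best_prev = min(ca[i - 1][j], ca[i - 1][j - 1], ca[i][j - 1])
                ca[i][j] = max(best_prev, d)
    return ca[-1][-1]


# Two hypothetical curves sampled at the same times, offset by 1 in value.
curve_a = [(0, 0), (1, 0), (2, 0)]
curve_b = [(0, 1), (1, 1), (2, 1)]
distance = discrete_frechet(curve_a, curve_b)
```

Because the measure walks both curves in order, it compares values at corresponding positions in time rather than as unordered sets, which is one way to keep the time and value dimensions linked. In practice the two axes would likely need a common scale before such a distance is meaningful.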

Bibliographic Details
Main Authors: Freimut Gebhard Herbert Hammer, Mateusz Buglowski, André Stollenwerk
Format: Article
Language: English
Published: Nature Portfolio 2024-12-01
Series: Scientific Data
Online Access: https://doi.org/10.1038/s41597-024-04192-1
author Freimut Gebhard Herbert Hammer
Mateusz Buglowski
André Stollenwerk
collection DOAJ
description Abstract: This work proposes a method for anonymizing time-continuous data that preserves the relation between the time and value dimensions. The approach protects against linking and distribution attacks by providing k-anonymity and t-closeness. Distribution Clustering generates distributions from given sets according to the similarity of the curves; these distributions serve as a replacement for the population distribution. Before the data is anonymized, it is split along the time axis using Windowed Fréchet Splitting to reduce the duration and the information loss. The proposed approach employs bucketization using the Fréchet distance, with an implicit maximum cost, an implied t for closeness, and multiple redistribution phases. The information loss, the median relative error, and the achieved t for closeness are low, and the introduction of semi-local decisions reduced the runtime.
format Article
id doaj-art-217abecce01b47e2b99dd78c5b752a34
institution DOAJ
issn 2052-4463
language English
publishDate 2024-12-01
publisher Nature Portfolio
record_format Article
series Scientific Data
spelling doaj-art-217abecce01b47e2b99dd78c5b752a34 | 2025-08-20T02:39:38Z | eng | Nature Portfolio | Scientific Data | 2052-4463 | 2024-12-01 | 111120 | 10.1038/s41597-024-04192-1 | Semi-local Time sensitive Anonymization of Clinical Data | Freimut Gebhard Herbert Hammer (RWTH Aachen University) | Mateusz Buglowski (RWTH Aachen University) | André Stollenwerk (RWTH Aachen University) | https://doi.org/10.1038/s41597-024-04192-1
title Semi-local Time sensitive Anonymization of Clinical Data
url https://doi.org/10.1038/s41597-024-04192-1