Enabling health data analyses across multiple private datasets with no information sharing using secure multiparty computation


Bibliographic Details
Main Authors: Aziz Sheikh, Cathie Sudlow, Chris Robertson, Steven Kerr
Format: Article
Language: English
Published: BMJ Publishing Group 2025-05-01
Series: BMJ Health & Care Informatics
Online Access: https://informatics.bmj.com/content/32/1/e101384.full
collection DOAJ
description The UK’s health datasets are among the most comprehensive and inclusive globally, enabling groundbreaking research during the COVID-19 pandemic. However, restrictions on data sharing between secure data environments (SDEs) imposed limitations on the ability to carry out joint analyses across multiple separate datasets. There are currently significant efforts underway to enable such analyses using methods such as federated analytics (FA) and virtual SDEs. FA involves distributed data analysis without sharing raw data but does require sharing summary statistics. Virtual SDEs in principle allow researchers to access data across multiple SDEs, but in practice, data transfers may be restricted by information governance concerns. Secure multiparty computation (SMPC) is a cryptographic approach that allows multiple parties to perform joint analyses over private datasets with zero information sharing. SMPC may eliminate the need for data-sharing agreements and statistical disclosure control, offering a compelling alternative to FA and virtual SDEs. SMPC comes with a higher computational burden than traditional pooled analysis. However, efficient implementations of SMPC can enable a wide range of practical, secure analyses to be carried out. This perspective reviews the strengths and limitations of FA, virtual SDEs and SMPC as approaches to joint analyses across SDEs. We argue that while efforts to implement FA and virtual SDEs are ongoing in the UK, SMPC remains underexplored. Given its unique advantages, we propose that SMPC deserves greater attention as a transformative solution for enabling secure, cross-SDE analyses of private health data.
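The description above says that SMPC lets several parties compute a joint result over private datasets. As a rough illustration only, the sketch below uses additive secret sharing, one common SMPC building block, to add up counts held by three data holders. It is not taken from the article: the function names `share` and `secure_sum`, the modulus `PRIME` and the example counts are all hypothetical, and a real deployment would add authenticated network channels and protection against dishonest parties.

```python
import secrets

# Assumption: all arithmetic is done modulo a public prime that is much
# larger than any total we expect to compute.
PRIME = 2**61 - 1


def share(value, n_parties):
    """Split `value` into n additive shares that sum to it modulo PRIME.

    Any n-1 of the shares are uniformly random, so on their own they
    reveal nothing about `value`.
    """
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % PRIME
    return shares + [last]


def secure_sum(private_values):
    """Simulate a secure sum across len(private_values) parties.

    In a real protocol each row of `all_shares` is created locally by one
    party, and only share j ever leaves that party, sent to party j.
    """
    n = len(private_values)
    all_shares = [share(v, n) for v in private_values]
    # Party j adds up the one share it received from every party.
    partial_sums = [
        sum(all_shares[i][j] for i in range(n)) % PRIME for j in range(n)
    ]
    # Only the partial sums are published; together they give the total.
    return sum(partial_sums) % PRIME


# Hypothetical event counts held in three separate SDEs.
counts = [120, 75, 310]
total = secure_sum(counts)
```

In this sketch the only value any party learns beyond its own input is the final total, which is the intended output of the analysis. By contrast, a federated approach would have each site release its own count as a summary statistic.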
id doaj-art-e63261e617f6472abe0cd466e9acf67c
institution OA Journals
issn 2632-1009
timestamp 2025-08-20T01:57:09Z
volume 32
issue 1
doi 10.1136/bmjhci-2024-101384
author_affiliation Aziz Sheikh: Usher Institute of Population Health Sciences and Informatics, The University of Edinburgh, Edinburgh, UK
Cathie Sudlow: The University of Edinburgh, Edinburgh, UK
Chris Robertson: Department of Mathematics and Statistics, University of Strathclyde, Glasgow, Glasgow, UK
Steven Kerr: Usher Institute of Population Health Sciences and Informatics, The University of Edinburgh, Edinburgh, UK