COLA-GLM: collaborative one-shot and lossless algorithms of generalized linear models for decentralized observational healthcare data
Abstract: Clinical insights from real-world data often require aggregating information from institutions to ensure sufficient sample sizes and generalizability. However, patient privacy concerns limit the sharing of patient-level data, and traditional federated learning algorithms, relying on ex...
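| Main Authors: | Qiong Wu, Jenna M. Reps, Lu Li, Bingyu Zhang, Yiwen Lu, Jiayi Tong, Dazheng Zhang, Thomas Lumley, Milou T. Brand, Mui Van Zandt, Thomas Falconer, Xing He, Yu Huang, Haoyang Li, Chao Yan, Guojun Tang, Andrew E. Williams, Fei Wang, Jiang Bian, Bradley Malin, George Hripcsak, Martijn J. Schuemie, Yun Lu, Steve Drew, Jiayu Zhou, David A. Asch, Yong Chen |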
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Nature Portfolio, 2025-07-01 |
| Series: | npj Digital Medicine |
| Online Access: | https://doi.org/10.1038/s41746-025-01781-1 |
| _version_ | 1849234407732281344 |
|---|---|
| author | Qiong Wu; Jenna M. Reps; Lu Li; Bingyu Zhang; Yiwen Lu; Jiayi Tong; Dazheng Zhang; Thomas Lumley; Milou T. Brand; Mui Van Zandt; Thomas Falconer; Xing He; Yu Huang; Haoyang Li; Chao Yan; Guojun Tang; Andrew E. Williams; Fei Wang; Jiang Bian; Bradley Malin; George Hripcsak; Martijn J. Schuemie; Yun Lu; Steve Drew; Jiayu Zhou; David A. Asch; Yong Chen |
| author_sort | Qiong Wu |
| collection | DOAJ |
| description | Abstract: Clinical insights from real-world data often require aggregating information from institutions to ensure sufficient sample sizes and generalizability. However, patient privacy concerns limit the sharing of patient-level data, and traditional federated learning algorithms, relying on extensive back-and-forth communications, can be inefficient to implement. We introduce the Collaborative One-shot Lossless Algorithm for Generalized Linear Models (COLA-GLM), a novel federated learning algorithm that supports diverse outcome types via generalized linear models and achieves results identical to a pooled patient-level data analysis (lossless) with only a single round of aggregated data exchange (one-shot). To further protect aggregated institutional data, we developed a secure extension, secure-COLA-GLM, utilizing homomorphic encryption. We demonstrated the effectiveness and lossless property of COLA-GLM through applications to an international influenza cohort and a decentralized U.S. COVID-19 mortality study. COLA-GLM and secure-COLA-GLM offer a scalable, efficient solution for decentralized collaborative learning involving multiple data partners and diverse security requirements. |
| format | Article |
| id | doaj-art-a6ac6f3c86984b538bac08b4a37b0784 |
| institution | Kabale University |
| issn | 2398-6352 |
| language | English |
| publishDate | 2025-07-01 |
| publisher | Nature Portfolio |
| record_format | Article |
| series | npj Digital Medicine |
| spelling | doaj-art-a6ac6f3c86984b538bac08b4a37b0784; 2025-08-20T04:03:11Z; eng; Nature Portfolio; npj Digital Medicine; 2398-6352; 2025-07-01; volume 8, issue 1, pp. 1-11; 10.1038/s41746-025-01781-1; COLA-GLM: collaborative one-shot and lossless algorithms of generalized linear models for decentralized observational healthcare data; Qiong Wu (Department of Biostatistics and Health Data Science, University of Pittsburgh); Jenna M. Reps (Observational Health Data Sciences and Informatics); Lu Li (The Center for Health AI and Synthesis of Evidence (CHASE), University of Pennsylvania); Bingyu Zhang (The Center for Health AI and Synthesis of Evidence (CHASE), University of Pennsylvania); Yiwen Lu (The Center for Health AI and Synthesis of Evidence (CHASE), University of Pennsylvania); Jiayi Tong (Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania Perelman School of Medicine); Dazheng Zhang (Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania Perelman School of Medicine); Thomas Lumley (Department of Statistics, Faculty of Science, University of Auckland); Milou T. Brand (Real World Solutions, IQVIA); Mui Van Zandt (Observational Health Data Sciences and Informatics); Thomas Falconer (Department of Biomedical Informatics, Columbia University Irving Medical Center); Xing He (Department of Biostatistics and Health Data Science, Indiana University); Yu Huang (Department of Biostatistics and Health Data Science, Indiana University); Haoyang Li (Department of Population Health Sciences, Weill Cornell Medicine); Chao Yan (Department of Biomedical Informatics, Vanderbilt University Medical Center); Guojun Tang (Department of Electrical and Software Engineering, University of Calgary); Andrew E. Williams (Clinical and Translational Science Institute, Tufts Medical Center); Fei Wang (Department of Population Health Sciences, Weill Cornell Medicine); Jiang Bian (Department of Biostatistics and Health Data Science, Indiana University); Bradley Malin (Department of Biomedical Informatics, Vanderbilt University Medical Center); George Hripcsak (Department of Biomedical Informatics, Columbia University Irving Medical Center); Martijn J. Schuemie (Observational Health Data Sciences and Informatics); Yun Lu (Center for Biologics Evaluation and Research, Food and Drug Administration); Steve Drew (Department of Electrical and Software Engineering, University of Calgary); Jiayu Zhou (School of Information, University of Michigan); David A. Asch (Leonard Davis Institute of Health Economics, University of Pennsylvania); Yong Chen (Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania Perelman School of Medicine); https://doi.org/10.1038/s41746-025-01781-1 |
| title | COLA-GLM: collaborative one-shot and lossless algorithms of generalized linear models for decentralized observational healthcare data |
| url | https://doi.org/10.1038/s41746-025-01781-1 |
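
The abstract's "one-shot" and "lossless" properties can be illustrated with a small sketch. This record does not describe how COLA-GLM achieves them, so the code below is not the authors' algorithm. It covers only the simplest generalized linear model, a Gaussian outcome with identity link (ordinary least squares on one covariate), where it is a standard result that a handful of summed aggregates per site reproduces the pooled fit. The function names `site_summary` and `fit_from_summaries` and the simulated data are assumptions made for illustration.

```python
import math
import random


def site_summary(xs, ys):
    """Five aggregate numbers a site would share instead of patient-level rows."""
    return (
        len(xs),
        sum(xs),
        sum(ys),
        sum(x * x for x in xs),
        sum(x * y for x, y in zip(xs, ys)),
    )


def fit_from_summaries(summaries):
    """Solve the least-squares normal equations from the summed aggregates."""
    n, sx, sy, sxx, sxy = (sum(col) for col in zip(*summaries))
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n
    return intercept, slope


# Simulated data for three hypothetical sites (true intercept 1.0, slope -2.0).
random.seed(0)
sites = []
for size in (40, 60, 80):
    xs = [random.gauss(0, 1) for _ in range(size)]
    ys = [1.0 - 2.0 * x + random.gauss(0, 0.1) for x in xs]
    sites.append((xs, ys))

# One round of exchange: each site sends its aggregates once.
federated = fit_from_summaries([site_summary(xs, ys) for xs, ys in sites])

# Reference analysis on the stacked patient-level data.
all_x = [x for xs, _ in sites for x in xs]
all_y = [y for _, ys in sites for y in ys]
pooled = fit_from_summaries([site_summary(all_x, all_y)])

print(federated, pooled)
```

The two estimates agree up to floating-point rounding because the aggregates are additive across sites. For other GLM families, such as logistic or Poisson regression, these five numbers are not sufficient in general, which is the gap the abstract says COLA-GLM addresses.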