Learning Gaussian graphical models from correlated data

Bibliographic Details
Main Authors: Zeyuan Song, Sophia Gunn, Stefano Monti, Gina M. Peloso, Ching-Ti Liu, Kathryn Lunetta, Paola Sebastiani
Format: Article
Language: English
Published: Frontiers Media S.A. 2025-07-01
Series: Frontiers in Systems Biology
Online Access: https://www.frontiersin.org/articles/10.3389/fsysb.2025.1589079/full
Description
Summary: Gaussian Graphical Models (GGMs) are a class of network models that use partial correlation rather than correlation to represent complex relationships among multiple variables. The advantage of partial correlation is that it shows the relation between two variables after “adjusting” for the effects of the other variables, which leads to more parsimonious and interpretable models. There are well-established procedures for building GGMs from a sample of independent and identically distributed observations. However, many studies include clustered and longitudinal data that result in correlated observations, and ignoring this correlation among observations can lead to inflated Type I error. In this paper, we propose a cluster-based bootstrap algorithm to infer GGMs from correlated data. We use extensive simulations of correlated data from family-based studies to show that, when there is a sufficient number of clusters, the proposed bootstrap method does not inflate the Type I error while retaining statistical power compared to alternative solutions. We apply our method to learn the Gaussian Graphical Model that represents complex relations between 47 Polygenic Risk Scores generated using genome-wide genotype data from the Long Life Family Study. By comparing it to conventional methods that ignore within-cluster correlation, we show that our method controls the Type I error well without loss of power.
ISSN: 2674-0702
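
The summary names two ideas that a short sketch may make more concrete: partial correlations, which can be read off the inverse covariance matrix, and resampling whole clusters (for example, families) rather than individual observations. The Python below is only an illustration under stated assumptions and is not the authors' algorithm. The function names `partial_correlations` and `cluster_bootstrap_edges`, the use of percentile intervals, and the rule "keep an edge when the interval excludes zero" are all hypothetical choices made for this example.

```python
import numpy as np


def partial_correlations(data):
    """Partial correlation matrix from the inverse sample covariance.

    Uses the standard identity rho_ij = -omega_ij / sqrt(omega_ii * omega_jj),
    where omega is the precision (inverse covariance) matrix.
    """
    precision = np.linalg.inv(np.cov(data, rowvar=False))
    scale = np.sqrt(np.diag(precision))
    pcor = -precision / np.outer(scale, scale)
    np.fill_diagonal(pcor, 1.0)
    return pcor


def cluster_bootstrap_edges(data, cluster_ids, n_boot=200, alpha=0.05, seed=0):
    """Illustrative cluster bootstrap (an assumption, not the paper's procedure).

    Whole clusters are resampled with replacement so that within-cluster
    correlation is preserved in every bootstrap replicate. An edge is kept
    when the percentile interval of its partial correlation excludes zero.
    """
    rng = np.random.default_rng(seed)
    clusters = np.unique(cluster_ids)
    rows_by_cluster = [np.flatnonzero(cluster_ids == c) for c in clusters]
    n_vars = data.shape[1]
    boot = np.empty((n_boot, n_vars, n_vars))
    for b in range(n_boot):
        # Draw as many clusters as in the original sample, with replacement.
        picked = rng.integers(0, len(clusters), size=len(clusters))
        rows = np.concatenate([rows_by_cluster[i] for i in picked])
        boot[b] = partial_correlations(data[rows])
    lower = np.quantile(boot, alpha / 2, axis=0)
    upper = np.quantile(boot, 1 - alpha / 2, axis=0)
    edges = (lower > 0) | (upper < 0)
    np.fill_diagonal(edges, False)  # no self-loops
    return edges, lower, upper


# Toy usage: 50 clusters of 4 members that share a random cluster effect.
demo_rng = np.random.default_rng(42)
demo_ids = np.repeat(np.arange(50), 4)
shared = demo_rng.normal(size=(50, 3))[demo_ids]
demo_data = shared + demo_rng.normal(size=(200, 3))
demo_edges, demo_lower, demo_upper = cluster_bootstrap_edges(demo_data, demo_ids)
print(demo_edges.astype(int))
```

Resampling at the cluster level is the key design choice here: drawing individual rows would treat correlated family members as independent, which is the situation the summary says inflates Type I error.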