SEMdag: Fast learning of Directed Acyclic Graphs via node or layer ordering.

A Directed Acyclic Graph (DAG) offers an easy approach to define causal structures among gathered nodes: causal linkages are represented by arrows between the variables, leading from cause to effect. Recently, industry and academics have paid close attention to DAG structure learning from observable...

Bibliographic Details
Main Authors:	Mario Grassi, Barbara Tarantino
Format:	Article
Language:	English
Published:	Public Library of Science (PLoS) 2025-01-01
Series:	PLoS ONE
Online Access:	https://doi.org/10.1371/journal.pone.0317283

_version_	1841533170269290496
author	Mario Grassi Barbara Tarantino
author_facet	Mario Grassi Barbara Tarantino
author_sort	Mario Grassi
collection	DOAJ
description	A Directed Acyclic Graph (DAG) offers an easy approach to define causal structures among gathered nodes: causal linkages are represented by arrows between the variables, leading from cause to effect. Recently, industry and academia have paid close attention to DAG structure learning from observational data, and many techniques have been proposed to address the problem. We provide a two-step approach, named SEMdag(), that can be used to quickly learn high-dimensional linear SEMs. It is included in the R package SEMgraph and employs a two-stage order-based search using either prior knowledge (knowledge-based, KB) or a data-driven method (bottom-up, BU), under the assumption of a linear SEM with equal-variance error terms. We evaluated our framework's ability to find plausible DAGs against six well-known causal discovery techniques (ARGES, GES, PC, LiNGAM, CAM, NOTEARS). We conducted a series of experiments using observed expression (RNA-seq) data, taking into account a pair of training and testing datasets for four distinct diseases: Amyotrophic Lateral Sclerosis (ALS), Breast cancer (BRCA), Coronavirus disease (COVID-19) and ST-elevation myocardial infarction (STEMI). The results show that the SEMdag() procedure can recover a graph structure with good disease prediction performance, as evaluated by a conventional supervised learning algorithm (RF): when the initial graph is sparse, the BU approach may be a better choice than the KB one; when the graph is denser, both BU and KB report high performance, with the highest score for the KB approach based on topological layers. Besides its superior disease predictive performance compared to previous research, SEMdag() offers the user the flexibility to define distinct structure learning algorithms and can handle high-dimensional problems with a lower computational load. The SEMdag() function is implemented in the R package SEMgraph, available at https://CRAN.R-project.org/package=SEMgraph.
format	Article
id	doaj-art-7cb1c6dfdbe141aca466db2eeaf64078
institution	Kabale University
issn	1932-6203
language	English
publishDate	2025-01-01
publisher	Public Library of Science (PLoS)
record_format	Article
series	PLoS ONE
spelling	doaj-art-7cb1c6dfdbe141aca466db2eeaf64078 2025-01-17T05:31:34Z eng Public Library of Science (PLoS) PLoS ONE 1932-6203 2025-01-01 20 1 e0317283 10.1371/journal.pone.0317283 SEMdag: Fast learning of Directed Acyclic Graphs via node or layer ordering. Mario Grassi Barbara Tarantino https://doi.org/10.1371/journal.pone.0317283
spellingShingle	Mario Grassi Barbara Tarantino SEMdag: Fast learning of Directed Acyclic Graphs via node or layer ordering. PLoS ONE
title	SEMdag: Fast learning of Directed Acyclic Graphs via node or layer ordering.
title_sort	semdag fast learning of directed acyclic graphs via node or layer ordering
url	https://doi.org/10.1371/journal.pone.0317283
work_keys_str_mv	AT mariograssi semdagfastlearningofdirectedacyclicgraphsvianodeorlayerordering AT barbaratarantino semdagfastlearningofdirectedacyclicgraphsvianodeorlayerordering
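
Illustrative sketch (not part of the catalogued record, and not the SEMgraph implementation): the description refers to a two-stage order-based search under a linear SEM with equal-variance errors. The Python code below shows one common way such a search can work, assuming that under equal error variances a sink node has the smallest diagonal entry of the precision matrix. The function names bottom_up_order and fit_dag, the thresholded least-squares step, and the simulated chain are all hypothetical choices made for this example; the abstract does not specify how SEMdag() performs either stage.

```python
import numpy as np

def bottom_up_order(X):
    """Stage 1: order nodes sources-first by repeatedly peeling off a sink.

    Assumption: in a linear SEM with equal error variances, a sink node
    has the smallest diagonal entry of the precision matrix.
    """
    S = np.cov(X, rowvar=False)
    remaining = list(range(S.shape[0]))
    sinks = []
    while remaining:
        # Precision matrix of the marginal distribution over remaining nodes.
        omega = np.linalg.inv(S[np.ix_(remaining, remaining)])
        k = int(np.argmin(np.diag(omega)))
        sinks.append(remaining.pop(k))
    return sinks[::-1]

def fit_dag(X, order, threshold=0.1):
    """Stage 2: regress each node on its predecessors in the ordering.

    Plain least squares plus hard thresholding stands in for a penalized
    regression; B[i, j] != 0 encodes the edge i -> j.
    """
    p = X.shape[1]
    Xc = X - X.mean(axis=0)
    B = np.zeros((p, p))
    for pos in range(1, p):
        j = order[pos]
        preds = order[:pos]
        coef, *_ = np.linalg.lstsq(Xc[:, preds], Xc[:, j], rcond=None)
        coef[np.abs(coef) < threshold] = 0.0
        B[preds, j] = coef
    return B

# Simulated chain 0 -> 1 -> 2 with unit-variance errors (hypothetical data).
rng = np.random.default_rng(0)
n = 5000
x0 = rng.normal(size=n)
x1 = 0.8 * x0 + rng.normal(size=n)
x2 = 0.8 * x1 + rng.normal(size=n)
X = np.column_stack([x0, x1, x2])

order = bottom_up_order(X)
B = fit_dag(X, order)
print(order)
print(np.round(B, 2))
```

Because every edge is directed from an earlier node to a later node in the ordering, the fitted graph is acyclic by construction; this is what lets an order-based search avoid an explicit acyclicity check.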

Similar Items