Flowtigs: Safety in flow decompositions for assembly graphs
Summary: A decomposition of a network flow is a set of weighted walks whose superposition equals the flow. In this article, we give a simple and linear-time-verifiable complete characterization (flowtigs) of walks that are safe in such general flow decompositions, i.e., that are subwalks of any poss...
Saved in:
| Main Authors: | , , , , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: |
Elsevier
2024-12-01
|
| Series: | iScience |
| Subjects: | |
| Online Access: | http://www.sciencedirect.com/science/article/pii/S2589004224024337 |
| Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
| _version_ | 1850121439327289344 |
|---|---|
| author | Francisco Sena Eliel Ingervo Shahbaz Khan Andrey Prjibelski Sebastian Schmidt Alexandru Tomescu |
| author_facet | Francisco Sena Eliel Ingervo Shahbaz Khan Andrey Prjibelski Sebastian Schmidt Alexandru Tomescu |
| author_sort | Francisco Sena |
| collection | DOAJ |
| description | Summary: A decomposition of a network flow is a set of weighted walks whose superposition equals the flow. In this article, we give a simple and linear-time-verifiable complete characterization (flowtigs) of walks that are safe in such general flow decompositions, i.e., that are subwalks of any possible flow decomposition. We provide an O(mn)-time algorithm that identifies all maximal flowtigs and represents them inside a compact structure. On the practical side, we study flowtigs in the use-case of metagenomic assembly. By using the species abundances as flow values of the metagenomic assembly graph, we can model the possible assembly solutions as flow decompositions into weighted closed walks. On simulated data, compared to reporting unitigs or maximal safe walks based only on the graph structure, reporting flowtigs results in a notably more contiguous assembly. On real data, we frame flowtigs as a heuristic and provide an algorithm that is guided by this heuristic. |
| format | Article |
| id | doaj-art-cfd96188a2914dc990baaee4d8e2fece |
| institution | OA Journals |
| issn | 2589-0042 |
| language | English |
| publishDate | 2024-12-01 |
| publisher | Elsevier |
| record_format | Article |
| series | iScience |
| spelling | doaj-art-cfd96188a2914dc990baaee4d8e2fece2025-08-20T02:35:05ZengElsevieriScience2589-00422024-12-01271211120810.1016/j.isci.2024.111208Flowtigs: Safety in flow decompositions for assembly graphsFrancisco Sena0Eliel Ingervo1Shahbaz Khan2Andrey Prjibelski3Sebastian Schmidt4Alexandru Tomescu5University of Helsinki, Helsinki, Finland; Corresponding authorUniversity of Helsinki, Helsinki, FinlandIndian Institute of Technology Roorkee, Roorkee, IndiaUniversity of Helsinki, Helsinki, FinlandUniversity of Helsinki, Helsinki, FinlandUniversity of Helsinki, Helsinki, Finland; Corresponding authorSummary: A decomposition of a network flow is a set of weighted walks whose superposition equals the flow. In this article, we give a simple and linear-time-verifiable complete characterization (flowtigs) of walks that are safe in such general flow decompositions, i.e., that are subwalks of any possible flow decomposition. We provide an O(mn)-time algorithm that identifies all maximal flowtigs and represents them inside a compact structure. On the practical side, we study flowtigs in the use-case of metagenomic assembly. By using the species abundances as flow values of the metagenomic assembly graph, we can model the possible assembly solutions as flow decompositions into weighted closed walks. On simulated data, compared to reporting unitigs or maximal safe walks based only on the graph structure, reporting flowtigs results in a notably more contiguous assembly. On real data, we frame flowtigs as a heuristic and provide an algorithm that is guided by this heuristic.http://www.sciencedirect.com/science/article/pii/S2589004224024337Microbial genomicsBiocomputational methodClassification of bioinformatical subjectGenomic analysis |
| spellingShingle | Francisco Sena Eliel Ingervo Shahbaz Khan Andrey Prjibelski Sebastian Schmidt Alexandru Tomescu Flowtigs: Safety in flow decompositions for assembly graphs iScience Microbial genomics Biocomputational method Classification of bioinformatical subject Genomic analysis |
| title | Flowtigs: Safety in flow decompositions for assembly graphs |
| title_full | Flowtigs: Safety in flow decompositions for assembly graphs |
| title_fullStr | Flowtigs: Safety in flow decompositions for assembly graphs |
| title_full_unstemmed | Flowtigs: Safety in flow decompositions for assembly graphs |
| title_short | Flowtigs: Safety in flow decompositions for assembly graphs |
| title_sort | flowtigs safety in flow decompositions for assembly graphs |
| topic | Microbial genomics Biocomputational method Classification of bioinformatical subject Genomic analysis |
| url | http://www.sciencedirect.com/science/article/pii/S2589004224024337 |
| work_keys_str_mv | AT franciscosena flowtigssafetyinflowdecompositionsforassemblygraphs AT elielingervo flowtigssafetyinflowdecompositionsforassemblygraphs AT shahbazkhan flowtigssafetyinflowdecompositionsforassemblygraphs AT andreyprjibelski flowtigssafetyinflowdecompositionsforassemblygraphs AT sebastianschmidt flowtigssafetyinflowdecompositionsforassemblygraphs AT alexandrutomescu flowtigssafetyinflowdecompositionsforassemblygraphs |