Sama: a contig assembler with correctness guarantee
| Main Author: | |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | BMC, 2025-06-01 |
| Series: | Algorithms for Molecular Biology |
| Subjects: | |
| Online Access: | https://doi.org/10.1186/s13015-025-00280-y |
| Summary: | Abstract Background: In genome assembly the task is to reconstruct a genome based on sequencing reads. Current practical methods are based on heuristics which are hard to analyse and thus such analysis is not readily available. Results: We present a model for estimating the probability of misassembly at each position of a de Bruijn graph based assembly. Unlike previous work, our model also takes into account missing data. We apply our model to produce contigs with correctness guarantee and correctness estimates for each position in the contigs. Conclusions: Our experiments show that when the coverage of k-mers is high enough, our method produces contigs with similar contiguity characteristics as state-of-the-art assemblers which are based on heuristic correction of the de Bruijn graph. Our model may have further applications in downstream analysis of contigs or in any analysis working directly on the de Bruijn graph. |
| ISSN: | 1748-7188 |