Virseqimprover: an integrated pipeline for viral contig error correction, extension, and annotation
Despite the recent surge of viral metagenomic studies, it remains a significant challenge to recover complete virus genomes from metagenomic data. The majority of viral contigs generated from de novo assembly programs are highly fragmented, presenting significant challenges to downstream analysis an...
Main Authors: | Haoqiu Song, Saima Sultana Tithi, Connor Brown, Frank O. Aylward, Roderick Jensen, Liqing Zhang |
---|---|
Format: | Article |
Language: | English |
Published: | PeerJ Inc., 2025-01-01 |
Series: | PeerJ |
Subjects: | Metagenomics; Viral genome assembly; Viral metagenomics |
Online Access: | https://peerj.com/articles/18515.pdf |
_version_ | 1841544135381614592 |
---|---|
author | Haoqiu Song Saima Sultana Tithi Connor Brown Frank O. Aylward Roderick Jensen Liqing Zhang |
author_sort | Haoqiu Song |
collection | DOAJ |
description | Despite the recent surge of viral metagenomic studies, it remains a significant challenge to recover complete virus genomes from metagenomic data. The majority of viral contigs generated by de novo assembly programs are highly fragmented, which complicates downstream analysis and inference. To address this issue, we have developed Virseqimprover, a computational pipeline that can extend assembled contigs to complete or nearly complete genomes while maintaining extension quality. Virseqimprover first checks for chimeric sequence based on read coverage and, if any is found, breaks the sequence into segments; it then extends the longest segment with uniform depth of coverage and repeats these steps until the sequence can no longer be extended. Finally, Virseqimprover annotates the gene content of the resulting sequence. Results show that Virseqimprover performs well at correcting and extending viral contigs to their full lengths, and hence can be a useful tool for improving the completeness and minimizing the assembly errors of viral contigs. Both a web server and a conda package for Virseqimprover are provided to the research community free of charge. |
format | Article |
id | doaj-art-69d80c90df8e46debdad71a13ff02a06 |
institution | Kabale University |
issn | 2167-8359 |
language | English |
publishDate | 2025-01-01 |
publisher | PeerJ Inc. |
record_format | Article |
series | PeerJ |
spelling | doaj-art-69d80c90df8e46debdad71a13ff02a06; 2025-01-12T15:05:09Z; eng; PeerJ Inc.; PeerJ; 2167-8359; 2025-01-01; 13; e18515; 10.7717/peerj.18515; Virseqimprover: an integrated pipeline for viral contig error correction, extension, and annotation; Haoqiu Song (Department of Computer Science, Virginia Polytechnic Institute and State University (Virginia Tech), Blacksburg, VA, United States of America); Saima Sultana Tithi (Department of Cell & Molecular Biology, St. Jude Children’s Research Hospital, Memphis, TN, United States of America); Connor Brown (Department of Civil and Environmental Engineering, Virginia Polytechnic Institute and State University (Virginia Tech), Blacksburg, VA, United States of America); Frank O. Aylward (Department of Biological Sciences, Virginia Polytechnic Institute and State University (Virginia Tech), Blacksburg, VA, United States of America); Roderick Jensen (Department of Biological Sciences, Virginia Polytechnic Institute and State University (Virginia Tech), Blacksburg, VA, United States of America); Liqing Zhang (Department of Computer Science, Virginia Polytechnic Institute and State University (Virginia Tech), Blacksburg, VA, United States of America); https://peerj.com/articles/18515.pdf; Metagenomics; Viral genome assembly; Viral metagenomics |
title | Virseqimprover: an integrated pipeline for viral contig error correction, extension, and annotation |
topic | Metagenomics Viral genome assembly Viral metagenomics |
url | https://peerj.com/articles/18515.pdf |
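
The description in this record outlines an iterative procedure: check read coverage for signs of a chimeric contig, break the contig where coverage changes abruptly, extend the longest segment with uniform depth, and repeat until no further extension is possible. The Python sketch below illustrates only that control flow. It is not Virseqimprover's code: the fold-change rule, the `max_fold_change` threshold, and the `depth_of` and `extend` callables (stand-ins for read mapping and local assembly) are all assumptions made for illustration.

```python
def split_on_coverage_jumps(depth, max_fold_change=3.0):
    """Split per-base depth into half-open (start, end) segments.

    A new segment starts wherever depth changes by more than
    max_fold_change between neighbouring positions (hypothetical rule).
    """
    if not depth:
        return []
    segments = []
    start = 0
    for i in range(1, len(depth)):
        low, high = sorted((depth[i - 1], depth[i]))
        # The +1 pseudo-count avoids division by zero at uncovered bases.
        if (high + 1) / (low + 1) > max_fold_change:
            segments.append((start, i))
            start = i
    segments.append((start, len(depth)))
    return segments


def longest_segment(segments):
    """Return the longest (start, end) segment; ties go to the first one."""
    return max(segments, key=lambda seg: seg[1] - seg[0])


def improve_contig(contig, depth_of, extend, max_rounds=50):
    """Repeat split-then-extend until the extension step stops adding bases.

    depth_of(contig) must return one depth value per base, and
    extend(segment) must return the segment, possibly lengthened.
    Both are hypothetical callables supplied by the caller.
    """
    for _ in range(max_rounds):
        if not contig:
            return contig
        start, end = longest_segment(split_on_coverage_jumps(depth_of(contig)))
        segment = contig[start:end]
        extended = extend(segment)
        if len(extended) <= len(segment):
            return segment  # no growth: stop iterating
        contig = extended
    return contig
```

In practice the depth values would come from mapping reads back to the contig, and the extension step from an assembler; both are left abstract here so that the loop can be exercised with toy inputs.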