DNA Barcode Contamination Screen (DBCscreen): A Pipeline to Rapidly Detect DNA Barcode Contamination for Biodiversity Research

Bibliographic Details
Main Authors: Jiazheng Xie, Yu Zhang, Lina Wang, Yuting Deng
Format: Article
Language: English
Published: MDPI AG 2025-03-01
Series: Diversity
Subjects: DNA barcode; contamination; parasites; symbionts; nematode; Wolbachia
Online Access: https://www.mdpi.com/1424-2818/17/3/186
author Jiazheng Xie
Yu Zhang
Lina Wang
Yuting Deng
collection DOAJ
description Next-generation sequencing (NGS) data are expanding exponentially, accompanied by a concomitant growth in contamination from non-target species. However, these seemingly undesirable sequences can provide valuable insights into the broad-scale diversity and distribution of parasites and symbionts. In this study, we developed a pipeline called DBCscreen (DNA Barcode Contamination screen) to explore biodiversity and distribution across a broad range of living organisms, based on a DNA barcode contamination survey. We used DBCscreen to screen 39,302 eukaryotic assemblies in the NCBI TSA/WGS database and, after stringent filtering, identified 110,880 contaminated contigs related to DNA barcodes in 10,717 assemblies. The taxonomic identity of these contaminants was then determined, and their heterogeneous distribution patterns revealed complex relationships between the hosts (assembly sources) and their associated parasites or symbionts (contaminants). Finally, several application examples of DBCscreen are described, such as identifying the most easily contaminated organisms associated with a specific host (e.g., ticks) and determining which hosts are particularly prone to certain types of contamination (e.g., Wolbachia and nematodes).
format Article
id doaj-art-332371816d8e4e0780b93eb7f80ef672
institution Kabale University
issn 1424-2818
language English
publishDate 2025-03-01
publisher MDPI AG
record_format Article
series Diversity
spelling doaj-art-332371816d8e4e0780b93eb7f80ef672; 2025-08-20T03:43:11Z; eng; MDPI AG; Diversity; 1424-2818; 2025-03-01; 17(3):186; 10.3390/d17030186; DNA Barcode Contamination Screen (DBCscreen): A Pipeline to Rapidly Detect DNA Barcode Contamination for Biodiversity Research; Jiazheng Xie, Yu Zhang, Lina Wang, Yuting Deng (all: Chongqing Key Laboratory of Big Data for Bio Intelligence, Chongqing University of Posts and Telecommunications, Chongqing 400065, China); https://www.mdpi.com/1424-2818/17/3/186; DNA barcode; contamination; parasites; symbionts; nematode; Wolbachia
title DNA Barcode Contamination Screen (DBCscreen): A Pipeline to Rapidly Detect DNA Barcode Contamination for Biodiversity Research
topic DNA barcode
contamination
parasites
symbionts
nematode
Wolbachia
url https://www.mdpi.com/1424-2818/17/3/186