dTOURS: Dense-region tagging for outbreak detection using ratio statistics.

Bibliographic Details
Main Authors: Lukas Wagner, Richa Agarwala
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2025-01-01
Series: PLoS ONE
Online Access: https://doi.org/10.1371/journal.pone.0322663
Description: Surveillance for food safety in the United States of America is a collaborative effort among public health agencies, with additional partners worldwide contributing sequence data. Assemblies in GenBank and sequence reads in the Sequence Read Archive for surveilled species are received and rapidly analyzed, and the results are published publicly, by an automated pathogen detection pipeline at the National Center for Biotechnology Information. The pipeline detects close isolates with a recent common ancestor by finding single nucleotide polymorphisms (SNPs) between the genomes of pairs of isolates. Very few vertically transmitted SNPs are expected between a pair of close isolates; any genomic region with many SNPs relative to the number of SNPs in the rest of the genome indicates a horizontally transferred region that must be excluded when counting vertically transmitted SNPs. We developed dTOURS, which adapts the ratio statistic for finding outliers to the problem of finding regions of high SNP density in a pair of genomes where isolates typically have fragmented genome assemblies. Simulations for choosing the dTOURS parameter are presented. We illustrate the correctness of dTOURS using five published outbreaks, one for each of five bacterial species that cause many foodborne outbreaks or lead to a high mortality rate. A comparison with Gubbins shows that, while both Gubbins and dTOURS use the ratio statistic, the implementation in dTOURS is more robust for finding close isolates in outbreak analysis. A comparison with the method used by the Food and Drug Administration shows that their method is simple and fast but not sensitive.
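As a rough illustration of the idea described above, and not the published dTOURS algorithm, the following Python sketch compares the SNP density inside each fixed-size window with the density in the rest of the genome and tags windows where the ratio is large. The function name `dense_windows`, the fixed window size, the pseudocount, and the threshold `min_ratio` are all assumptions made for this example; the article's actual statistic, its parameter, and its handling of fragmented assemblies are not reproduced here.

```python
from bisect import bisect_left


def dense_windows(snp_positions, genome_length, window=1000, min_ratio=10.0):
    """Tag windows whose SNP density is high relative to the rest of the genome.

    Hypothetical illustration only: fixed, non-overlapping windows on a
    single contiguous sequence, with an arbitrary ratio threshold.
    """
    positions = sorted(snp_positions)
    total = len(positions)
    flagged = []
    for start in range(0, genome_length, window):
        end = min(start + window, genome_length)
        # Count SNPs with start <= position < end.
        inside = bisect_left(positions, end) - bisect_left(positions, start)
        rest_len = genome_length - (end - start)
        if inside == 0 or rest_len <= 0:
            continue
        inside_density = inside / (end - start)
        # Pseudocount of 1 (an assumption) avoids dividing by zero when
        # every SNP falls inside the current window.
        outside_density = (total - inside + 1) / rest_len
        if inside_density / outside_density >= min_ratio:
            flagged.append((start, end))
    return flagged


# Toy data: 20 clustered SNPs (a candidate horizontally transferred region)
# plus 3 scattered SNPs on a 100 kb sequence.
clustered = [5000 + 10 * i for i in range(20)]
scattered = [20000, 60000, 90000]
regions = dense_windows(clustered + scattered, genome_length=100_000)
```

In this toy input only the window holding the cluster is tagged; the three isolated SNPs fall below the threshold and would still be counted as candidate vertically transmitted differences.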
ISSN: 1932-6203
Citation: Lukas Wagner, Richa Agarwala. dTOURS: Dense-region tagging for outbreak detection using ratio statistics. PLoS ONE 20(5): e0322663 (2025-01-01). https://doi.org/10.1371/journal.pone.0322663