dTOURS: Dense-region tagging for outbreak detection using ratio statistics.
| Main Authors: | Lukas Wagner, Richa Agarwala |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Public Library of Science (PLoS), 2025-01-01 |
| Series: | PLoS ONE |
| Online Access: | https://doi.org/10.1371/journal.pone.0322663 |
| _version_ | 1850265861894438912 |
|---|---|
| author | Lukas Wagner; Richa Agarwala |
| author_facet | Lukas Wagner; Richa Agarwala |
| author_sort | Lukas Wagner |
| collection | DOAJ |
| description | Food safety surveillance in the United States of America is a collaborative effort among public health agencies, with additional partners worldwide contributing sequence data. Assemblies in GenBank and sequence reads in the Sequence Read Archive for surveilled species are received and rapidly analyzed, and the results are published by an automated pathogen detection pipeline at the National Center for Biotechnology Information. The pipeline detects close isolates with a recent common ancestor by finding single nucleotide polymorphisms (SNPs) between the genomes of pairs of isolates. Very few vertically transmitted SNPs are expected between a pair of close isolates; any genomic region with many SNPs relative to the number of SNPs in the rest of the genome indicates a horizontally transferred region that must be excluded when counting vertically transmitted SNPs. We developed dTOURS, which adapts the ratio statistic for finding outliers to the problem of finding regions of high SNP density in a pair of genomes where isolates typically have fragmented genome assemblies. Simulations for choosing the dTOURS parameter are presented. We illustrate the correctness of dTOURS using five published outbreaks, one for each of five bacterial species that cause many foodborne outbreaks or lead to a high mortality rate. A comparison with Gubbins shows that, while both Gubbins and dTOURS use the ratio statistic, the implementation in dTOURS is more robust for finding close isolates in outbreak analysis. A comparison with the method used by the Food and Drug Administration shows that their method is simple and fast but not sensitive. |
| format | Article |
| id | doaj-art-3d5c503b4fcf461e90dacaece345905b |
| institution | OA Journals |
| issn | 1932-6203 |
| language | English |
| publishDate | 2025-01-01 |
| publisher | Public Library of Science (PLoS) |
| record_format | Article |
| series | PLoS ONE |
| spelling | doaj-art-3d5c503b4fcf461e90dacaece345905b; 2025-08-20T01:54:19Z; eng; Public Library of Science (PLoS); PLoS ONE; 1932-6203; 2025-01-01; 20; 5; e0322663; 10.1371/journal.pone.0322663; dTOURS: Dense-region tagging for outbreak detection using ratio statistics.; Lukas Wagner; Richa Agarwala; https://doi.org/10.1371/journal.pone.0322663 |
| spellingShingle | Lukas Wagner; Richa Agarwala; dTOURS: Dense-region tagging for outbreak detection using ratio statistics.; PLoS ONE |
| title | dTOURS: Dense-region tagging for outbreak detection using ratio statistics. |
| title_sort | dtours dense region tagging for outbreak detection using ratio statistics |
| url | https://doi.org/10.1371/journal.pone.0322663 |
| work_keys_str_mv | AT lukaswagner dtoursdenseregiontaggingforoutbreakdetectionusingratiostatistics AT richaagarwala dtoursdenseregiontaggingforoutbreakdetectionusingratiostatistics |
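
The description above turns on one idea: between two close isolates, a region carrying many SNPs relative to the rest of the genome is treated as horizontally transferred and left out of the vertical SNP count. The sketch below illustrates that idea only. It is not the dTOURS algorithm, whose statistic and parameter are not given in this record. The fixed window size, the `min_ratio` and `min_snps` thresholds, the pseudocount, and the function names are all assumptions made for illustration, and the example treats the genome as a single contig rather than a fragmented assembly.

```python
from bisect import bisect_left
from typing import List, Tuple


def dense_regions(snps: List[int], genome_len: int, window: int = 1000,
                  min_ratio: float = 10.0, min_snps: int = 3) -> List[Tuple[int, int]]:
    """Flag fixed windows whose SNP density is at least min_ratio times the
    density in the rest of the genome. All thresholds are hypothetical."""
    snps = sorted(snps)
    total = len(snps)
    regions: List[Tuple[int, int]] = []
    for start in range(0, genome_len, window):
        end = min(start + window, genome_len)
        # Number of SNP positions falling in [start, end).
        inside = bisect_left(snps, end) - bisect_left(snps, start)
        rest_len = genome_len - (end - start)
        if inside < min_snps or rest_len <= 0:
            continue
        in_density = inside / (end - start)
        # Pseudocount of 1 avoids division by zero when no SNPs lie outside.
        out_density = (total - inside + 1) / rest_len
        if in_density / out_density >= min_ratio:
            if regions and regions[-1][1] == start:
                # Merge with the adjacent flagged window.
                regions[-1] = (regions[-1][0], end)
            else:
                regions.append((start, end))
    return regions


def count_snps_outside(snps: List[int], regions: List[Tuple[int, int]]) -> int:
    """Count SNPs that fall outside every flagged region."""
    return sum(1 for p in snps if not any(s <= p < e for s, e in regions))


# Toy data: five SNPs clustered near position 5000 plus two isolated SNPs.
toy_snps = [100, 5000, 5010, 5020, 5030, 5040, 9000]
toy_regions = dense_regions(toy_snps, genome_len=10_000)
print(toy_regions, count_snps_outside(toy_snps, toy_regions))
```

On the toy data, the cluster near position 5000 is flagged and the two isolated SNPs remain in the count. A real pipeline would need to handle multiple contigs and choose its thresholds from simulation, as the description notes for the dTOURS parameter.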