Measuring quality of DNA sequence data via degradation.
We formulate and apply a novel paradigm for characterization of genome data quality, which quantifies the effects of intentional degradation of quality. The rationale is that the higher the initial quality, the more fragile the genome and the greater the effects of degradation. We demonstrate that t...
| Main Authors: | Alan F Karr, Jason Hauzel, Adam A Porter, Marcel Schaefer |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Public Library of Science (PLoS), 2022-01-01 |
| Series: | PLoS ONE |
| Online Access: | https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0271970&type=printable |
| _version_ | 1850023507601129472 |
|---|---|
| author | Alan F Karr; Jason Hauzel; Adam A Porter; Marcel Schaefer |
| author_facet | Alan F Karr; Jason Hauzel; Adam A Porter; Marcel Schaefer |
| author_sort | Alan F Karr |
| collection | DOAJ |
| description | We formulate and apply a novel paradigm for characterization of genome data quality, which quantifies the effects of intentional degradation of quality. The rationale is that the higher the initial quality, the more fragile the genome and the greater the effects of degradation. We demonstrate that this phenomenon is ubiquitous, and that quantified measures of degradation can be used for multiple purposes, illustrated by outlier detection. We focus on identifying outliers that may be problematic with respect to data quality, but might also be true anomalies or even attempts to subvert the database. |
| format | Article |
| id | doaj-art-be2a08f11c7c46ccab3e23161ca7d5c8 |
| institution | DOAJ |
| issn | 1932-6203 |
| language | English |
| publishDate | 2022-01-01 |
| publisher | Public Library of Science (PLoS) |
| record_format | Article |
| series | PLoS ONE |
| title | Measuring quality of DNA sequence data via degradation. |
| url | https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0271970&type=printable |