Accounting for reporting delays in real-time phylodynamic analyses with preferential sampling.

The COVID-19 pandemic demonstrated that fast and accurate analysis of continually collected infectious disease surveillance data is crucial for situational awareness and policy making. Coalescent-based phylodynamic analysis can use genetic sequences of a pathogen to estimate changes in its effective...

Bibliographic Details
Main Authors: Catalina M Medina, Julia A Palacios, Volodymyr M Minin
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2025-05-01
Series: PLoS Computational Biology
Online Access: https://doi.org/10.1371/journal.pcbi.1012970
author Catalina M Medina
Julia A Palacios
Volodymyr M Minin
collection DOAJ
description The COVID-19 pandemic demonstrated that fast and accurate analysis of continually collected infectious disease surveillance data is crucial for situational awareness and policy making. Coalescent-based phylodynamic analysis can use genetic sequences of a pathogen to estimate changes in its effective population size, a measure of genetic diversity. These changes in effective population size can be connected to the changes in the number of infections in the population of interest under certain conditions. Phylodynamics is an important set of tools because its methods are often resilient to the ascertainment biases present in traditional surveillance data (e.g., preferentially testing symptomatic individuals). Unfortunately, it takes weeks or months to sequence and deposit the sampled pathogen genetic sequences into a database, making them available for such analyses. These reporting delays severely decrease precision of phylodynamic methods closer to present time, and for some models can lead to extreme biases. Here we present a method that affords reliable estimation of the effective population size trajectory closer to the time of data collection, allowing for policy decisions to be based on more recent data. Our work uses readily available historic times between sampling and reporting of sequenced samples for a population of interest, and incorporates this information into the sampling model to mitigate the effects of reporting delay in real-time analyses. We illustrate our methodology on simulated data and on SARS-CoV-2 sequences collected in the state of Washington in 2021.
format Article
id doaj-art-5d9f4241a80b48e99fff206aa79a592d
institution DOAJ
issn 1553-734X
1553-7358
language English
publishDate 2025-05-01
publisher Public Library of Science (PLoS)
record_format Article
series PLoS Computational Biology
spelling doaj-art-5d9f4241a80b48e99fff206aa79a592d | 2025-08-20T03:21:31Z | eng | Public Library of Science (PLoS) | PLoS Computational Biology | 1553-734X | 1553-7358 | 2025-05-01 | 21 | 5 | e1012970 | 10.1371/journal.pcbi.1012970 | Accounting for reporting delays in real-time phylodynamic analyses with preferential sampling. | Catalina M Medina | Julia A Palacios | Volodymyr M Minin | https://doi.org/10.1371/journal.pcbi.1012970
title Accounting for reporting delays in real-time phylodynamic analyses with preferential sampling.
url https://doi.org/10.1371/journal.pcbi.1012970