A flexible framework for local-level estimation of the effective reproductive number in geographic regions with sparse data

Abstract Background Our research focuses on local-level estimation of the effective reproductive number, which describes the transmissibility of an infectious disease and represents the average number of individuals one infectious person infects at a given time. The ability to accurately estimate th...

Bibliographic Details
Main Authors: Md Sakhawat Hossain, Ravi Goyal, Natasha K. Martin, Victor DeGruttola, Mohammad Mihrab Chowdhury, Christopher McMahan, Lior Rennert
Format: Article
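Language: English
Published: BMC 2025-03-01
Series: BMC Medical Research Methodology
Subjects: Infectious disease modeling; Bayesian statistics; Effective reproductive number; Small area estimation; Prediction
Online Access: https://doi.org/10.1186/s12874-025-02525-1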
collection DOAJ
description Abstract
Background: Our research focuses on local-level estimation of the effective reproductive number, which describes the transmissibility of an infectious disease and represents the average number of individuals one infectious person infects at a given time. The ability to accurately estimate the infectious disease reproductive number in geographically granular regions is critical for disaster planning and resource allocation. However, not all regions have sufficient infectious disease outcome data; this lack of data presents a significant challenge for accurate estimation.
Methods: To overcome this challenge, we propose a two-step approach that incorporates existing $R_t$ estimation procedures (EpiEstim, EpiFilter, EpiNow2), applied to data from geographic regions with sufficient data (step 1), into a covariate-adjusted Bayesian Integrated Nested Laplace Approximation (INLA) spatial model to predict $R_t$ in regions with sparse or missing data (step 2). Our flexible framework effectively allows us to implement any existing estimation procedure for $R_t$ in regions with coarse or entirely missing data. We perform external validation and a simulation study to evaluate the proposed method and assess its predictive performance.
Results: We applied our method to estimate $R_t$ using data from South Carolina (SC) counties and ZIP codes during the first COVID-19 wave (‘Wave 1’, June 16, 2020 – August 31, 2020) and the second wave (‘Wave 2’, December 16, 2020 – March 2, 2021). Among the three methods used in the first step, EpiNow2 yielded the highest accuracy of $R_t$ prediction in the regions with entirely missing data. Median county-level percentage agreement (PA) was 90.9% (interquartile range, IQR: 89.9–92.0%) and 92.5% (IQR: 91.6–93.4%) for Waves 1 and 2, respectively. Median ZIP code-level PA was 95.2% (IQR: 94.4–95.7%) and 96.5% (IQR: 95.8–97.1%) for Waves 1 and 2, respectively. EpiEstim, EpiFilter, and an ensemble-based approach yielded median PA ranging from 81.9–90.0%, 87.2–92.1%, and 88.4–90.9%, respectively, across both waves and geographic granularities.
Conclusion: These findings demonstrate that the proposed methodology is a useful tool for small-area estimation of $R_t$, as our flexible framework yields high prediction accuracy for regions with coarse or missing data.
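The two-step idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation and rests on loud assumptions: step 1 uses a crude renewal-equation ratio in place of EpiEstim, EpiFilter, or EpiNow2, and step 2 replaces the covariate-adjusted INLA spatial model with a plain average over neighbouring regions. The function names, region labels, case counts, and serial-interval weights are all hypothetical.

```python
from typing import Dict, List, Optional

Series = List[Optional[float]]


def rt_renewal(incidence: List[float], si_weights: List[float]) -> Series:
    """Step 1 stand-in: crude R_t = I_t / sum_s w_s * I_{t-s}.

    si_weights[s-1] is the assumed probability that the serial interval
    equals s days. Returns None until a full window of past data exists.
    """
    rt: Series = []
    for t in range(len(incidence)):
        if t < len(si_weights):
            rt.append(None)  # not enough history yet
            continue
        # Total infectiousness: past incidence weighted by the serial interval.
        lam = sum(w * incidence[t - s]
                  for s, w in enumerate(si_weights, start=1))
        rt.append(incidence[t] / lam if lam > 0 else None)
    return rt


def predict_from_neighbors(rt_by_region: Dict[str, Series],
                           neighbors: Dict[str, List[str]],
                           target: str) -> Series:
    """Step 2 stand-in: average the neighbours' R_t at each time point.

    This is a simple spatial smoother, NOT the INLA model from the paper.
    """
    series = [rt_by_region[n] for n in neighbors[target] if n in rt_by_region]
    out: Series = []
    for values in zip(*series):
        known = [v for v in values if v is not None]
        out.append(sum(known) / len(known) if known else None)
    return out


# Hypothetical example: regions A and B have data, region C has none.
weights = [0.5, 0.5]  # assumed two-day serial interval distribution
rt_a = rt_renewal([10, 10, 10, 10], weights)
rt_b = rt_renewal([10, 10, 30, 30], weights)
rt_c = predict_from_neighbors({"A": rt_a, "B": rt_b}, {"C": ["A", "B"]}, "C")
```

In the framework described in the abstract, any of the three named estimators could take the place of `rt_renewal`, and the spatial model would additionally draw on covariates, which this sketch omits.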
id doaj-art-cad13b02dfda4bd2918fb2cb43d0e221
institution Kabale University
issn 1471-2288
spelling doaj-art-cad13b02dfda4bd2918fb2cb43d0e221 2025-08-20T04:01:36Z eng BMC BMC Medical Research Methodology 1471-2288 2025-03-01 25 1 1 11 10.1186/s12874-025-02525-1
A flexible framework for local-level estimation of the effective reproductive number in geographic regions with sparse data
Md Sakhawat Hossain (0), Ravi Goyal (1), Natasha K. Martin (2), Victor DeGruttola (3), Mohammad Mihrab Chowdhury (4), Christopher McMahan (5), Lior Rennert (6)
0 Department of Public Health Sciences, Clemson University
1 Division of Infectious Diseases & Global Public Health, University of California San Diego
2 Division of Infectious Diseases & Global Public Health, University of California San Diego
3 Division of Biostatistics, Herbert Wertheim School of Public Health and Longevity Science, University of California San Diego
4 Department of Public Health Sciences, Clemson University
5 Center for Public Health Modeling and Response, Clemson University
6 Department of Public Health Sciences, Clemson University
title A flexible framework for local-level estimation of the effective reproductive number in geographic regions with sparse data
topic Infectious disease modeling
Bayesian statistics
Effective reproductive number
Small area estimation
Prediction
url https://doi.org/10.1186/s12874-025-02525-1