Predicting communities with high tuberculosis case-finding efficiency to optimise resource allocation in Pakistan: comparing the performance of a negative binomial spatial lag model with a Bayesian machine-learning model
| Main Authors: | , , , , , , , , , , , , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | BMJ Publishing Group, 2025-05-01 |
| Series: | BMJ Public Health |
| Online Access: | https://bmjpublichealth.bmj.com/content/3/1/e001424.full |
| Summary: | Introduction: Despite progress in tuberculosis (TB) treatment coverage in recent years, an estimated 183 000 people with TB may not have been diagnosed in Pakistan in 2022. Models are therefore needed to steer active case finding (ACF) towards populations with a high probability of undetected TB. This study aimed to cross-validate the TB positivity rate predictions of an existing Bayesian machine learning (BML) model in ACF settings against those of a simpler frequentist model. Methods: We conducted a retrospective analysis of cross-sectional data to identify predictors of detection of bacteriologically confirmed TB cases during ACF events in Pakistan. A predictive negative binomial regression (NBR) model was created, and the presence of spatial autocorrelation was examined to account for spatial dependencies in the outcome variable. The NBR and BML models were compared on their predictive precision in identifying TB hotspots, based on root mean squared error (RMSE) values, k-fold cross-validation and tehsil-level (sub-district) prediction rankings. Results: A total of 407 (1.9%) bacteriologically confirmed cases were detected among 21 227 visitors at 414 ACF events between September 2020 and January 2022. In the final NBR model, the spatial lag variable explained most of the variation in TB positivity rates across ACF events. NBR and BML predictions were similar at tehsil level. While the BML model had a slightly lower RMSE (1.02 vs 1.03), the NBR model had a slightly better fit based on the Akaike information criterion. Conclusions: Statistical models can be effective in predicting TB hotspots for ACF planning, and the relatively simple NBR model was nearly as effective as the more complex BML model. The predictions of the two modelling approaches were similar, suggesting that predictions are driven more by covariates than by the modelling framework. The agreement between model results increases confidence in the potential utility of models to spatially target ACF activities in high-need, low-access areas. |
|---|---|
| ISSN: | 2753-4294 |
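
The summary refers to three quantities that are easy to illustrate: the TB positivity rate, a spatial lag variable, and the root mean squared error used to compare the two models. The Python sketch below is only an illustration under stated assumptions, not the authors' implementation. The record does not say how neighbouring areas were defined or weighted, so the neighbour list, the equal (row-standardised) weights, the unit names `A`, `B`, `C` and all toy rates are hypothetical. Only the counts 407 and 21 227 come from the summary.

```python
from math import sqrt


def positivity_rate(cases, screened):
    """Confirmed cases per 100 people screened."""
    return 100.0 * cases / screened


def spatial_lag(values, neighbours):
    """Average of each unit's neighbours' values.

    Assumes equal (row-standardised) weights, which is one common
    choice; the record does not state the weighting actually used.
    """
    return {
        unit: (sum(values[n] for n in nbrs) / len(nbrs)) if nbrs else 0.0
        for unit, nbrs in neighbours.items()
    }


def rmse(observed, predicted):
    """Root mean squared error between two equal-length sequences."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return sqrt(sum(squared) / len(squared))


# Overall positivity rate from the counts reported in the summary.
overall = positivity_rate(407, 21227)

# Hypothetical toy data: positivity rates (%) for three made-up areas.
rates = {"A": 1.0, "B": 3.0, "C": 2.0}
neighbours = {"A": ["B", "C"], "B": ["A"], "C": ["A", "B"]}
lags = spatial_lag(rates, neighbours)

# Hypothetical predictions from two competing models for areas A, B, C.
observed = [rates[u] for u in ("A", "B", "C")]
model_1 = [1.5, 2.5, 2.0]
model_2 = [2.0, 2.0, 2.0]

print(f"overall positivity: {overall:.1f}%")
print("spatial lag:", lags)
print(f"RMSE model 1: {rmse(observed, model_1):.3f}")
print(f"RMSE model 2: {rmse(observed, model_2):.3f}")
```

In a regression such as the NBR model described above, the lag values would enter as one more covariate alongside the other predictors. The model with the lower RMSE on held-out folds would be judged the more precise on that criterion.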