HIV-phyloTSI: subtype-independent estimation of time since HIV-1 infection for cross-sectional measures of population incidence using deep sequence data
Abstract Background Estimating the time since HIV infection (TSI) at population level is essential for tracking changes in the global HIV epidemic. Most methods for determining TSI give a binary classification of infections as recent or non-recent within a window of several months, and cannot assess...
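| Main Authors: | Tanya Golubchik, Lucie Abeler-Dörner, Matthew Hall, Chris Wymant, David Bonsall, George Macintyre-Cockett, Laura Thomson, Jared M. Baeten, Connie L. Celum, Ronald M. Galiwango, Barry Kosloff, Mohammed Limbada, Andrew Mujugira, Nelly R. Mugo, Astrid Gall, François Blanquart, Margreet Bakker, Daniela Bezemer, Swee Hoe Ong, Jan Albert, Norbert Bannert, Jacques Fellay, Barbara Gunsenheimer-Bartmeyer, Huldrych F. Günthard, Pia Kivelä, Roger D. Kouyos, Laurence Meyer, Kholoud Porter, Ard van Sighem, Mark van der Valk, Ben Berkhout, Paul Kellam, Marion Cornelissen, Peter Reiss, Helen Ayles, David N. Burns, Sarah Fidler, Mary Kate Grabowski, Richard Hayes, Joshua T. Herbeck, Joseph Kagaayi, Pontiano Kaleebu, Jairam R. Lingappa, Deogratius Ssemwanga, Susan H. Eshleman, Myron S. Cohen, Oliver Ratmann, Oliver Laeyendecker, Christophe Fraser, the HPTN 071 (PopART) Phylogenetics protocol team, the BEEHIVE consortium and the PANGEA consortium |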
|---|---|
| Format: | Article |
| Language: | English |
| Published: | BMC, 2025-08-01 |
| Series: | BMC Bioinformatics |
| Subjects: | HIV; Next-generation sequencing; Random forest; Recency of infection; Time since infection |
| Online Access: | https://doi.org/10.1186/s12859-025-06189-y |
| _version_ | 1849738049800372224 |
|---|---|
| author | Tanya Golubchik, Lucie Abeler-Dörner, Matthew Hall, Chris Wymant, David Bonsall, George Macintyre-Cockett, Laura Thomson, Jared M. Baeten, Connie L. Celum, Ronald M. Galiwango, Barry Kosloff, Mohammed Limbada, Andrew Mujugira, Nelly R. Mugo, Astrid Gall, François Blanquart, Margreet Bakker, Daniela Bezemer, Swee Hoe Ong, Jan Albert, Norbert Bannert, Jacques Fellay, Barbara Gunsenheimer-Bartmeyer, Huldrych F. Günthard, Pia Kivelä, Roger D. Kouyos, Laurence Meyer, Kholoud Porter, Ard van Sighem, Mark van der Valk, Ben Berkhout, Paul Kellam, Marion Cornelissen, Peter Reiss, Helen Ayles, David N. Burns, Sarah Fidler, Mary Kate Grabowski, Richard Hayes, Joshua T. Herbeck, Joseph Kagaayi, Pontiano Kaleebu, Jairam R. Lingappa, Deogratius Ssemwanga, Susan H. Eshleman, Myron S. Cohen, Oliver Ratmann, Oliver Laeyendecker, Christophe Fraser, the HPTN 071 (PopART) Phylogenetics protocol team, the BEEHIVE consortium and the PANGEA consortium |
| author_sort | Tanya Golubchik |
| collection | DOAJ |
| description | Abstract. Background: Estimating the time since HIV infection (TSI) at population level is essential for tracking changes in the global HIV epidemic. Most methods for determining TSI give a binary classification of infections as recent or non-recent within a window of several months, and cannot assess the cumulative impact of an intervention. Results: We developed a Random Forest Regression model, HIV-phyloTSI, which combines measures of within-host diversity and divergence to generate continuous TSI estimates directly from viral deep-sequencing data, with no need for additional variables. HIV-phyloTSI provides a continuous measure of TSI up to 9 years, with a mean absolute error of less than 12 months overall and less than 5 months for infections with a TSI of up to a year. It performs equally well for all major HIV subtypes based on data from African and European cohorts. Conclusions: We demonstrate how HIV-phyloTSI can be used for incidence estimates on a population level. |
| format | Article |
| id | doaj-art-d0cd34dad9aa4ecb9eede01d14c17bbd |
| institution | DOAJ |
| issn | 1471-2105 |
| language | English |
| publishDate | 2025-08-01 |
| publisher | BMC |
| record_format | Article |
| series | BMC Bioinformatics |
| spelling | doaj-art-d0cd34dad9aa4ecb9eede01d14c17bbd; 2025-08-20T03:06:43Z; eng; BMC; BMC Bioinformatics; 1471-2105; 2025-08-01; vol. 26, no. 1, pp. 1-21; 10.1186/s12859-025-06189-y. Title: HIV-phyloTSI: subtype-independent estimation of time since HIV-1 infection for cross-sectional measures of population incidence using deep sequence data. Authors and affiliations: Tanya Golubchik, Lucie Abeler-Dörner, Matthew Hall, Chris Wymant, David Bonsall, George Macintyre-Cockett, Laura Thomson (all: Pandemic Sciences Institute and Big Data Institute, Nuffield Department of Medicine, University of Oxford); Jared M. Baeten (Department of Global Health, University of Washington); Connie L. Celum (Department of Global Health, University of Washington); Ronald M. Galiwango (Rakai Health Sciences Program); Barry Kosloff (London School of Hygiene and Tropical Medicine); Mohammed Limbada (London School of Hygiene and Tropical Medicine); Andrew Mujugira (Infectious Diseases Institute, Makerere University); Nelly R. Mugo (Department of Global Health, University of Washington); Astrid Gall (European Molecular Biology Laboratory, European Bioinformatics Institute); François Blanquart (Centre for Interdisciplinary Research in Biology (CIRB), Collège de France, CNRS, INSERM, PSL Research University); Margreet Bakker (Medical Microbiology and Infection Prevention, Amsterdam UMC, Location AMC); Daniela Bezemer (Stichting HIV Monitoring, Amsterdam UMC, Location AMC); Swee Hoe Ong (Wellcome Sanger Institute); Jan Albert (Department of Microbiology, Tumor and Cell Biology, Karolinska Institutet); Norbert Bannert (Division for HIV and Other Retroviruses, Robert Koch Institute); Jacques Fellay (School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne); Barbara Gunsenheimer-Bartmeyer (Department of Infectious Disease Epidemiology, Robert Koch-Institute); Huldrych F. Günthard (Division of Infectious Diseases and Hospital Epidemiology, University Hospital Zurich); Pia Kivelä (Department of Infectious Diseases, Helsinki University Hospital); Roger D. Kouyos (Division of Infectious Diseases and Hospital Epidemiology, University Hospital Zurich); Laurence Meyer (INSERM CESP U1018, APHP, Service de Santé Publique, Hôpital de Bicêtre, Université Paris Saclay); Kholoud Porter (Institute for Global Health, University College London); Ard van Sighem (Stichting HIV Monitoring, Amsterdam UMC, Location AMC); Mark van der Valk (Amsterdam UMC Location Meibergdreef); Ben Berkhout (Medical Microbiology and Infection Prevention, Amsterdam UMC, Location AMC); Paul Kellam (Department of Infectious Diseases, Department of Medicine, Imperial College London); Marion Cornelissen (Medical Microbiology and Infection Prevention, Amsterdam UMC, Location AMC); Peter Reiss (Stichting HIV Monitoring, Amsterdam UMC, Location AMC); Helen Ayles (London School of Hygiene and Tropical Medicine); David N. Burns (Division of AIDS, National Institute of Allergy and Infectious Diseases, National Institutes of Health); Sarah Fidler (Department of Infectious Disease Epidemiology, Imperial College); Mary Kate Grabowski (Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health); Richard Hayes (London School of Hygiene and Tropical Medicine); Joshua T. Herbeck (Institute for Disease Modeling); Joseph Kagaayi (Rakai Health Sciences Program); Pontiano Kaleebu (Medical Research Council (MRC), Uganda Virus Research Institute (UVRI)); Jairam R. Lingappa (Department of Global Health, University of Washington); Deogratius Ssemwanga (Medical Research Council (MRC), Uganda Virus Research Institute (UVRI)); Susan H. Eshleman (Department of Pathology, Johns Hopkins University School of Medicine); Myron S. Cohen (Department of Medicine, University of North Carolina at Chapel Hill); Oliver Ratmann (Department of Mathematics and Imperial-X, Imperial College); Oliver Laeyendecker (Division of Intramural Research, National Institute of Allergy and Infectious Disease, National Institutes of Medicine); Christophe Fraser (Pandemic Sciences Institute and Big Data Institute, Nuffield Department of Medicine, University of Oxford); the HPTN 071 (PopART) Phylogenetics protocol team, the BEEHIVE consortium and the PANGEA consortium. Abstract. Background: Estimating the time since HIV infection (TSI) at population level is essential for tracking changes in the global HIV epidemic. Most methods for determining TSI give a binary classification of infections as recent or non-recent within a window of several months, and cannot assess the cumulative impact of an intervention. Results: We developed a Random Forest Regression model, HIV-phyloTSI, which combines measures of within-host diversity and divergence to generate continuous TSI estimates directly from viral deep-sequencing data, with no need for additional variables. HIV-phyloTSI provides a continuous measure of TSI up to 9 years, with a mean absolute error of less than 12 months overall and less than 5 months for infections with a TSI of up to a year. It performs equally well for all major HIV subtypes based on data from African and European cohorts. Conclusions: We demonstrate how HIV-phyloTSI can be used for incidence estimates on a population level. URL: https://doi.org/10.1186/s12859-025-06189-y. Keywords: HIV; Next-generation sequencing; Random forest; Recency of infection; Time since infection |
| title | HIV-phyloTSI: subtype-independent estimation of time since HIV-1 infection for cross-sectional measures of population incidence using deep sequence data |
| title_sort | hiv phylotsi subtype independent estimation of time since hiv 1 infection for cross sectional measures of population incidence using deep sequence data |
| topic | HIV; Next-generation sequencing; Random forest; Recency of infection; Time since infection |
| url | https://doi.org/10.1186/s12859-025-06189-y |