HIV-phyloTSI: subtype-independent estimation of time since HIV-1 infection for cross-sectional measures of population incidence using deep sequence data

Bibliographic Details
Main Authors: Tanya Golubchik, Lucie Abeler-Dörner, Matthew Hall, Chris Wymant, David Bonsall, George Macintyre-Cockett, Laura Thomson, Jared M. Baeten, Connie L. Celum, Ronald M. Galiwango, Barry Kosloff, Mohammed Limbada, Andrew Mujugira, Nelly R. Mugo, Astrid Gall, François Blanquart, Margreet Bakker, Daniela Bezemer, Swee Hoe Ong, Jan Albert, Norbert Bannert, Jacques Fellay, Barbara Gunsenheimer-Bartmeyer, Huldrych F. Günthard, Pia Kivelä, Roger D. Kouyos, Laurence Meyer, Kholoud Porter, Ard van Sighem, Mark van der Valk, Ben Berkhout, Paul Kellam, Marion Cornelissen, Peter Reiss, Helen Ayles, David N. Burns, Sarah Fidler, Mary Kate Grabowski, Richard Hayes, Joshua T. Herbeck, Joseph Kagaayi, Pontiano Kaleebu, Jairam R. Lingappa, Deogratius Ssemwanga, Susan H. Eshleman, Myron S. Cohen, Oliver Ratmann, Oliver Laeyendecker, Christophe Fraser, the HPTN 071 (PopART) Phylogenetics protocol team, the BEEHIVE consortium and the PANGEA consortium
Format: Article
Language: English
Published: BMC 2025-08-01
Series: BMC Bioinformatics
Online Access: https://doi.org/10.1186/s12859-025-06189-y
Description
Summary:
Background: Estimating the time since HIV infection (TSI) at population level is essential for tracking changes in the global HIV epidemic. Most methods for determining TSI give a binary classification of infections as recent or non-recent within a window of several months, and cannot assess the cumulative impact of an intervention.
Results: We developed a Random Forest Regression model, HIV-phyloTSI, which combines measures of within-host diversity and divergence to generate continuous TSI estimates directly from viral deep-sequencing data, with no need for additional variables. HIV-phyloTSI provides a continuous measure of TSI up to 9 years, with a mean absolute error of less than 12 months overall and less than 5 months for infections with a TSI of up to a year. It performs equally well for all major HIV subtypes based on data from African and European cohorts.
Conclusions: We demonstrate how HIV-phyloTSI can be used for incidence estimates on a population level.
ISSN: 1471-2105
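
The summary reports accuracy as a mean absolute error (MAE), both overall and for infections with a TSI of up to a year. As a minimal sketch of how such a stratified MAE is computed, the Python below uses made-up (true, estimated) TSI pairs in months. The values, the `samples` list and the `mean_absolute_error` helper are illustrative assumptions, not study data and not part of HIV-phyloTSI.

```python
from statistics import mean


def mean_absolute_error(pairs):
    """Mean absolute difference between estimated and true TSI, in months."""
    return mean(abs(estimated - true) for true, estimated in pairs)


# Hypothetical (true TSI, estimated TSI) pairs in months.
# Illustrative values only; these are NOT results from the article.
samples = [(3, 6), (8, 5), (11, 14), (24, 30), (48, 40), (96, 84)]

# MAE across all samples.
overall_mae = mean_absolute_error(samples)

# MAE restricted to infections with a true TSI of up to one year.
recent = [(true, est) for true, est in samples if true <= 12]
recent_mae = mean_absolute_error(recent)

print(f"Overall MAE: {overall_mae:.1f} months")
print(f"MAE for TSI <= 12 months: {recent_mae:.1f} months")
```

Reporting the error separately for the first year matters because a single overall figure can hide how well a method resolves recent infections, which are the ones that drive incidence estimates.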