Abstract
Inferring the direction of transmission between linked individuals living with HIV provides unparalleled power to understand the epidemiology that determines transmission. State-of-the-art approaches to infer directionality use phylogenetic ancestral state reconstruction to identify the individual in whom the most recent common ancestor of the virus populations originated. However, these methods vary in their accuracy when applied to different datasets, and it is currently unclear under what circumstances inferring directionality is inaccurate and when bias is more likely. To evaluate the performance of phylogenetic ancestral state reconstruction, we first inferred directionality for 112 HIV transmission pairs for which the direction of transmission was known and detailed additional information was available. Second, we fit a statistical model to evaluate the extent to which epidemiological, sampling, genetic and phylogenetic factors influenced the outcome of the inference. Third, we repeated the analysis under real-life conditions, when only routinely collected data are available. We found that the inference of directionality depends principally on the topology class and branch length characteristics of the phylogeny. Specifically, directionality is most accurately inferred when the phylogenetic diversity and the minimum root-to-tip distance in the transmitter are greater than those in the recipient partner and when the minimum inter-host patristic distance is large. Similarly, under real-life conditions, the probability of identifying the correct transmitter increases from 52% (when a monophyletic-monophyletic or paraphyletic-polyphyletic tree topology is observed, the sample size in both partners is small, and the tip closest to the root does not agree with the state at the root) to 93% (when a paraphyletic-monophyletic topology is observed, the sample size is large, and the tip closest to the root agrees with the state at the root).
Our results support two conclusions. First, discordance between previous studies in inferring transmission direction can be explained by differences in key phylogenetic properties that arise from different evolutionary, epidemiological and sampling processes. Second, easily calculated metrics from the phylogenetic tree of the transmission pair can be used to evaluate the accuracy of inferring directionality under real-life conditions for use in population-wide studies. However, because these methods entail considerable uncertainty, we strongly advise against using them for individual pair-level analysis.
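To make the tree-derived metrics named above concrete, the following is a minimal Python sketch, not the authors' pipeline. It assumes a rooted tree containing tips from exactly two hosts, uses Fitch parsimony as a simple stand-in for ancestral state reconstruction, and classifies topology as MM, PM or PP by counting how many hosts are monophyletic. The `Node` class, all function names, and the toy tree with its branch lengths are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    length: float = 0.0             # branch length to the parent
    host: Optional[str] = None      # host label; set on tips only
    children: List["Node"] = field(default_factory=list)


def all_nodes(node):
    yield node
    for child in node.children:
        yield from all_nodes(child)


def tip_hosts(node):
    """Host labels of all tips below (or at) this node."""
    if not node.children:
        return [node.host]
    return [h for child in node.children for h in tip_hosts(child)]


def fitch_states(node):
    """Fitch parsimony, bottom-up pass: intersection if non-empty, else union."""
    if not node.children:
        return {node.host}
    sets = [fitch_states(child) for child in node.children]
    common = set.intersection(*sets)
    return common if common else set.union(*sets)


def is_monophyletic(root, host):
    """True if some clade contains all tips of `host` and nothing else."""
    total = tip_hosts(root).count(host)
    return any(set(hs) == {host} and len(hs) == total
               for hs in map(tip_hosts, all_nodes(root)))


def topology_class(root):
    """Simplified two-host classification (assumption): MM, PM or PP."""
    hosts = sorted(set(tip_hosts(root)))
    n_mono = sum(is_monophyletic(root, h) for h in hosts)
    return {2: "MM", 1: "PM", 0: "PP"}[n_mono]


def root_to_tip(node, depth=0.0):
    """(host, distance from root) for every tip."""
    if not node.children:
        return [(node.host, depth)]
    return [x for c in node.children for x in root_to_tip(c, depth + c.length)]


def nearest_and_best(node):
    """Return (nearest tip distance per host below node, min inter-host distance)."""
    if not node.children:
        return {node.host: 0.0}, float("inf")
    nearest, best = {}, float("inf")
    for child in node.children:
        child_near, child_best = nearest_and_best(child)
        best = min(best, child_best)
        child_near = {h: d + child.length for h, d in child_near.items()}
        # Pair tips under this child with other-host tips under earlier siblings.
        for h1, d1 in child_near.items():
            for h2, d2 in nearest.items():
                if h1 != h2:
                    best = min(best, d1 + d2)
        for h, d in child_near.items():
            nearest[h] = min(d, nearest.get(h, float("inf")))
    return nearest, best


def summarize(root):
    states = fitch_states(root)
    root_state = next(iter(states)) if len(states) == 1 else None  # None = ambiguous
    min_rtt = {}
    for host, d in root_to_tip(root):
        min_rtt[host] = min(d, min_rtt.get(host, float("inf")))
    closest = min(min_rtt, key=min_rtt.get)
    return {
        "root_state": root_state,
        "topology": topology_class(root),
        "min_root_to_tip": min_rtt,
        "closest_tip_host": closest,
        "closest_tip_agrees_with_root": closest == root_state,
        "min_interhost_patristic": nearest_and_best(root)[1],
    }


def tip(host, length):
    return Node(length=length, host=host)


# Toy tree (hypothetical): host B forms a clade nested within host A's tips.
tree = Node(children=[
    tip("A", 0.02),
    Node(length=0.01, children=[
        tip("A", 0.03),
        Node(length=0.02, children=[
            tip("A", 0.02),
            Node(length=0.01, children=[tip("B", 0.01), tip("B", 0.02)]),
        ]),
    ]),
])

print(summarize(tree))
```

For this toy tree the sketch reports root state A and a PM topology, with the tip closest to the root also belonging to A. How such metrics relate to the accuracy of the inferred direction is what the fitted statistical model addresses; the sketch only computes the inputs.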
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
CJVA and KEA were funded by an ERC Starting Grant (award number 757688) awarded to KEA. MH was funded by The HIV Prevention Trials Network (grant number H5R00701.CR00.01) and The Bill and Melinda Gates Foundation (grant number OPP1175094). JACB was supported by the MRC Precision Medicine Doctoral Training Programme (ref: 2259239). KAL was supported by The Wellcome Trust and The Royal Society grant no. 107652/Z/15/Z. JB received support from the UK MRC and the UK DFID (#MR/R010161/1) under the MRC/DFID Concordat agreement and as part of the EDCTP2 Programme supported by the European Union.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
Ethics approval was not required for this study. This study uses publicly available genetic and epidemiological data that were generated in previous studies and that were collated and described in 10.1126/science.aba5443.
All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.
Yes
Data Availability
This study uses publicly available genetic and epidemiological data that were generated in previous studies and that were collated and described in 10.1126/science.aba5443. The data can be retrieved from the Los Alamos HIV and GenBank databases.