PT - JOURNAL ARTICLE
AU - Christian Julian Villabona-Arenas
AU - Stéphane Hué
AU - James A. C. Baxter
AU - Matthew Hall
AU - Katrina A. Lythgoe
AU - John Bradley
AU - Katherine E. Atkins
TI - USING PHYLOGENETICS TO INFER HIV-1 TRANSMISSION DIRECTION BETWEEN KNOWN TRANSMISSION PAIRS
AID - 10.1101/2021.05.12.21256968
DP - 2021 Jan 01
TA - medRxiv
PG - 2021.05.12.21256968
4099 - http://medrxiv.org/content/early/2021/09/07/2021.05.12.21256968.short
4100 - http://medrxiv.org/content/early/2021/09/07/2021.05.12.21256968.full
AB - Inferring the transmission direction between linked individuals living with HIV provides unparalleled power to understand the epidemiology that determines transmission. Phylogenetic ancestral state reconstruction approaches infer the transmission direction by identifying the individual in whom the most recent common ancestor of the virus populations originated. These methods vary in their accuracy, however, and it is unclear why. To evaluate the performance of phylogenetic ancestral state reconstruction, we inferred the transmission direction for 112 HIV transmission pairs where transmission direction was known and detailed additional information was available. We then fit a statistical model to evaluate the extent to which epidemiological, sampling, genetic and phylogenetic factors influenced the outcome of the inference. We repeated the analysis under real-life conditions with only routinely collected data. We found that the inference of transmission direction depends principally on the topology class and branch length characteristics of the phylogeny.
Under real-life conditions, the probability of identifying the correct transmission direction increases from 52% (when a monophyletic-monophyletic or paraphyletic-polyphyletic tree topology is observed, the sample size in both partners is small, and the tip closest to the root does not agree with the state at the root) to 93% (when a paraphyletic-monophyletic topology is observed, the sample size is large, and the tip closest to the root agrees with the root state). Our results suggest that discordance between previous studies in inferring the transmission direction can be explained by differences in key phylogenetic properties that arise due to different evolutionary, epidemiological and sampling processes.

Significance Statement: Identifying the direction of infectious disease transmission between individuals provides unparalleled power to understand infectious disease epidemiology. With epidemiological and clinical information typically unable to distinguish the direction, phylogenetic analysis of pathogen sequence data is an alternative approach. However, when these phylogenetic methods have been implemented, their accuracy has been highly variable, and the reasons for this discordance are unknown. Here we analyse sequence data from over 100 pairs of individuals for whom both the direction of transmission of HIV is known and detailed epidemiological and sampling information is available. We find that easily quantifiable phylogenetic characteristics discriminate whether a phylogenetically inferred transmission direction is correct. Our analysis highlights that phylogenetic approaches are unsuitable for individual-level analysis such as forensic investigations.

Competing Interest Statement: The authors have declared no competing interest.

Funding Statement: CJVA and KEA were funded by an ERC Starting Grant (award number 757688) awarded to KEA.
MH was funded by The HIV Prevention Trials Network (grant number H5R00701.CR00.01) and The Bill and Melinda Gates Foundation (grant number OPP1175094). JACB was supported by the MRC Precision Medicine Doctoral Training Programme (ref: 2259239). KAL was supported by The Wellcome Trust and The Royal Society grant no. 107652/Z/15/Z. JB received support from the UK MRC and the UK DFID (#MR/R010161/1) under the MRC/DFID Concordat agreement and as part of the EDCTP2 Programme supported by the European Union.

Author Declarations:
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below: Ethics approval was not required for this study; this study uses publicly available genetic and epidemiological data that was generated in previous studies and that was collated and described in 10.1126/science.aba5443
All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived. Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable. Yes

This study uses publicly available genetic and epidemiological data that was generated in previous studies and that was collated and described in 10.1126/science.aba5443.
The data can be retrieved from the Los Alamos HIV and GenBank databases.