ABSTRACT
The use of parasite genetic data by National Malaria Control Programmes (NMCPs) is currently limited, and typically focused on specific genetic features or a small number of study sites. We have developed GenRe-Mekong, a platform for genetic surveillance of malaria in the Greater Mekong Subregion (GMS). By integrating simple sample collection procedures in the routine operations of public health facilities, GenRe-Mekong enables NMCPs to conduct large-scale surveillance projects in endemic regions. Samples are processed by the SpotMalaria platform, which uses high-throughput technologies to produce a broad set of genotypes, including most known drug resistance markers, species markers and a genomic barcode. Through the application of heuristics based on published evidence, GenRe-Mekong delivers Genetic Report Cards, a compendium of genotypes and phenotype predictions that are used to map prevalence of resistance to multiple drugs. To date, GenRe-Mekong has worked with NMCPs in five countries, and with several large-scale research projects, processing 9,623 samples from clinical cases. The monitoring of resistance markers has been especially valuable for NMCPs tracking the recent rapid spread of DHA-piperaquine resistant parasites across the region. In Vietnam and Laos, data from GenRe-Mekong have provided novel knowledge about the spread of these resistant strains in provinces previously thought to be unaffected. GenRe-Mekong facilitates data sharing by aggregating results from different countries at regional level, providing cross-border views of the spread of resistant strains.
INTRODUCTION
In low-income countries, particularly in sub-Saharan Africa, malaria continues to be a major cause of mortality, and intense efforts are underway to eliminate Plasmodium falciparum parasites, which cause the most severe form of the disease. However, P. falciparum has shown a remarkable ability to develop resistance to antimalarials, rendering therapies ineffective and frustrating control and elimination efforts. This problem is most acutely felt in the Greater Mekong Subregion (GMS), a region of relatively low malaria prevalence and mortality that has repeatedly been the origin of drug resistant strains,1-5 and in neighbouring countries including Bangladesh and India, where resistance could be imported. In the past, drug resistance alleles spread multiple times from the GMS to Africa, rolling back progress against the disease at the cost of many lives.6,7 In view of the recent regional emergence of parasite strains resistant to artemisinin1,8,9 and its partner drug piperaquine,10-13 the elimination of P. falciparum from the GMS has become a global health priority. Elimination from this region presents significant challenges and, to ensure the most effective outcomes, NMCPs have to evaluate multiple changing factors: efficacy of frontline treatments, available alternatives, routes of spread, location of transmission hubs, importation of cases, and so on. In these assessments, NMCPs make extensive use of clinical and epidemiological data, such as those from routine clinical reporting and therapy efficacy studies.
Parasite genetic data are less frequently available, and are typically restricted to single genetic variants,14 or small numbers of sites where quality sample collection protocols could be executed.15 However, the increased affordability of high-throughput sequencing technologies now offers new opportunities for public health to leverage big data to optimize interventions where resources are limited.16 Cost-effective implementation of genomic technologies, aimed at supporting public health decision-making, can make important contributions to malaria elimination.17
Here, we describe GenRe-Mekong, a genetic surveillance project conceived to provide public health experts in the GMS with timely and actionable knowledge, to support their decision-making in malaria elimination efforts. GenRe-Mekong analyzes small dried blood spot samples, which are easy to collect at public health facilities from patients with symptomatic malaria, and uses high-throughput technologies to extract large amounts of parasite genetic information from each sample. The results are captured in Genetic Report Cards (GRCs), datasets regularly delivered to NMCPs to keep them abreast of rapid epidemiological changes in the parasite population. The underlying technological platform is designed for low sample processing costs, promoting large-scale genetic epidemiology surveys with dense geographical coverage and large sample sizes.
To date, GenRe-Mekong has worked with NMCPs in Cambodia, Vietnam, Lao PDR (Laos), Thailand and Bangladesh, and has supported large-scale multisite research and elimination projects across the region.11,18-20 The project has processed 9,623 samples from eight countries, delivering data to the 12 studies that submitted samples. In its initial phase, GenRe-Mekong has focused on applications relevant to the urgent problem of drug resistance. To facilitate integration into NMCP decision-making workflows, our analysis pipelines translate genotypes into predictions of drug resistance phenotypes, and present these as maps which are easily interpreted by public health officials with no prior training in genetics. In Laos and Vietnam, where GenRe-Mekong is implemented in dozens of public health facilities in endemic provinces, results from GenRe-Mekong have been used by NMCPs in assessments of frontline therapy options and resource allocation to combat drug resistance.
GenRe-Mekong protects individual patient privacy, while encouraging aggregation and sharing of standardized data across national borders to answer regional questions about epidemiology, gene flow, and parasite evolution.21 Aggregated data from multiple studies within GenRe-Mekong have powered large-scale genetic and clinical studies of resistance to dihydroartemisinin-piperaquine (DHA-PPQ), revealing a regional cross-border spread of specific strains.11,21 To power such high-resolution genetic epidemiology analyses of population structure and gene flow, GenRe-Mekong conducts whole-genome sequencing of selected high-quality samples, contributing to the open-access MalariaGEN Parasite Observatory (www.malariagen.net/resource/26).22 In this article, we summarize some key results from GenRe-Mekong, highlighting how they are used by public health to improve interventions. The data used in this paper are openly available, together with detailed methods documentation and details of partner studies, at www.malariagen.net/resource/29.
RESULTS
Collaborations, Site selection and Sample Collections
As of August 2019, GenRe-Mekong has partnered with NMCPs in five countries to conduct large-scale genetic surveillance (Vietnam, Laos), smaller-scale pilot projects (Cambodia, Thailand) and epidemiological surveys (Bangladesh). GenRe-Mekong also worked with large-scale research projects investigating drug efficacy and malaria risk, or piloting elimination interventions. A total of 9,623 samples from eight countries have been processed in this period (Supplementary Tables 1 and 2).
The majority of samples (n=6,905, 72%) were collected in GMS countries (Vietnam, Laos, Cambodia, Thailand, Myanmar), but GenRe-Mekong also supported projects submitting samples from Bangladesh, India and DR Congo. The vast majority of processed samples were collected prospectively, under partnership agreements with GenRe-Mekong (n=9,002, 93.5%, Supplementary Figure 1); two research projects submitted retrospective samples collected in the period 2012-2015 (n=621, 6.5%). Approximately 59% of samples (n=5,716) were submitted by NMCP partnerships, whose contribution increased over time as surveillance projects ramped up (43.4% in 2016, vs. 94.6% in 2018, Supplementary Figure 2). Details of the partnerships, the nature of the studies conducted and the number of processed samples are given in Table 1.
For each study we list the NMCP and Research partners involved, the type of study, the geographical region covered and the number of collection sites. In the last two columns, we show the total number of samples submitted, and the number included in the final set of quality-filtered samples used in epidemiology analyses.
Partnerships with NMCPs are often supported through collaborations with local malaria research groups, which provide support in implementing sample collections, and assist in the interpretation of results. To facilitate implementation in public health infrastructures, GenRe-Mekong provides template study protocols and associated documents; standardized kits of collection materials and documentation; and training for field and health centre staff. Study protocols are adapted to harmonize with local practices, and then approved by both a local ethical review board and the Oxford Tropical Research Ethics Committee (OxTREC). Informed consent forms and participant information sheets are translated to the local language(s), and public health facility staff are trained to execute sample collection procedures. Collection sites are mostly district-level or subdistrict-level health facilities, selected by NMCPs to cover the most informative endemic areas, often based on reported prevalence (Figure 1). Research studies and elimination projects included in their study protocol a sample collection procedure compatible with the standard GenRe-Mekong procedure, and sites were selected based on the study’s requirements.
Site markers are coloured by country. One site in Kinshasa (DR Congo) is not shown.
Sample processing and Genotyping
GenRe-Mekong samples consist of dried blood spots (DBSs) on filter paper. DNA extracted from the samples was selectively amplified23 to increase the proportion of parasite DNA and reduce human DNA contamination before genotyping (see Methods). The production of genetic report cards involves genotyping different types of variants: single nucleotide polymorphisms (SNPs), haplotypes, copy number variations and gene domain sequences. These operations were performed by SpotMalaria, the genotyping platform underpinning GenRe-Mekong, whose implementation evolved during the course of the project; details of the methods used in different versions are provided in the Supplementary Materials. In the initial phase, SpotMalaria used a mixture of technologies: capillary sequencing of the kelch13 gene to detect SNPs associated with artemisinin resistance;8,24 and high-throughput mass spectrometry to genotype SNP variants. This was later replaced with an amplicon sequencing process, based on short-read deep sequencing of specific portions of the parasite genome, supporting a high degree of multiplexing (see Methods). A total of 3,473 samples (36%) were processed by the amplicon sequencing platform, which delivered a higher genotyping success rate than the earlier process (94% vs 82% mean success rate for genetic barcode positions).
The vast majority of samples were taken from malaria patients upon admission (92%, n=8,866). The remainder were from recurrent clinical episodes, or collected as part of post-admission time series to study infection dynamics (n=757, 7.9%), and were excluded from epidemiological analyses in order to minimize biases and avoid duplicates. Genotypes at mitochondrial positions provided confirmation of the infecting parasite species: P. falciparum (Pf), P. vivax (Pv), P. knowlesi (Pk), P. malariae (Pm) and P. ovale (Po). All five species were detected in our dataset: non-Pf parasites were found in 8.8% of samples (n=745 out of 8,486 samples for which species could be determined). A proportion of samples (n=414, 4.9%) only tested positive for non-Pf species, possibly due to misdiagnosis or extremely low Pf parasitaemia, and were excluded from epidemiological analyses. Pv was the most commonly detected non-Pf species (317 Pf/Pv mixed infections, and 405 Pv-only infections), followed by Pk (11 Pf/Pk and 6 Pk-only infections), while Pm and Po were detected in three and two samples respectively.
Genetic Barcodes
GenRe-Mekong produces a genetic barcode for each sample to enable analyses of relatedness, diversity, multiplicity of infection and population structure. Genetic barcodes are constructed by concatenating the alleles at 101 SNPs distributed across all nuclear chromosomes (see Methods), chosen on the basis of their geographically widespread variability and their power to recapitulate genetic distance. Genetic barcodes can be used to detect loss of diversity due to demographic effects,25 or to compare parasites from the same patient to distinguish recrudescences from reinfections.26 They can also produce estimates of genetic distance, which can discriminate between populations on the regional scale. For example, a neighbour-joining tree derived from genetic distance estimates from barcodes (Figure 2) clearly separates parasites from the Thai-Myanmar border region from those circulating along the Thai-Cambodian border, consistent with findings from WGS analyses.27 Hence, while genetic barcodes produce lower resolution results than WGS data, they can be used for rapid low-cost detection of candidate imported parasites, to be further analysed using higher-definition approaches. We used genetic barcode results to discard 827 samples (9.8%) that failed to produce barcodes due to low Pf DNA content, yielding a final set of 7,626 Pf samples for epidemiological analyses.
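As a rough illustration of how barcodes can yield genetic distance estimates, the sketch below computes the proportion of differing alleles between two barcodes, skipping positions where either sample has no call. It is not the method used by GenRe-Mekong, which is described in the Methods: the missing-call symbol `X`, the function names, and the treatment of every position as a single allele (heterozygous calls from mixed infections are not handled) are all assumptions made for this example.

```python
from typing import List, Optional

MISSING = "X"  # assumed placeholder for a barcode position with no genotype call


def barcode_distance(a: str, b: str) -> Optional[float]:
    """Proportion of differing alleles among positions called in both barcodes.

    Returns None when the two barcodes share no called position.
    """
    if len(a) != len(b):
        raise ValueError("barcodes must have the same length")
    compared = 0
    mismatches = 0
    for x, y in zip(a, b):
        if x == MISSING or y == MISSING:
            continue  # skip positions missing in either sample
        compared += 1
        if x != y:
            mismatches += 1
    if compared == 0:
        return None
    return mismatches / compared


def distance_matrix(barcodes: List[str]) -> List[List[Optional[float]]]:
    """Symmetric matrix of pairwise barcode distances."""
    n = len(barcodes)
    return [[barcode_distance(barcodes[i], barcodes[j]) for j in range(n)]
            for i in range(n)]
```

A matrix of this kind is the usual input to a neighbour-joining algorithm, which would be supplied by a phylogenetics library rather than written by hand.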
The tree was derived from a distance matrix, computed by comparing the genetic barcodes of samples. The branch length separating each pair of parasites represents the amount of genetic differentiation between them: individuals separated by shorter branches are more similar to each other. Samples from provinces/states of Myanmar, Thailand and Cambodia near to the borders were included. Each circular marker represents a sample, coloured by the province/state of origin.
Survey of drug resistance mutations
GenRe-Mekong produces genotypes covering a broad range of known variants associated with drug resistance (Table 2) to support assessment of the spread and risk of drug resistance. These include single nucleotide polymorphisms (SNPs) in genes kelch13, arps10, ferredoxin, mdr2 (resistance to artemisinin),24,27 crt (chloroquine), mdr1 (multiple drugs), dhfr (pyrimethamine), dhps (sulfadoxine), as well as a marker amplification breakpoint sequence in plasmepsin 2/3 (resistance to piperaquine).28 The interpretation of these genetic markers in phenotypic terms requires extensive knowledge of relevant literature, which is sometimes outside the domain of expertise of public health officers. To bridge this gap, we used genotypes to derive predicted phenotypes based on a set of rules (see Methods) derived from peer-reviewed publications. These rules classify samples as resistant or sensitive to a particular drug or treatment, or undetermined. Several samples had missing genotype calls which were required for phenotype prediction; therefore, we also devised a number of rules for imputation of missing genotypes based on information from linked alleles. These imputation rules (see Methods) are based on an analysis of allele associations using data from over 7,000 samples in the MalariaGEN Pf Community Project,22 and are applied prior to phenotype prediction rules. Phenotypic predictions allow simple estimations of the proportions of resistant parasites at the population level, which can be readily tabulated and mapped for use in public health decision-making. By aggregating sample data at various geographic levels (site, district, province, region, country), GenRe-Mekong delivers to NMCPs maps that capture the current drug resistance landscape, and can be compared to detect changes over time. Most GenRe-Mekong maps use intuitive “traffic light” colour schemes, in which red signifies presence of resistance, and green its absence.
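The three-way classification described above (resistant, sensitive, undetermined) can be sketched as a pair of simple rule functions. This is only an illustration of the idea, not the GenRe-Mekong rule set, which is documented in the Methods. The mutation set below contains just the two kelch13 mutations named in this article, standing in for the full WHO list. Leaving unlisted mutations as undetermined is an assumption of this sketch, and the imputation step is omitted.

```python
from typing import Iterable, Optional

# Illustrative subset only: the two kelch13 mutations named in this article.
# The actual rule relies on the full WHO list, which is not reproduced here.
ART_R_KELCH13 = {"C580Y", "P553L"}


def predict_art(kelch13: Optional[str]) -> str:
    """Predict artemisinin phenotype from a kelch13 call.

    kelch13 is "WT", a mutation name such as "C580Y", or None if missing.
    """
    if kelch13 is None:
        return "undetermined"
    if kelch13 in ART_R_KELCH13:
        return "resistant"
    if kelch13 == "WT":
        return "sensitive"
    # Assumption of this sketch: unlisted mutations are left unclassified.
    return "undetermined"


def predict_ppq(pm23_amplified: Optional[bool]) -> str:
    """Predict piperaquine phenotype from the plasmepsin 2/3 amplification call."""
    if pm23_amplified is None:
        return "undetermined"
    return "resistant" if pm23_amplified else "sensitive"


def resistant_fraction(predictions: Iterable[str]) -> Optional[float]:
    """Proportion of resistant samples among those with a determined phenotype."""
    determined = [p for p in predictions if p != "undetermined"]
    if not determined:
        return None
    return sum(p == "resistant" for p in determined) / len(determined)
```

Keeping "undetermined" as an explicit outcome means that missing genotypes reduce the denominator of a prevalence estimate instead of being counted as sensitive.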
Below, we illustrate some results at regional level for the GMS and nearby countries, which are also summarized in Table 3.
The spread of artemisinin resistance (ART-R) is an urgent concern in the GMS. We estimated ART-R frequencies based on the presence of nonsynonymous mutations in the kelch13 gene, as listed by the World Health Organization.29 The resulting map indicates that ART-R has reached very high levels in the lower Mekong region (Cambodia, northeastern Thailand, southern Laos and Vietnam), nearing fixation in Cambodia and around its borders, with the exception of very few provinces of Laos and the Vietnam coast (Figure 3A). ART-R frequency declines to the west of this region, with no resistant samples in India and Bangladesh detected in this study, providing no evidence of spread beyond the GMS. An analysis of the distribution of kelch13 ART-R alleles (Supplementary Figure 3, Supplementary Table 3) reveals a marked difference between the lower Mekong region, where the kelch13 C580Y mutation is the dominant allele, and the region comprising Myanmar and western Thailand, where a wide variety of non-synonymous kelch13 variants are found, and C580Y is not dominant. This reflects a recent increase of C580Y mutant prevalence in Cambodia and neighbouring regions, resulting from the rapid spread of the KEL1/PLA1 strain of multidrug-resistant parasites.21,30 This hard selective sweep has replaced a variety of ART-R alleles previously present in that region, resulting from multiple soft sweeps;27,31 this process has not occurred along the Thai-Myanmar border, where allele diversity is still very pronounced. The spread of the KEL1/PLA1 strain in the lower Mekong region is further observed when we map the frequency of plasmepsin 2/3 amplifications conferring piperaquine resistance (PPQ-R, Supplementary Figure 4), showing they occur where C580Y mutants are most prevalent.
Mapping the combined presence of C580Y and plasmepsin 2/3 amplification shows that KEL1/PLA1 spread is confined to a well-defined area of the lower Mekong region, and the strain has not made its way into provinces of Laos and Vietnam where ART-R and PPQ-R alleles are circulating independently (Figure 3B). Over time, GenRe-Mekong will continue to track the spread of KEL1/PLA1 across the region.
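The combined-marker mapping can be illustrated with a short sketch that assigns samples to KEL1/PLA1 when they carry both kelch13 C580Y and the plasmepsin 2/3 amplification, then computes the frequency per province. The input layout, the function names and the choice to treat a sample as undetermined when either genotype is missing are assumptions made for this example, not a description of the project's pipeline.

```python
from collections import defaultdict
from typing import Dict, Iterable, Optional, Tuple

Sample = Tuple[str, Optional[str], Optional[bool]]  # (province, kelch13, pm23_amplified)


def kel1_pla1_status(kelch13: Optional[str],
                     pm23_amplified: Optional[bool]) -> Optional[bool]:
    """True if the sample carries both markers, None if either call is missing."""
    if kelch13 is None or pm23_amplified is None:
        return None  # assumption: both genotypes are required for a call
    return kelch13 == "C580Y" and pm23_amplified


def frequency_by_province(samples: Iterable[Sample]) -> Dict[str, float]:
    """KEL1/PLA1 frequency per province, excluding undetermined samples."""
    counts = defaultdict(lambda: [0, 0])  # province -> [KEL1/PLA1, determined]
    for province, kelch13, amplified in samples:
        status = kel1_pla1_status(kelch13, amplified)
        if status is None:
            continue
        counts[province][1] += 1
        if status:
            counts[province][0] += 1
    return {province: k / n for province, (k, n) in counts.items()}
```

The same grouping could be keyed on site, district or country to produce maps at the other geographic levels.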
Marker text and colour indicate the proportion of samples classified as resistant in each province/state/division surveyed. A total of 6,762 samples were included in (A) and 3,395 samples in (B), after excluding samples with undetermined phenotypes. The results are summarized in Table 3.
Resistant populations can revert to sensitive haplotypes after drugs are discontinued, as was the case for chloroquine-resistant parasites in East Africa.32,33 To help detect similar trends in the GMS, GenRe-Mekong reports on markers of resistance to previous frontline antimalarials that have been discontinued because of reduced efficacy. The resulting data show that, decades after the replacement of chloroquine as frontline therapy, the frequency of resistant parasites (CQ-R) remains exceptionally high across the GMS (Supplementary Figure 5). The reasons for such sustained levels of resistance are unclear; the continued use of chloroquine as frontline treatment for P. vivax malaria could be a major contributing factor. Similarly, we found high levels of the dhfr and dhps markers associated with resistance to sulfadoxine-pyrimethamine (SP, Supplementary Figures 6 and 7). It is unclear why resistance to SP is so widespread, several years after discontinuing this therapy in the GMS, although similar results have been seen in Malawi.34 It is interesting that resistance is lowest in India, where SP is still used with artesunate as the frontline ACT.35
Case Study: Vietnam
In Vietnam, sample collections were carried out by two NMCP institutes (IMPEQN and NIMPE), covering approximately 70 sites in seven provinces. Genetic report cards were delivered to public health officials over two malaria seasons, communicating new findings for malaria control. Prior to the surveillance activity, evidence of artemisinin resistance had been found in the provinces of Binh Phuoc, Gia Lai, Dak Nong, Khanh Hoa and Ninh Thuan.36 GenRe-Mekong data confirmed the presence of resistant parasites in these provinces, and showed that the province of Dak Lak also has extremely high levels of ART-R (Supplementary Figure 8A). Furthermore, our data showed that nearly all resistant parasites collected near the border with Cambodia belonged to the KEL1/PLA1 strain, carrying the kelch13 C580Y mutation (Supplementary Figure 9) and plasmepsin 2/3 amplification (Supplementary Figure 8B-C).21,30 C580Y parasites were also found in the coastal provinces of Ninh Thuan, Khanh Hoa and Quang Tri, but they did not carry the plasmepsin 2/3 amplification; it is therefore likely they were introduced by an earlier sweep of ART-R parasites. Several parasites in Khanh Hoa carried the kelch13 P553L mutation, previously associated with an ART-R founder population in Binh Phuoc province,27,37 supporting the hypothesis they belong to an earlier sweep.
Data from consecutive seasons offer a view of the dynamics of drug resistance spread. In the 2018/2019 season, there was a marked increase in the number of cases in the Krong Pa district of Gia Lai province (Figure 4). In 2017/2018, this district accounted for 15% of cases in the three central provinces that border with Cambodia (n=96 of 656); the following season, this increased to 64% (n=341 of 529, p<10^-15). In the same timeframe, KEL1/PLA1 parasites in Krong Pa rose from 65% (n=40 of 62) to 98% (n=298 of 305, p<10^-14). These results suggest that an outbreak occurred in this district in 2018/2019, underpinned by strong selection of KEL1/PLA1 genetic background, because of its ability to survive the frontline ACT DHA-PPQ.
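The season-to-season comparisons above are comparisons of two proportions, and can be checked with a simple 2x2 test. This section does not state which statistical test was used, so the Pearson chi-square test below (one degree of freedom, no continuity correction) is an assumption, and its p-values need not match those computed by the authors. For the second comparison one expected cell count is slightly below five, so an exact test would be a more careful choice in practice.

```python
import math
from typing import Tuple


def chi_square_2x2(k1: int, n1: int, k2: int, n2: int) -> Tuple[float, float]:
    """Pearson chi-square test comparing the proportions k1/n1 and k2/n2.

    Returns (statistic, p_value) for 1 degree of freedom, uncorrected.
    """
    a, b = k1, n1 - k1
    c, d = k2, n2 - k2
    n = n1 + n2
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("a row or column total is zero")
    stat = n * (a * d - b * c) ** 2 / denom
    # Chi-square survival function with 1 d.f.: P(X > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Share of cases from Krong Pa: 96 of 656 (2017/2018) vs 341 of 529 (2018/2019)
cases_stat, cases_p = chi_square_2x2(96, 656, 341, 529)

# KEL1/PLA1 parasites in Krong Pa: 40 of 62 vs 298 of 305
strain_stat, strain_p = chi_square_2x2(40, 62, 298, 305)
```

Under this assumed test, both p-values fall below the thresholds quoted in the text.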
The same geographical area (Gia Lai, Dak Lak and Dak Nong provinces) is shown for two malaria seasons: 2017/2018 (12 months from May 2017, n=523) and 2018/2019 (the following 12 months, n=455). Districts are represented by markers whose size is proportional to the number of samples, and whose colour indicates the frequency of samples carrying both the kelch13 C580Y mutation and the plasmepsin 2/3 amplification, and thus assigned to the KEL1/PLA1 strain. Marker labels show district name, resistant parasite frequency and sample count.
Case Study: Laos
The Lao NMCP implemented genetic surveillance in five provinces of southern Laos, at over 50 public health facilities. Artemisinin resistant parasites were found in all five provinces, at higher frequencies in districts bordering Thailand and Cambodia (Figure 5A). The kelch13 C580Y mutation was found in four of the five provinces, and was the most common ART-R allele (Supplementary Figure 10). However, parasites carrying both C580Y and the plasmepsin 2/3 amplification were restricted to the two southernmost provinces (Champasak and Attapeu, referred to as “Lower Zone”, Figure 5B), and completely absent from Savannakhet and Salavan provinces (“Upper Zone”) where C580Y parasites lack the PPQ-R amplification. In other words, it appears that KEL1/PLA1 parasites, possibly imported from Cambodia or Thailand, have migrated into the Lower Zone but not the Upper Zone, where a different population of ART-R parasites circulates.
Districts in five provinces of southern Laos are represented by markers whose colour and label indicate the frequency of samples classified (A) as artemisinin-resistant and (B) as belonging to the KEL1/PLA1 strain, resistant to both artemisinin and piperaquine. Only districts with more than 10 samples with valid genotypes are shown. In panel (B), a dashed line denotes a hypothetical demarcation line between a Lower Zone, where KEL1/PLA1 has spread, and an Upper Zone, where it is absent and ART-R parasites have a different origin.
Given the very recent aggressive spread of KEL1/PLA1, it is likely that Upper Zone parasites are remnants of a previous ART-R sweep which may also have spread from the south, as suggested by the higher frequency in Salavan province than in Savannakhet. To confirm the presence of distinct ART-R populations, we used genetic barcodes to construct a tree that recapitulates population structure in Laos (Figure 6), which clearly separates Upper Zone and Lower Zone parasites. In this tree, KEL1/PLA1 parasites form a large, tight cluster clearly separated from the kelch13 wild-type samples from the Upper Zone. The Upper Zone C580Y mutants cluster separately from both these groups, and appear more similar to some C580Y mutants from the Lower Zone which are not PPQ-R, corroborating the hypothesis that Upper Zone mutants migrated from the South. It is likely that the northward spread of KEL1/PLA1 has been contained by the use of artemether-lumefantrine in Laos, which diminishes the survival advantage of PPQ-R parasites. However, the spread of KEL1/PLA1 across the Lower Zone, probably displacing previous ART-R strains, suggests that it is a well-adapted artemisinin-resistant strain, highly competitive even in the absence of piperaquine pressure.
The tree was derived from a distance matrix, computed by comparing the genetic barcodes of samples collected in the Lao PDR (n=1,332 with <25% barcode genotypes missing). Each marker represents a parasite sample, coloured by province. The branch length separating each pair of parasites represents the amount of genetic differentiation between them: individuals separated by shorter branches are more similar to each other. Thicker marker borders indicate parasites carrying the kelch13 C580Y mutation, while square markers indicate samples with plasmepsin 2/3 amplification. Orange circular callouts show notable features of this tree. (A) shows a large cluster of parasites from the Lower Zone (Attapeu and Champasak provinces) carrying both C580Y and plasmepsin 2/3 amplification (KEL1/PLA1 strain). (B) indicates that C580Y mutants from the Upper Zone (Savannakhet and Salavan provinces) are genetically distinct from KEL1/PLA1, but also from Upper Zone wild-type parasites.
DISCUSSION
GenRe-Mekong provides a genetic surveillance platform suitable for endemic regions of low- and middle-income countries, which delivers to NMCPs detailed knowledge about the genetic epidemiology of malaria parasites, to support decision-making. Pilot studies have been conducted in all GMS countries, with the Vietnam and Laos NMCPs having implemented GenRe-Mekong on a long-term basis. GenRe-Mekong has multiple features that facilitate NMCP engagement: a sample collection procedure that easily integrates with standard medical facility workflows; standardized protocols and training to support implementation; clear presentation of results, including translation to phenotype predictions, to provide intuitive understanding and rapid communication; and support by our regional analysis team and local partners to deliver and discuss findings. GenRe-Mekong has also worked closely with research projects, contributing to their analyses of the genotyping data and supporting publication of key findings. The genetic data produced were valuable for a wide range of research applications, such as clinical studies of drug efficacy,11 evaluation of elimination interventions,20 and epidemiological investigation of malaria importation.19
Collaborations with public health have rapidly translated into real impact for malaria control, especially where GenRe-Mekong has been implemented over multiple seasons. Genetic surveillance results were used by the Vietnam NMCP and Ministry of Health in reviews of national drug policy, which have led to revisions of frontline therapy for three provinces including Dak Lak, where an early report by GenRe-Mekong in 2018 was the first evidence of ART-R, confirmed by treatment failure data from in vivo therapy efficacy studies (TES) in 2019. In addition, our report of a KEL1/PLA1 outbreak in Gia Lai province has alerted authorities to the need to review the use of DHA-PPQ in that province. In Laos, authorities have been equally responsive, using GenRe-Mekong reports in their review of frontline therapy choices: the Ministry of Health opted against adopting DHA-PPQ based on our evidence of the expansion of KEL1/PLA1 in the Lower Zone of southern Laos. The impact has not been limited to the national level: data shared by surveillance and research projects participating in GenRe-Mekong have powered regional large-scale epidemiological analyses in the GMS and beyond, revealing patterns of spread and evolution of multidrug-resistant malaria.11 By combining results from areas populated by multidrug resistant strains with those from countries where these strains could potentially spread, such as Bangladesh and India, GenRe-Mekong maps support risk assessment and preparedness. GenRe-Mekong will continue to encourage public sharing to increase the value of the genetic data it generates, while respecting patient anonymity and giving recognition to those who contributed to the project.
A major advantage of genetic surveillance, compared to more costly clinical studies, is the potential for dense coverage across all endemic areas, which can identify important spatial heterogeneities across the territory. Our data suggest that a single efficacy study in Savannakhet province could have convinced authorities that DHA-PPQ could be used in Laos, with disastrous effects in the southernmost provinces. Similarly, the adoption of DHA-PPQ in Thailand was based on the drug’s efficacy in the western provinces; unfortunately, this facilitated the spread of KEL1/PLA1 into the northeast of the country. The detailed knowledge generated by GenRe-Mekong may avert such chains of events in the future.
The SpotMalaria genotyping platform is designed for extensibility, and has been expanded twice in the course of the project: to test for the newly discovered marker for the plasmepsin 2/3 amplification,28 and to add new mutations in crt which are associated with higher levels of piperaquine resistance in KEL1/PLA1 parasites.21,38,39 Such improvements will continue as new markers are identified, and new techniques developed. However, there are newer drugs such as pyronaridine, and established drugs such as lumefantrine and amodiaquine, for which clinical drug resistance markers are yet to be identified. GenRe-Mekong will support the identification of new markers in practical ways, by performing WGS on selected surveillance samples, and contributing these data to public repositories to study epidemiological effects, such as reductions in diversity, increases in cases and founder populations,31 and identify genomic regions under selection that may lead to discovering new markers. As the project develops, Genetic Report Cards will be expanded, to address new public health use cases, including those not directly related to drug resistance. For example, genetic barcodes and WGS data can be used to detect imported cases; to distinguish recrudescences from reinfections; and to measure connectedness between sites, and routes of spread.19
In the future, the integration of genetic surveillance data in public health decision-making processes will be a major focus for GenRe-Mekong, which will be addressed in several ways. First, we will make available online platforms for selecting, visualizing and retrieving genetic epidemiology data, which will provide customized views of the data. Second, we will integrate with public health information systems, such as NMCPs’ dashboards, at both national and international level. This includes sharing GenRe-Mekong data through the World Health Organization’s data visualization platform, Malaria Threats Map (http://apps.who.int/malaria/maps/threats/). Third, we will provide training and support to expand in-country expertise, developing local capacity to evaluate drug resistance data and other outputs that GenRe-Mekong will deliver in the future. Finally, we will promote in-country implementations of the SpotMalaria amplicon sequencing platform that underpins the system, to enable faster turnaround times and long-term self-sufficiency. As the adoption cycle continues, we envisage that a growing global network of public health experts will leverage genetic surveillance to maximize the impact of their interventions, and accelerate progress towards malaria elimination.
METHODS
Additional detailed documentation on the methods used in this study is available from the article’s Resource Page, at www.malariagen.net/resource/29.
Sample Collection
GenRe-Mekong samples were collected and contributed by independent studies with different goals, geographical coverage, and sampling strategies. Studies were managed by a local partner, such as an NMCP or a research organization, and often supported by a local technical partner. Most sampling sites were district or subdistrict health centres or provincial hospitals, selected by the local partner according to their public health or research needs. Each site was assigned a code, and its geographical coordinates recorded to support result mapping. GenRe-Mekong uses a common genetic surveillance study protocol covering the entire GMS, which can be locally adapted; this protocol was used for NMCP surveillance projects, after obtaining approval by a relevant local ethics review board and by the Oxford University Tropical Research Ethics Committee (OxTREC). Research studies included, in their own protocols, provisions for sample collection procedures, informed consent, patient privacy protection, and data sharing compatible with those in the GenRe-Mekong protocol, and obtained ethical approval from both a relevant local ethics review board and their relevant institutional research ethics committee.
Samples were collected from patients of all ages diagnosed with P. falciparum malaria (including patients co-infected by other Plasmodium species) confirmed by positive rapid diagnostic test (RDT) or blood smear microscopy. Participation in the study required written informed consent by patient, parent/guardian, or legally authorised representative (plus patient assent wherever required by national regulations), with the exception of Laos, where the Ministry of Health classified GenRe-Mekong as a surveillance activity for national benefit, requiring no additional informed consent. After obtaining consent, and before administering treatment, three 20 μL dried blood spots (DBS) on filter paper were obtained from each patient through finger-prick. GenRe-Mekong supplied study sites with kits containing all necessary materials, including strips of Whatman 31ET CHR filter paper, disposable lancet, 20 μL micropipette, cotton swab, alcohol pad, and plastic bag with silica gel for DBS storage. Scannable barcode stickers with unique identifiers were applied on the filter paper, the sample manifest where the collection date was recorded, and the site records. Samples were identified by means of these anonymous barcodes, and no patient identifying information or clinical data were collected by GenRe-Mekong.
A number of participating studies also collected an optional anonymous questionnaire, to capture location of abode and work, occupation and travel history of the previous two months. These data are intended for in-depth epidemiological studies, such as analyses of the contribution of travel to gene flow.19 Data from these questionnaires were stored in a separate system, and linked to genetic data by means of the tracking barcodes. They were not used in the present work.
Sample Preparation and Genotyping
DBS samples were received and stored either at the Oxford University Clinical Research Unit, Ho Chi Minh City, Vietnam, or at the MORU/WWARN molecular laboratory, Bangkok, Thailand. Samples were registered and tracked in a secure bespoke online database, where location and date of collection were recorded. DNA was extracted from samples using high-throughput robotic equipment (Qiagen QIAsymphony) according to manufacturer’s instructions. Extracted DNA was plated and shipped to the MalariaGEN Laboratory at the Wellcome Sanger Institute (WSI), Hinxton, UK, for genotyping and whole genome sequencing. Parasite DNA was amplified by applying selective whole genome amplification (sWGA) as previously described.23
Genotyping was performed by the SpotMalaria platform, described in the separate document “SpotMalaria platform - Technical Notes and Methods” available from the Resource Page, which includes the complete list of genotyped variants and the details of the genotyping procedures for these variants. Briefly, the first version of SpotMalaria used multiplexed mass spectrometry arrays on the Agena MassArray system for typing most SNPs, and capillary sequencing for the artemisinin resistance domains of the kelch13 gene. This was eventually replaced by an amplicon sequencing method, using Illumina sequencing of specific genome segments amplified by PCR reaction. The two implementations genotype a common set of variants, each iteration extending or improving on previous versions. Amplicon sequencing also offers greater portability, since it can be deployed on smaller sequencers in country-based laboratories.
Genetic Report Cards Generation
For each sample, genotypes were called for each variant analysed by SpotMalaria, and further processed to determine commonly recognized haplotypes associated with drug resistance (e.g. in genes crt, dhfr, dhps). Genetic barcodes were constructed by concatenating 101 SNP alleles. The generated genotypes, combined with sample metadata, were returned in tabular form to those partners who had submitted the samples along with explanatory documentation for the interpretation of the reports.
The genotypes generated were used to classify samples by their predicted resistance to different drugs. The prediction rules were based on the available data and current knowledge of resistance markers, and are detailed in the separate document “Mapping genetic markers to resistance status classification” available from the Resource Page. For each drug, samples were classified as “sensitive”, “resistant”, “undetermined” or “missing” - the latter identifying samples that failed to produce a valid genotype for the classification. Heterozygous samples, i.e. those containing genomes carrying both sensitive and resistant alleles, were classified as undetermined, due to lack of evidence for the drug resistance phenotype of such mixed infections.
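The four-way classification can be illustrated with a minimal sketch. It assumes a deliberately simplified case in which a drug's status is decided by a single marker position; the actual prediction rules combine multiple markers and are those given in the Resource Page document. The function name, the input representation and the example alleles are hypothetical and serve only as illustration.

```python
def classify_sample(observed_alleles, resistant_alleles):
    """Classify one sample for one drug from a single marker (simplified).

    observed_alleles: set of alleles called at the marker in this sample,
        or None/empty if genotyping failed to produce a valid call.
    resistant_alleles: set of alleles assumed to confer resistance
        (hypothetical input; the real rules are marker-specific).
    """
    if not observed_alleles:
        # No valid genotype for this classification.
        return "missing"
    has_resistant = any(a in resistant_alleles for a in observed_alleles)
    has_sensitive = any(a not in resistant_alleles for a in observed_alleles)
    if has_resistant and has_sensitive:
        # Heterozygous call: both allele types present in the infection.
        return "undetermined"
    return "resistant" if has_resistant else "sensitive"
```

Under these assumptions, a sample carrying only a resistance-associated allele is labelled resistant, one carrying both allele types is labelled undetermined, and one with no valid call is labelled missing.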
To minimize the impact of call missingness, we also applied a set of imputation rules that predict missing alleles in the crt, dhfr and dhps genes, based on statistically significant association with alleles at other positions. Associations were tested (using the threshold p < 0.05 by Fisher’s exact test) using over 7,000 samples in the MalariaGEN Pf Community Project Version 6 dataset.22 The rules for imputations were applied before phenotype prediction rules. They are detailed in the separate document “Imputation of genotypes for markers of drug resistance” available from the Resource Page.
Data aggregation and Mapping of Drug Resistance
To determine the frequency of resistant parasites for a drug, we selected samples at the desired level of geographical aggregation (e.g. province/state or district), based on sampling location. After removing samples with missing and undetermined phenotype predictions for the desired drug, we counted the samples predicted to be resistant (nr) and sensitive (ns), giving a total aggregation sample size N=nr+ns. Resistant parasite frequency was then computed as fr=nr/N. Maps of resistance frequency were produced using Tableau Desktop 2020.1.8 (www.tableau.com). To indicate levels of resistance, markers were coloured with a custom green-orange-red palette. Pie chart markers, used to represent allele proportions, were also derived from the same set of N aggregated samples.
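The frequency calculation can be sketched as follows for a single aggregation unit. This is an illustrative sketch rather than the project's pipeline: the function name and the representation of predictions as a list of strings are assumptions.

```python
from collections import Counter


def resistance_frequency(predictions):
    """Compute (nr, N, fr) for one aggregation unit and one drug.

    predictions: list of per-sample phenotype predictions, each one of
        "resistant", "sensitive", "undetermined" or "missing".
    Samples that are undetermined or missing are excluded from N.
    """
    counts = Counter(predictions)
    n_r = counts["resistant"]
    n_s = counts["sensitive"]
    total = n_r + n_s  # N = nr + ns
    if total == 0:
        # No classifiable samples: the frequency is undefined.
        return n_r, total, None
    return n_r, total, n_r / total  # fr = nr / N
```

For example, a province with two resistant, one sensitive, one undetermined and one missing sample gives nr=2, N=3 and fr=2/3.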
Population Structure Analysis
Pairwise genetic distances between parasites were estimated by comparing genetic barcodes. To reduce error due to missingness, we first eliminated samples with more than 50% missing barcode genotypes; then we removed SNPs with missing calls in >20% of the remaining samples; and finally discarded samples with >25% missingness in the remaining SNPs. This produced a dataset of 87-SNP barcodes for 7,490 samples from which genetic distances were estimated. For each sample s, we assigned a within-sample non-reference frequency gs at each position carrying a valid genotype, as follows: gs=0 if the sample carried the reference allele, gs=1 if it carried the alternative allele, gs=0.5 if both alleles were present. The distance between two samples at that position was then estimated by: d = g1(1-g2) + g2(1-g1) where g1 and g2 are the gs values for the two samples. The pairwise distance was estimated as the mean of d across all positions where d could be computed (i.e. where neither of the two samples had a missing call). Neighbour-joining trees (NJTs) were then produced using the nj implementation in the R package ape on R v4.0.2 from square distance matrices.
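The filtering and distance steps described above can be sketched in a few lines. This is a minimal illustration, not the code used in the study: the function names are hypothetical, and barcodes are assumed to be encoded as lists of gs values (0, 1 or 0.5) with None marking a missing call.

```python
def missing_fraction(calls):
    """Fraction of calls that are missing (None)."""
    return sum(c is None for c in calls) / len(calls)


def filter_barcodes(barcodes):
    """Apply the three missingness filters in order.

    barcodes: dict mapping a sample ID to a list of per-SNP values
        (0, 1, 0.5, or None for a missing call).
    Returns the filtered dict and the indices of the retained SNPs.
    """
    # 1. Drop samples with more than 50% missing barcode genotypes.
    kept = {s: b for s, b in barcodes.items() if missing_fraction(b) <= 0.50}
    if not kept:
        return {}, []
    # 2. Drop SNPs with missing calls in more than 20% of remaining samples.
    n_snps = len(next(iter(kept.values())))
    snps = [i for i in range(n_snps)
            if missing_fraction([b[i] for b in kept.values()]) <= 0.20]
    if not snps:
        return {}, []
    # 3. Drop samples with more than 25% missingness in the remaining SNPs.
    reduced = {s: [b[i] for i in snps] for s, b in kept.items()}
    final = {s: b for s, b in reduced.items() if missing_fraction(b) <= 0.25}
    return final, snps


def site_distance(g1, g2):
    """Per-position distance d = g1(1-g2) + g2(1-g1)."""
    return g1 * (1 - g2) + g2 * (1 - g1)


def pairwise_distance(barcode_a, barcode_b):
    """Mean of d over positions where neither sample has a missing call."""
    ds = [site_distance(a, b)
          for a, b in zip(barcode_a, barcode_b)
          if a is not None and b is not None]
    if not ds:
        return None  # no comparable positions
    return sum(ds) / len(ds)
```

Applying pairwise_distance to every pair of filtered samples yields the square distance matrix from which a neighbour-joining tree can be built. Note that under this formula two samples that are both heterozygous at a position contribute d=0.5 at that position, not 0.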
Role of the funding source
The funders had no role in study design, data collection, data analysis, data interpretation, or report writing. The corresponding author had full access to all the data in the study and had final responsibility for the decision to submit for publication.
Data Availability
All genotyping data and metadata are available from the manuscript's resource page.
AUTHOR CONTRIBUTIONS
HHQ, BHo, VV, TND, HR, NTN, MM, RM, RvdP, LVS, RF, FN, MAH, KC, PNN, EAA, SPh, RRM, RL, CHuc, LTD, KTN, TMN, TTH, HN, NZ, AAS, MO, RT, AMT, JL, DL, PJ, SPu, FS, MT, PS, PKB, SB, ARA, AF, OM organized or carried out sample collections. ED, MA, SR, MD conducted laboratory analyses. CGJ, KaR, KiR produced genomic data. CGJ, CM, JSt, JK developed informatics systems. RM, PR, CB, DM, JSi, VS, KiR, AMD, OM designed and coordinated the project. CGJ, NP, ScG, ED, KaR, KiR developed laboratory systems. CGJ, OM drafted the manuscript.
Funding
Bill & Melinda Gates Foundation, Wellcome Trust, UK Medical Research Council, UK Department for International Development, NIAID
ACKNOWLEDGMENTS
The GenRe-Mekong project is funded by the Bill & Melinda Gates Foundation (OPP11188166, OPP1204268). Genotyping and sequencing at Wellcome Sanger Institute and University of Oxford were funded by the Wellcome Trust (098051, 206194, 203141, 090770, 204911, 106698/B/14/Z) and Medical Research Council (G0600718). A proportion of samples were collected with the support of the UK Department for International Development (201900, M006212), and Intramural Research Program of the National Institute of Allergy and Infectious Diseases. We are grateful to all patients and health workers who participated in sample collection. This study used data from the MalariaGEN Pf3k Project and Plasmodium falciparum Community Project. We thank the staff of Wellcome Sanger Institute Sample Logistics, Sequencing, and Informatics facilities for their contribution; in particular, we are grateful to the Wellcome Sanger Institute DNA Pipelines Informatics for supporting the development of the methods used in this work. We thank the many collaborators who contributed to the GenRe-Mekong Project, and especially: Pannapat Masingboon, Narisa Thongmee, Zoë Doran, Salwaluk Panapipat, Ipsita Sinha, Rapeephan Maude, Vilasinee Yuwaree, Tran Minh Nhat, Hoang Hai Phuc, Ro Mah Huan, Nguyen Minh Nhat, Tran Van Don. PR is a staff member of the World Health Organization; PR alone is responsible for the views expressed in this publication and they do not necessarily represent the decisions, policy or views of the World Health Organization.
Footnotes
Minor revision, following the availability of new genotype data for some of the samples analyzed, and the identification of a small number of samples from recurrent infections, previously classified as "Day 0" samples. The changes have no significant effect on the headline findings of the paper.