Skip to main content
medRxiv
  • Home
  • About
  • Submit
  • ALERTS / RSS
Advanced Search

Estimating dengue transmission intensity from serological data: a comparative analysis using mixture and catalytic models

Victoria Cox, Megan O’Driscoll, Natsuko Imai, Ari Prayitno, Sri Rezeki Hadinegoro, Anne-Frieda Taurel, Laurent Coudeville, Ilaria Dorigatti
doi: https://doi.org/10.1101/2021.10.29.21259708
Victoria Cox
1MRC Centre for Global Infectious Disease Analysis, School of Public Health, Imperial College London, London, United Kingdom
  • Find this author on Google Scholar
  • Find this author on PubMed
  • Search for this author on this site
  • For correspondence: v.cox@imperial.ac.uk
Megan O’Driscoll
1MRC Centre for Global Infectious Disease Analysis, School of Public Health, Imperial College London, London, United Kingdom
2Department of Genetics, University of Cambridge, Cambridge, United Kingdom
  • Find this author on Google Scholar
  • Find this author on PubMed
  • Search for this author on this site
Natsuko Imai
1MRC Centre for Global Infectious Disease Analysis, School of Public Health, Imperial College London, London, United Kingdom
  • Find this author on Google Scholar
  • Find this author on PubMed
  • Search for this author on this site
Ari Prayitno
3Department of Child Health, Faculty of Medicine Universitas Indonesia, Jakarta, Indonesia
  • Find this author on Google Scholar
  • Find this author on PubMed
  • Search for this author on this site
Sri Rezeki Hadinegoro
3Department of Child Health, Faculty of Medicine Universitas Indonesia, Jakarta, Indonesia
  • Find this author on Google Scholar
  • Find this author on PubMed
  • Search for this author on this site
Anne-Frieda Taurel
4Sanofi Pasteur, Singapore
  • Find this author on Google Scholar
  • Find this author on PubMed
  • Search for this author on this site
Laurent Coudeville
5Sanofi Pasteur, Lyon, France
  • Find this author on Google Scholar
  • Find this author on PubMed
  • Search for this author on this site
Ilaria Dorigatti
1MRC Centre for Global Infectious Disease Analysis, School of Public Health, Imperial College London, London, United Kingdom
  • Find this author on Google Scholar
  • Find this author on PubMed
  • Search for this author on this site
  • Abstract
  • Full Text
  • Info/History
  • Metrics
  • Supplementary material
  • Data/Code
  • Preview PDF
Loading

Abstract

Background Dengue virus (DENV) infection is a global health concern of increasing magnitude. To target intervention strategies, accurate estimates of the force of infection (FOI) are necessary. Catalytic models have been widely used to estimate DENV FOI and rely on a binary classification of serostatus as seropositive or seronegative, according to pre-defined antibody thresholds. Previous work has demonstrated the use of thresholds can cause serostatus misclassification and biased estimates. In contrast, mixture models do not rely on thresholds and use the full distribution of antibody titres. To date, there has been limited application of mixture models to estimate DENV FOI.

Methods We compare the application of mixture models and time-constant and time-varying catalytic models to simulated data and to serological data collected in Vietnam from 2004 to 2009 (N ≥ 2178) and Indonesia in 2014 (N = 3194).

Results The simulation study showed greater estimate bias from the time-constant and time-varying catalytic models (FOI bias = 1.3% (0.05%, 4.6%) and 2.3% (0.06%, 7.8%), seroprevalence bias = 3.1% (0.25%, 9.4%) and 2.9% (0.26%, 8.7%), respectively) than from the mixture model (FOI bias = 0.41% (95% CI 0.02%, 2.7%), seroprevalence bias = 0.11% (0.01%, 3.6%)). When applied to real data from Vietnam, the mixture model frequently produced higher FOI and seroprevalence estimates than the catalytic models.

Conclusions Our results suggest mixture models represent valid, potentially less biased, alternatives to catalytic models, which could be particularly useful when estimating FOI and seroprevalence in low transmission settings, where serostatus misclassification tends to be higher.

Author summary Characterising the transmission intensity of dengue virus in different geographic areas over time is essential to understand who is at greatest risk of infection, and to inform the implementation of interventions, such as vector control and vaccination. It is therefore important to understand how methodological differences and model choice may influence estimates of transmission intensity. We compared the application of catalytic and mixture models to calculate the force of infection (FOI) of dengue virus from antibody titre data. We observed greater bias in FOI estimates obtained from catalytic models than from mixture models in areas where the transmission intensity was low. In high transmission intensity areas, catalytic and mixture models produced consistent estimates. Our results indicate that in low transmission settings, when antibody titre data are available, mixture models could be preferential to estimate dengue virus FOI.

Introduction

Dengue fever is caused by infection with one or more of four antigenically distinct serotypes of dengue virus (DENV1-4); a Flavivirus carried by Aedes mosquitoes (1,2). DENV infects approximately 105 million people each year (3), primarily in tropical and sub-tropical regions. The geographical range of DENV is increasing (1,4,5) and it is expected that the spread of dengue will be influenced by rising global temperatures and increasing urbanisation (1,6). Intervention measures to date rely essentially on vector control due to the absence of antiviral treatment, challenges in the use of the first licensed dengue vaccine for widespread dengue prevention and control (7), as well as in the use of rapid diagnostic tests for screening (8). The current and expected future burden of dengue on health-systems is therefore high, demonstrating a continuing need for increased understanding of DENV transmission.

Estimating epidemiological parameters such as the force of infection (FOI, the per capita rate at which a susceptible person is infected) and population seroprevalence (the proportion of people in a population exposed to a virus, as determined by the detection of antibodies in the blood) allow us to gain insights into the subsets of populations most at risk of infection and disease (9), to assess the predicted impact of an intervention strategy (10), and to inform public health policy (11,12).

Both the FOI and seroprevalence can be estimated using mathematical models calibrated to serological data measuring IgG antibody levels (also called titres) from blood samples. IgG titres are obtained using Enzyme-Linked Immunosorbent Assays (ELISAs) and are often classified into qualitative, binary test results (seropositive or seronegative) based on the manufacturer’s threshold.

Catalytic models, first proposed in 1934, estimate disease FOI from age-stratified serological or case notification data (13). In these models, large rates of increase in seroprevalence between individuals who are age a versus age a+1 are explained by high age-specific FOI (assuming the FOI is constant in time) or high time-specific FOI experienced by individuals of all ages during the period a to a+1 years ago (14). Catalytic models have been used extensively for measles (15), rubella (16), Hepatitis A (17), Chagas disease (18), and DENV (12,14,19–21). Whilst commonly used, Bollaerts et al. and Vink et al. (22,23) suggest that catalytic models risk generating biased estimates due to data-loss and/or mis-classification. For example, samples with titres greater than the seronegative threshold but lower than the seropositive threshold are classified as ‘equivocal’ and discarded from the analysis. Furthermore, titre levels of seropositive individuals in a given population may be affected by factors including host response, the degree of exposure to the pathogen and infection timing, which could lead to misclassification.

Mixture models are flexible statistical models that can be applied to continuous data from different clusters or populations, called components. Mixture models can therefore be applied to the absolute antibody titre values in serology datasets, rather than to the counts of titres in each of two classes (seropositive/seronegative) as is necessary for catalytic models (22). The components’ distributions and their defining parameters (e.g., the mean titre of each component distribution) are inferred from a fitted mixture model which is used to estimate the FOI and population seroprevalence (22,24). To date, mixture models have been applied to serological data to estimate the seroprevalence of infectious diseases such as parvovirus B19 and rubella in England (25,26), human papillomavirus in the Netherlands (23), measles in Italy (27), and a selection of arboviruses inlcuding DENV in Zambia (28). In addition, mixture models have been used to develop a framework capable of distinguishing between primary and post-primary DENV infections (29) and to estimate the FOI of Varicella-Zoster virus in Belgium (22). However, these methods have not been used to estimate DENV FOI. Here, we present the results of a simulation study and a comparison of the DENV trasmission intensity estimates obtained from mixture and catalytic models applied to age-stratified serological datasets from Vietnam (N ≥ 2178, for years 2004-2009) and Indonesia (N = 3194, for 2014).

Methods

Data

Age-stratified seroprevalence data

DENV IgG data were collected in Long Xuyen, Vietnam, during a prospective epidemiological study that was conducted to assess the suitability of the area for future CYD-TDV vaccine efficacy trials, as described previously (30). Samples were collected from children under 11 years old in 2004 and then from children under 15 years old during September to February in 2004-2005, 2005-2006, 2006-2007, 2007-2008 and 2008-2009 (Datasets A-1 to A-6, Table 1). The titres were measured using in-house ELISA assays (Arbovirus Laboratory of Pasteur Institute, Ho Chi Minh City).

View this table:
  • View inline
  • View popup
  • Download powerpoint
Table 1: Description of the datasets used in the analyses.

Summary statistics including notation, region, the assay used, the year of testing, the age range of the children participating to the study and the sample sizes.

DENV IgG data from 30 urban subdistricts in Indonesia were collected from children under 18 years old as part of a cross-sectional seroprevalence survey in 2014 (31) (Dataset B). The total number of participants was 3,194 (see Supplementary Data for subdistrict information). IgG titres were measured using the commercial Panbio® Dengue IgG Indirect ELISA kit.

Simulated datasets

We simulated 200 antibody titre datasets (Dataset C), with the same age-distribution and sample size as the Indonesian seroprevalence survey data (Dataset B). For each simulation the distributions used for sampling seronegative and seropositive antibody titres were randomly selected from a normal, gamma or Weibull distribution. This gave 9 possible combinations of distribution pairs for seronegative and seropositive antibody titre values per simulation. Normal, gamma and Weibull distributions were chosen based on preliminary work where mixture models applied to our antibody titre datasets most commonly fitted one of these three distributions. Parameter values were randomly drawn from uniform distributions with limits as shown in S2 Table. The serostatus of each individual was drawn from a Bernoulli distribution with probability 1-e-λ a, where a is the age of the individual and λ is the FOI. FOI was assumed to be constant with age and time. Titre values for each individual were subsequently randomly drawn from the respective component distributions. The analysis was conducted in the statistical programming language R (32).

Ethics statement

Ethical approval for the secondary analysis of the age-stratified seroprevalence datasets was granted by the Imperial College Research Ethics Committee (Approval Reference 21IC7066).

Catalytic model

Data preparation

Catalytic models rely on data that are binarily classified as seropositive or seronegative. For Datasets A-1 to A-6, a background/control titre (t) was measured for each assay. An individual titre was classified as seronegative if ≤t and seropositive otherwise. For Dataset B, samples with titres ≤ 9 PanBio units were classified as seronegative and ≥11 as seropositive. Titres >9 and <11 were discarded (28 out of 3,194 samples). For simulated Dataset C, titres were classified as seronegative if they were ≤X and seropositive if they were ≥Y. X and Y are thresholds that were optimised using the ‘true’ simulated serostatuses: the optim function in R, using the Nelder Mead algorithm, was used to calulate the X and Y values per simulated dataset resulting in the fewest titre misclassifications. The mean number of samples discarded (titre value >X and <Y) per simulation was 1.06 out of 3194 (median = 0, interquartile range (IQR) = 1, range = 0 to 19). The mean percentage of titres misclassified across the simulations was 2.4% (median = 0.6%, IQR = 2.0%, range = 0% to 69.4%).

Parameter estimation

We used the same catalytic model developed in Rodriguez-Barraquer et al., 2011 (33). The proportion of seropositive individuals in age group i (π(ai)), was estimated as in Equation 1, where, λ(ai) denotes the FOI, i.e. the per-capita rate of infection experienced by individuals in age group i, and a denotes the age (14). Embedded Image

A binomial log-likelihood was assumed for the FOI (Equation 2), where Ni is the total number of individuals and Xi is the number of seropositive individuals in age group i (19). The maximum likelihood estimate of the FOI parameter is estimated as shown in Equation 2 using the optim function in R. Embedded Image

When we assume the FOI is time-varying, the estimated λ(ai) values were averaged across the age groups (number of age groups = n) to give the mean yearly FOI experienced over the study period (Equation 3). When we assume the FOI is constant with time, λ(ai) equals, λ. Embedded Image

We estimated 95% Confidence Intervals (CI) using a bootstrap method, where the data were sampled with replacement 1000 times. The 95% CI was given by the 2.5% to 97.5% quantiles of the estimates from the bootstrap samples.

Mixture model

Applying the mixture models to the titre distributions

Mixture models were applied to the bimodal distribution of individual antibody titres as described in Bollaerts et al., 2012 and Hens et al., 2012. The model defines the distribution of the log(titres + 1) (z) as a mixture of two distinct distributions: one for susceptible individuals (seronegative, zs) and one for individuals who have been previously infected (seropositive, zI). The two-component mixture model is represented by: Embedded Image where fs and fI represent the probability density function of the seronegative and seropositive components, respectively, and where μ and σ represent the mean and standard deviation of each component. The age-dependent mean log(titre+1) (μ(ai)) is calculated as in Equation 5, which corresponds to an expression for the age-specific seroprevalence (π(ai)) (Equation 6). Embedded Image Embedded Image

The age-dependent FOI, λ(ai), can be derived from the seroprevalence as described in Equation 7 (22) and in turn can be expressed as a function of the underlying antibody titre distribution as shown in Equation 8, where μ’(ai) represents the derivative of the age-specific mean log(titre+1). The age-dependent FOI can be averaged across the age groups to give the total FOI, λ (Equation 3). Embedded Image Embedded Image

The mixdist R package (34) was used to fit the mixture models to the titre data using an Expectation Maximisation (EM) algorithm. The package was adapted to allow fitting of different distributions for the seronegative and seropositive titre components: normal, gamma and Weibull distributions were fitted for each component, giving 9 possible combinations. The model giving the best fit to the titre distribution (determined by minimising the log-likelihood ratio statistic) among the 9 combinations tested was chosen and its estimates of the means (μs and μI) and standard deviations (σS and σI) for the seronegative and seropositive components were extracted (and for Dataset C they were compared, as well as the FOI, λ, against the true simulated parameter values). We explored multiple parameterisations, including fixing the standard deviation of the two mixture components. For the Vietnamese datasets, we optimised the model having constrained the standard deviation of the seropositive component to multiple different values (for Dataset A-4 we set σI equal to all values from 0.02 to 0.08 in steps of 0.01, for the other five Datasets we set σI equal to all values from 0.05, to 0.15 in steps of 0.01). For the Indonesian dataset (Dataset B) the standard deviations of both components were constrained (σS was set equal to 0.10 to 0.15 in steps of 0.001, and σI was set equal to 0.15 to 0.3 in steps of 0.05).

Parameter estimation

The distribution mean, μ(a), was estimated by least-squares regression using a monotonically increasing P spline fitted to the age-stratified log(titre+1) data (22,24,35). Equally spaced cubic polynomial segments (degree = 3) made up the spline. The optimal smoothing parameter (α) and number of segments (knots) were determined using the Bayesian Information Criterion, having explored combinations of α values (set equal to 0.001, 0.01, 0.1, 0.5, 1, 5, 10, 50, 100, 500) and knots (set equal to values in the sequence: 5 to the maximum number of x-axis age categories, step size = 1). The spline was fitted using the mpspline.fit function from the serostat R package (36). The μ’(a) term for the FOI estimation was calculated by taking the gradient of the fitted spline at each age. The 95% CI around the parameter estimates were calculated using a boostrap method, as described above. Code to recreate the simulated datasets and to run the mixture model analysis are available at: https://github.com/Tori-Cox/Mixture-catalytic-models.

Results

Simulated data

The mixture model identified the correct distributions used to simulate both seropositive and seronegative titres in 50% (100/200) of simulations, one of the two distributions in 42.5% (85/200) of simulations and did not correctly identify either distribution in 7.5% (15/200) of the simulations. In all 15 simulations where neither distribution was correctly identified, the seronegative titre values were drawn from the Weibull distribution (S3 Table). The estimated 95% CIs contained the true parameter values used to simulate the data in 87% (174/200), 92% (184/200), 80.5% (161/200) and 90.5% (181/200) of simulations for μs, μI, σS and σI, respectively (S5 Figure).

The mixture model correctly estimated the FOI for 99% (198/200) and the seroprevalence for 97.5% (195/200) of simulations. The time-varying catalytic model correctly estimated the FOI for 98% (196/200) and the seroprevalence for 85% (170/200) of simulations. The time-constant catalytic model correctly estimated the FOI and seroprevalence for 8.5% (17/200) and 16.5% (33/200) of simulations, respectively. The estimates were categorised as correct if the estimated 95% CIs contained the true values. It should be noted that the time-varying catalytic model produced wider CIs compared to when assuming a constant-in-time FOI (Figure 1). Bias was calculated as the estimated parameter values minus the true parameter values. Absolute bias in the FOI estimates (0.41% (95% CI 0.02%, 2.7%), 1.3% (0.05%, 4.6%) and 2.3% (0.06%, 7.8%) for the mixture, time-constant and time-varying catalytic models, respectively) and the seropreavelance estimates (0.11% (0.01%, 3.6%), 3.1% (0.25%, 9.4%) and 2.9% (0.26%, 8.7%)) was smaller for the mixture model estimates (Figure 1).

Figure 1:
  • Download figure
  • Open in new tab
Figure 1: Bias, coverage, and degree of uncertainty for seroprevalence and force of infection (FOI) estimates using catalytic and mixture models fitted to simulated datasets (Dataset C).

Bias is calculated as the estimated parameter value minus the true parameter value. Uncertainty is the width of the 95% Confidence intervals (CIs) around the central estimates, calculated using the bootstrap method. The coverage is the percentage of simulations where the estimated CIs contained the true values. For the bias and the uncertainty, the mean and 95% CI across the 200 simulations are given. For the coverage, the 95% exact binomial CI are given. Note that the y-axis limits differ for each panel.

The catalytic model under both assumptions underestimated the FOI and seroprevalence, particularly at higher values (Figure 1 and Figure 2). High titre misclassification error rates were positively associated with increased parameter estimate bias in the catalytic models (S7 Figure). As expected, parameter estimate bias was reduced when we fit the catalytic models to the simulated ‘true serostatus’ instead of classifying the titres using optimised thresholds: the FOI was then correctly estimated for 99.5% (199/200) and 48% (96/200) of simulations by the time-varying and time-constant FOI catalytic models, respectively.

Figure 2:
  • Download figure
  • Open in new tab
Figure 2: True versus estimated seroprevalence and force of infection (FOI) values from the mixture and catalytic models fitted to the simulated datasets (Dataset C).

The catalytic model was run under the assumption that the FOI was time-constant or time-varying. The 95% Confidence Intervals for the estimated values were calculated using a bootstrap method and are shown here as error bars; the point denotes the central estimate.

Long Xuyen, Vietnam data

When we applied the mixture model (Figure 3) to the data from Long Xuyen, Vietnam, the total population-level seroprevalence estimates ranged from 16.3% (95% CI 13.8%-18.8%) in 2004 to 37.6% (24.9%-40.3%) in 2005-2006. The seroprevalence estimates from the time-constant and time-varying catalytic models were consistent with each other, with the latter ranging from 18.9% (16.3%-21.7%) in 2006-2007 to 29.9% (26.2%-33.7%) in 2008-2009. The seroprevalence estimates from all three models were consistent (as determined by the 95% CIs) for 4 out of 6 datasets (Figure 4, S4 Table). The general trend in the age-specific seroprevalence estimates, specifically for Datasets A-2:A-5, differed significantly between the mixture model and the catalytic models, with the mixture model estimating higher seroprevalence at the older ages (Figure 5).

Figure 3:
  • Download figure
  • Open in new tab
Figure 3: Mixture model fitted to the Vietnamese (A1:A6) and Indonesian (B) datasets.

The distribution of log(titre+1) is shown in dark grey, the fitted mixture model is shown in blue, and the red dashed lines represent the mean antibody titre of each component of the fitted mixture model (μs and μI for the seronegative and seropositive components respectively). Note that the y-axis limits differ for each panel.

Figure 4:
  • Download figure
  • Open in new tab
Figure 4: Force of infection (FOI) and total population level seroprevalence (SP) estimates from the mixture model and the catalytic models fitted to the observed data.

The catalytic model was run under the time-constant and time-varying FOI assumption. The 95% Confidence Intervals (CI) which were calculated by bootstrapping for all models are given as error bars. Note that the y-axis limits differ for each panel.

Figure 5:
  • Download figure
  • Open in new tab
Figure 5: Age-specific seroprevalence estimates for the IgG data from Vietnam (Dataset A1:A6) and from Indonesia (Dataset B) aggregated across all subdistricts.

Mixture model estimates are in blue, catalytic model estimates are in orange and green when applied under the assumption that the FOI is time-constant or time-varying respectively. Shading represents the 95% Confidence Intervals (CI). The grey points show the observed seroprevalence per age group calculated from the binarily classified IgG data (seropositive individuals / tested individuals), with error bars indicating the 95% exact binomial CIs. The seroprevalence data and model estimates are overlayed for the purpose of comparison. However, it is important to note that the mixture model was not fitted to the data (grey points), as the former does not depend on the titre classification. The size of the grey data points represents the number of individuals tested in each age group.

The FOI estimates ranged from 2.6% (95% CI 1.9-3.3%) in 2004 to 9.9% (7.7%-12.4%) in 2004-2005 for the mixture model and 2.4% (0.7%-5.8%) in 2006-2007 to 5.0% (1.2%-11.8%) in 2004-2005 for the time-varying catalytic model. The FOI estimates from the mixture model versus the time-varying catalytic model were consistent for 6 out of 6 datasets, and versus the time-constant catalytic model they were consistent for 3 out of 6 datasets (Figure 4, S4 Table). There is a higher degree of uncertainty around the estimates from the catalytic model when assuming a time-varying FOI compared to the time-constant FOI assumption (Figure 4). We observe greater differences in the estimates from each model when comparing the age-specific FOI as opposed to the averaged total FOI (S9 Figure).

Indonesian data

The mixture and catalytic models fitted to the Indonesian data produced consistent FOI, total population seroprevalence and age-specific seroprevalence estimates. The FOI was estimated at 15.4% (95% CI: 10.6%-21.3%), 14.3% (13.6%-15.0%) and 16.4% (2.2%-81.4%), and the seroprevalence at 71.8% (69.4%-74.1%), 70% (68.6%-71.4%) and 70.0% (65.5%-74.3%) by the mixture model and the time-constant and time-varying catalytic models, respectively (Figure 4, S4 Table).

Discussion

In our simulation study, the mixture model produced less biased estimates of FOI and seroprevalence than the catalytic models (Figure 1). The catalytic models consistently underestimated the parameters, particularly at higher values (Figure 2). Increased bias in the catalytic model estimates was associated with increased serostatus misclassification (S7 Figure). Serostatus misclassification occurred more often in simulations where the mean log(titre + 1) for the susceptible/seronegative component was higher and/or the mean log(titre + 1) for the infected/seropositive component was lower (S6 Figure), indicating greater overlap between the distributions of the two components. It is worth noting that the apparent poor performance of the catalytic model is directly related to our choice of simulated parameter values, and we would expect better performance when using parameters that would have given clearer bimodal titre distributions, where serostatus misclassification would have been lower.

Differences in the degree of overlap between components in real serological datasets (S8 Figure) reflect differences in transmission intensity and variable degrees of spatiotemporal heterogeneities in the risk of infection (3,37). In datasets collected from areas which experience hyperendemic DENV circulation, one expects greater separation between the titre components because most seropositive individuals likely have had multiple infections, translating to higher antibody titres. In these instances, catalytic and mixture models are expected to produce more similar estimates of FOI and seroprevalence as fewer samples are misclassified during the qualitative, binary classification of the data needed to calibrate catalytic models (22,38). This is consistent with the reduced variability we observe in our FOI and seroprevalence estimates from each model when they were applied to serological data from Indonesia compared to Vietnam, where the former had higher separation of titre distributions (Figure 3, Figure 4, S4 Table).

The FOI for Indonesia was estimated at 15.4% (95% CI: 10.6%-21.3%), 14.3% (13.6%-15.0%) and 16.4% (2.2%-81.4%), and the seroprevalence at 71.8% (69.4%-74.1%), 70% (68.6%-71.4%) and 70.0% (65.5%-74.3%) by the mixture model and catalytic model under the time-constant and time-varying FOI assumptions, respectively. The FOI estimates are consistent with previously published results from catalytic models fitted to case-notification data from 2008-2017 in Jakarta, Indonesia (13.0%, 95% CI: 12.9-13.1%) (12). The seroprevalence estimates are consistent with previously published results using time-constant catalytic models applied to the same serology dataset (Dataset B) (31,39). Our results show that the mixture and catalytic models do not significantly differ in their FOI and seroprevalence estimates in this setting.

The mixture model applied to the six datasets from Vietnam produced more variable estimates (FOI range = 2.6%-9.9%, seroprevalence range = 16%.3-37.6%) than the catalytic models (FOI range = 2.3%-3.7% and 2.4%-5.0%, seroprevalence range = 19.0%-30.0% and 18.9%-29.9% under the assumption of a time-constant or time-varying FOI respectively). The estimates from the mixture model tended to exceed those obtained from the catalytic models (Figure 4, S4 Table). Given the negative bias observed for the catalytic models in our simulation study, we expect the higher mixture model estimates to be more accurate for the Vietnamese setting. The higher FOI estimates are closer to previously published estimates of 11.7% (95% CI: 10.8-12.7%) (40) using data collected from Binh Thuan province in 2003, and 14.2% (95% CI: 12.9-18.4%) using data collected from Dong Thap province in 1996-7 (19). Previous work has demonstrated large spatiotemporal heterogeneity of DENV in Vietnam (37).

The variance between our mixture and catalytic model estimates for Vietnam is greatest in the age-specific seroprevalence and FOI estimates (Figure 5, S9 Figure). As expected, the time-varying catalytic model and the mixture model (which implicitly models FOI as time-varying), are better able to capture the age-specific seroprevalence than the time-constant catalytic model. However, the time-varying catalytic model produced less stable estimates (wider CIs) than the time-constant catalytic model or the mixture model (Figure 1 and Figure 2).

A major advantage of the mixture model is the comparative ease with which it can be applied to serology data to estimate transmission intensity without the need to use thresholds to process the data. Furthermore, to generate robust estimates, there are fewer data requirements for mixture models than for catalytic models: in the former, the data are pooled and age is used only to calculate the age-specific mean log(titre + 1) using a spline, meaning that there are no constraints on the number of participants per age category. However, it is important to consider the bias that will be introduced if the mixture distributions fit the titre data poorly (38). Biggs et al. fit mixture distributions to DENV antibody titre data in the Philippines to develop a framework distinguishing between post and primary DENV infection (29) by specifying a three-component mixture for seronegative, seropositive with a primary infection and seropositive with post-primary infections. In future work, it would be of interest to explore the FOI estimates obtained on the Vietnamese datasets with a similar three-component mixture model.

Our results suggest that mixture models represent a good alternative to catalytic models to quantify DENV time-varying FOI and seroprevalence from age-stratified serological data, with potentially less bias and less uncertainty. They may be particularly useful when estimating FOI in low transmission settings and/or when the study population is younger, where there is higher overlap between the component distributions (S8 Figure) so the risk of serostatus misclassification and bias introduction when using cut-off threshold methods is greater (S6 Figure and S7 Figure). We have provided code for both models to encourage further exploration and comparison of the two methods. Critically, further investigation of the use of mixture models depends on the availability of raw antibody titre data. For these reasons, we would encourage current and future seroprevalence studies on DENV as well as other infectious diseases, to publish anonymised individual-level antibody titre data where it is possible to do so.

Data Availability

The serological data from Vietnam and Indonesia used in this work is proprietary and cannot be publicly shared. Qualified researchers may request access to the data provider. Further details on Sanofi's data sharing criteria and process for requesting access can be found at: https://www.clinicalstudydatarequest.com. The simulated data can be recreated using code that we have made available in the GitHub repository.

https://github.com/Tori-Cox/Mixture-catalytic-models

Funding

This work was supported by the MRC Centre for Global Infectious Disease Analysis (reference MR/R015600/1), jointly funded by the UK Medical Research Council (MRC) and the UK Foreign, Commonwealth & Development Office (FCDO), under the MRC/FCDO Concordat agreement and is also part of the EDCTP2 programme supported by the European Union. I.D. acknowledges research funding from a Sir Henry Dale Fellowship funded by the Royal Society and Wellcome Trust [grant 213494/Z/18/Z]. V.C. acknowledges funding from the Wellcome Trust [grant 222375/Z/21/Z]. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Acknowledgements

We would like to acknowledge Sanofi Pasteur for provision of the data used in this research. We acknowledge the participants, their families and the principal investigators involved in the studies.

References

  1. 1.↵
    Brady OJ, Hay SI. The Global Expansion of Dengue: How Aedes aegypti Mosquitoes Enabled the First Pandemic Arbovirus. Annu Rev Entomol. 2020;65(1):1–18.
    OpenUrl
  2. 2.↵
    Simmons CP, Farrar JJ, van Vinh Chau N, Wills B. Dengue. N Engl J Med [Internet]. 2012 Apr 12;366(15):1423–32. Available from: http://journals.sagepub.com/doi/10.1177/1461444810365020
    OpenUrl
  3. 3.↵
    Cattarino L, Rodriguez-Barraquer I, Imai N, Cummings DAT, Ferguson NM. Mapping global variation in dengue transmission intensity. Sci Transl Med. 2020;12(528):1–11.
    OpenUrlCrossRefPubMed
  4. 4.↵
    Gibbons R V. Dengue conundrums. Int J Antimicrob Agents [Internet]. 2010;36(SUPPL. 1):S36–9. Available from: http://dx.doi.org/10.1016/j.ijantimicag.2010.06.019
    OpenUrl
  5. 5.↵
    Fritzell C, Rousset D, Adde A, Kazanji M, Van Kerkhove MD, Flamand C. Current challenges and implications for dengue, chikungunya and Zika seroprevalence studies worldwide: A scoping review. PLoS Negl Trop Dis. 2018;12(7):1–29.
    OpenUrlCrossRefPubMed
  6. 6.↵
    Gubler DJ. Dengue, Urbanization and globalization: The unholy trinity of the 21 st century. Trop Med Health. 2011;39(4 SUPPL.):3–11.
    OpenUrlCrossRefPubMed
  7. 7.↵
    Vannice KS, Durbin A, Hombach J. Status of vaccine research and development of vaccines for dengue. Vaccine [Internet]. 2016;34(26):2934–8. Available from: http://dx.doi.org/10.1016/j.vaccine.2015.12.073
    OpenUrl
  8. 8.↵
    Luo R, Fongwen N, Kelly-Cirino C, Harris E, Wilder-Smith A, Peeling RW. Rapid diagnostic tests for determining dengue serostatus: a systematic review and key informant interviews. Clin Microbiol Infect [Internet]. 2019;25(6):659–66. Available from: https://doi.org/10.1016/j.cmi.2019.01.002
    OpenUrl
  9. 9.↵
    Kucharski AJ, Kama M, Watson CH, Aubry M, Funk S, Henderson AD, et al. Using paired serology and surveillance data to quantify dengue transmission and control during a large outbreak in Fiji. Elife. 2018;7:1–26.
    OpenUrlCrossRefPubMed
  10. 10.↵
    O’Reilly KM, Hendrickx E, Kharisma DD, Wilastonegoro NN, Carrington LB, Elyazar IRF, et al. Estimating the burden of dengue and the impact of release of wMel Wolbachia-infected mosquitoes in Indonesia: A modelling study. BMC Med. 2019;17(1):1–14.
    OpenUrlCrossRefPubMed
  11. 11.↵
    Lauer SA, Sakrejda K, Ray EL, Keegan LT, Bi Q, Suangtho P, et al. Prospective forecasts of annual dengue hemorrhagic fever incidence in Thailand, 2010–2014. Proc Natl Acad Sci U S A. 2018;115(10):E2175–82.
    OpenUrlAbstract/FREE Full Text
  12. 12.↵
    O’Driscoll M, Imai N, Ferguson N, Hadinegoro SR, Satari HI, Tam C, et al. Spatiotemporal Variability in Dengue Transmission Intensity in Jakarta, Indonesia. 2018;1–62.
  13. 13.↵
    Hens N, Aerts M, Faes C, Shkedy Z, Lejeune O, Van Damme P, et al. Seventy-five years of estimating the force of infection from current status data. Epidemiol Infect. 2010;138(6):802–12.
    OpenUrlCrossRefPubMedWeb of Science
  14. 14.↵
    Ferguson NM, Donnelly CA, Anderson RM. Transmission dynamics and epidemiology of dengue: Insights from age-stratified sero-prevalence surveys. Philos Trans R Soc B Biol Sci. 1999;354(1384):757–68.
    OpenUrlCrossRefPubMedWeb of Science
  15. 15.↵
    Grenfell BYBT, Anderson RM. The estimation of age-related rates of infection from case notifications and serological data. 1985;(1985):419–30.
  16. 16.↵
    Nokes BYDJ, Anderson RM. Rubella epidemiology in South East England. 1986;(1986):291–304.
  17. 17.↵
    Schenze D, Dietz K, Frösner G. Antibody against Hepatitis A in Seven European Countries: II. Statistical Analysis of Cross-Sectional surveys. Am J Epidemiol [Internet]. 1979;10(1):70–76. Available from: https://academic.oup.com/aje/article/110/1/70/62356
    OpenUrl
  18. 18.↵
    Delgado S, Neyra RC, Machaca VRQ, Juárez JA, Chu LC, Verastegui MR, et al. A history of Chagas disease transmission, control, and re-emergence in peri-rural La Joya, Peru. PLoS Negl Trop Dis. 2011;5(2).
  19. 19.↵
    Imai N, Dorigatti I, Cauchemez S, Ferguson NM. Estimating Dengue Transmission Intensity from Sero-Prevalence Surveys in Multiple Countries. PLoS Negl Trop Dis [Internet]. 2015;9(4):1–19. Available from: http://dx.doi.org/10.1371/journal.pntd.0003719
    OpenUrl
  20. 20.
    Salje H, Paul KK, Paul R, Rodriguez-Barraquer I, Rahman Z, Alam MS, et al. Nationally-representative serostudy of dengue in Bangladesh allows generalizable disease burden estimates. Elife. 2019;8:1–17.
    OpenUrlCrossRefPubMed
  21. 21.
    Rodriguez-Barraquer I, Salje H, Cummings DA. Opportunities for improved surveillance and control of dengue from age-specific case data. Elife. 2019;8.
  22. 22.↵
    Bollaerts K, Aerts M, Shkedy Z, Faes C, Beutels P, Hens N. of infection directly from antibody titres ER ER. Stat Modelling. 2012;12(5):441–62.
    OpenUrlCrossRefPubMed
  23. 23.↵
    Vink MA, Kassteele J Van De, Wallinga J, Teunis PFM, Bogaards JA. Estimating Seroprevalence of Human Papillomavirus Type 16 Using a Mixture Model with Smoothed Age-dependent Mixing Proportions. 2015;26(1).
  24. 24.↵
    Hens N, Shkedy Z, Aerts M, Faes C, Damme P Van, Beutels P. Modeling Infectious Disease Parameters Based on Serological and Social Contact Data: A Modern Statistical Perspective (Google eBook). 2012;314. Available from: http://books.google.com/books?id=IH08pTAoe6QC&pgis=1
  25. 25.↵
    Gay NJ. Analysis of serological surveys using mixture models: Application to a survey of parvovirus B19. Stat Med. 1996;15(14):1567–73.
    OpenUrlCrossRefPubMedWeb of Science
  26. 26.↵
    Hardelid P, Williams D, Dezateux C, Tookey PA, Peckham CS, Cubitt WD, et al. Analysis of rubella antibody distribution from newborn dried blood spots using finite mixture models. Epidemiol Infect. 2008;136(12):1698–706.
    OpenUrlPubMed
  27. 27.↵
    Rota MC, Massari M, Gabutti G, Guido M, De Donno A, Atti MLC degli. Measles serological survey in the Italian population: Interpretation of results using mixture model. Vaccine. 2008;26(34):4403–9.
    OpenUrlCrossRefPubMedWeb of Science
  28. 28.↵
    Chisenga CC, Bosomprah S, Musukuma K, Mubanga C, Chilyabanyama ON, Velu RM, et al. Sero-prevalence of arthropod-borne viral infections among Lukanga swamp residents in Zambia. PLoS One [Internet]. 2020;15(7):1–13. Available from: http://dx.doi.org/10.1371/journal.pone.0235322
    OpenUrl
  29. 29.↵
    Biggs JR, Sy AK, Brady OJ, Kucharski AJ, Funk S, Reyes MAJ, et al. A serological framework to investigate acute primary and post-primary dengue cases reporting across the Philippines. BMC Med. 2020;18(1):1–14.
    OpenUrlCrossRefPubMed
  30. 30.↵
    Tien NTK, Luxemburger C, Toan NT, Pollissard-Gadroy L, Huong VTQ, Van Be P, et al. A prospective cohort study of dengue infection in schoolchildren in Long Xuyen, Viet Nam. Trans R Soc Trop Med Hyg [Internet]. 2010;104(9):592–600. Available from: http://dx.doi.org/10.1016/j.trstmh.2010.06.003
    OpenUrl
  31. 31.↵
    Prayitno A, Taurel AF, Nealon J, Satari HI, Karyanti MR, Sekartini R, et al. Dengue seroprevalence and force of primary infection in a representative population of urban dwelling Indonesian children. PLoS Negl Trop Dis. 2017;11(6):1–16.
    OpenUrlCrossRef
  32. 32.↵
    R Core Team. R: A language and environment for statistical computing. [Internet]. 2019. Available from: https://www.r-project.org/
  33. 33.↵
    Rodriguez-Barraquer I, Cordeiro MT, Braga C, de Souza W V., Marques ET, Cummings DAT. From re-emergence to hyperendemicity: The natural history of the dengue epidemic in Brazil. PLoS Negl Trop Dis. 2011;5(1):1–7.
    OpenUrlCrossRef
  34. 34.↵
    Macdonald P, Du J. mixdist: Finite Mixture Distribution [Internet]. 2018. Available from: https://cran.r-project.org/package=mixdist, package version 0.5-5.%0A
  35. 35.↵
    Eilers PHC, Marx BD, Ellers PHC. Linked references are available on JSTOR for this article□: Flexible Smoothing with B-splines and Penalties. Stat Sci. 1996;11(2):89–102.
    OpenUrlCrossRefWeb of Science
  36. 36.↵
    Kovac T. serostat: Modeling Infectious Disease Parameters Based on Serological and Social Contact. 2018.
  37. 37.↵
    Rabaa MA, Simmons CP, Fox A, Le MQ, Nguyen TTT, L. HY, et al. Dengue Virus in Sub-tropical Northern and Central Viet Nam: Population Immunity and Climate Shape Patterns of Viral Invasion and Maintenance. PLoS Negl Trop Dis. 2013;7(12).
  38. 38.↵
    Kafatos G, Andrews NJ, Conway KJMC, Maple PAC. Is it appropriate to use fi xed assay cut-offs for estimating seroprevalence□? 2016;(2016):887–95.
  39. 39.↵
    Tam CC, O’Driscoll M, Taurel AF, Nealon J, Hadinegoro SR. Geographic variation in dengue seroprevalence and force of infection in the urban paediatric population of Indonesia. PLoS Negl Trop Dis. 2018;12(11):1–12.
    OpenUrlCrossRefPubMed
  40. 40.↵
    Thai KTD, Binh TQ, Giao PT, Phuong HL, Hung LQ, Van Nam N, et al. Seroprevalence of dengue antibodies, annual incidence and risk factors among children in southern Vietnam. Trop Med Int Heal. 2005;10(4):379–86.
    OpenUrl
Back to top
PreviousNext
Posted October 30, 2021.
Download PDF

Supplementary Material

Data/Code
Email

Thank you for your interest in spreading the word about medRxiv.

NOTE: Your email address is requested solely to identify you as the sender of this article.

Enter multiple addresses on separate lines or separate them with commas.
Estimating dengue transmission intensity from serological data: a comparative analysis using mixture and catalytic models
(Your Name) has forwarded a page to you from medRxiv
(Your Name) thought you would like to see this page from the medRxiv website.
CAPTCHA
This question is for testing whether or not you are a human visitor and to prevent automated spam submissions.
Share
Estimating dengue transmission intensity from serological data: a comparative analysis using mixture and catalytic models
Victoria Cox, Megan O’Driscoll, Natsuko Imai, Ari Prayitno, Sri Rezeki Hadinegoro, Anne-Frieda Taurel, Laurent Coudeville, Ilaria Dorigatti
medRxiv 2021.10.29.21259708; doi: https://doi.org/10.1101/2021.10.29.21259708
Digg logo Reddit logo Twitter logo Facebook logo Google logo LinkedIn logo Mendeley logo
Citation Tools
Estimating dengue transmission intensity from serological data: a comparative analysis using mixture and catalytic models
Victoria Cox, Megan O’Driscoll, Natsuko Imai, Ari Prayitno, Sri Rezeki Hadinegoro, Anne-Frieda Taurel, Laurent Coudeville, Ilaria Dorigatti
medRxiv 2021.10.29.21259708; doi: https://doi.org/10.1101/2021.10.29.21259708

Citation Manager Formats

  • BibTeX
  • Bookends
  • EasyBib
  • EndNote (tagged)
  • EndNote 8 (xml)
  • Medlars
  • Mendeley
  • Papers
  • RefWorks Tagged
  • Ref Manager
  • RIS
  • Zotero
  • Tweet Widget
  • Facebook Like
  • Google Plus One

Subject Area

  • Epidemiology
Subject Areas
All Articles
  • Addiction Medicine (160)
  • Allergy and Immunology (412)
  • Anesthesia (90)
  • Cardiovascular Medicine (855)
  • Dentistry and Oral Medicine (156)
  • Dermatology (97)
  • Emergency Medicine (247)
  • Endocrinology (including Diabetes Mellitus and Metabolic Disease) (392)
  • Epidemiology (8534)
  • Forensic Medicine (4)
  • Gastroenterology (381)
  • Genetic and Genomic Medicine (1739)
  • Geriatric Medicine (167)
  • Health Economics (370)
  • Health Informatics (1234)
  • Health Policy (618)
  • Health Systems and Quality Improvement (467)
  • Hematology (196)
  • HIV/AIDS (369)
  • Infectious Diseases (except HIV/AIDS) (10271)
  • Intensive Care and Critical Care Medicine (552)
  • Medical Education (192)
  • Medical Ethics (51)
  • Nephrology (210)
  • Neurology (1666)
  • Nursing (97)
  • Nutrition (247)
  • Obstetrics and Gynecology (325)
  • Occupational and Environmental Health (450)
  • Oncology (925)
  • Ophthalmology (262)
  • Orthopedics (100)
  • Otolaryngology (172)
  • Pain Medicine (110)
  • Palliative Medicine (40)
  • Pathology (249)
  • Pediatrics (534)
  • Pharmacology and Therapeutics (246)
  • Primary Care Research (205)
  • Psychiatry and Clinical Psychology (1757)
  • Public and Global Health (3826)
  • Radiology and Imaging (622)
  • Rehabilitation Medicine and Physical Therapy (317)
  • Respiratory Medicine (518)
  • Rheumatology (207)
  • Sexual and Reproductive Health (164)
  • Sports Medicine (156)
  • Surgery (190)
  • Toxicology (36)
  • Transplantation (100)
  • Urology (74)