Abstract
Background We previously assessed the effect of an onsite sanitation intervention in informal neighborhoods of urban Maputo, Mozambique on enteric pathogen detection in children after two years of follow-up (Maputo Sanitation [MapSan] trial, clinicaltrials.gov: NCT02362932). We found significant reductions in Shigella and Trichuris prevalence but only among children born after the intervention was delivered. In this study, we assess the health impacts of the sanitation intervention after five years among children born into study households post-intervention.
Methods We are conducting a cross-sectional household study of enteric pathogen detection in child stool and the environment in the 16 MapSan study neighborhoods at compounds (household clusters sharing sanitation and outdoor living space) that received the pour-flush toilet and septic tank intervention at least five years prior or meet the original criteria for trial control sites. We are enrolling at least 400 children (ages 29 days – 60 months) in each treatment arm. Our primary outcome is the prevalence of 22 bacterial, protozoan, and soil transmitted helminth (STH) enteric pathogens in child stool using the pooled odds ratio across the outcome set to assess the intervention effect. Secondary outcomes include the individual pathogen detection prevalence and gene copy density of 27 enteric pathogens (including viruses); mean height-for-age (HAZ), weight-for-age (WAZ), and weight-for-height (WHZ) z-scores; prevalence of stunting, underweight, and wasting; and the 7-day period-prevalence of caregiver-reported diarrhea. All analyses are adjusted for pre-specified covariates and examined for effect measure modification by age. Environmental samples from study households and the public domain are assessed for pathogens and fecal indicators to explore environmental exposures and monitor disease transmission.
Discussion We are using stool-based detection of multiple enteric pathogens as an objective measure of exposure to evaluate an existing sanitation intervention. We propose a novel pooled estimate of the treatment effect across a pre-specified outcome set to summarize the overall impact of the intervention on enteric pathogen exposure as the primary trial outcome.
Trial Registration This trial was prospectively registered on 16 March 2022 (ISRCTN86084138).
Background
There is a high burden of childhood enteric infections associated with poor environmental conditions, with multiple enteric pathogens frequently detected in stool within the first year of life [1–3]. This is also the period with the highest incidence of diarrhea, which remains a leading cause of child mortality in low- and middle-income countries (LMIC) and has long been associated with stunted growth [4–6]. However, many diarrheal episodes are not attributable to infectious etiologies [3, 7], while the etiology of attributable diarrhea varies widely by setting and only certain diarrheal pathogens are consistently implicated in reduced linear growth [2, 8, 9]. Far more prevalent is asymptomatic enteric pathogen shedding in stool [3, 7, 10–14], which may be more strongly associated with poor growth than diarrheal illness [2, 15–17], potentially by contributing to intestinal inflammation, gut permeability, and nutrient malabsorption in a condition known as environmental enteric dysfunction (EED) [18–21]. In addition to the diverse negative impacts of stunting [22], specific adverse health outcomes associated with enteric infection and EED include delayed cognitive development [23–25] and reduced oral vaccine efficacy [26, 27].
Water, sanitation, and hygiene (WASH) interventions aim to prevent diarrhea and improve child growth by interrupting fecal-oral pathogen transmission [28]. However, recent rigorous WASH intervention trials have demonstrated inconsistent and often limited impacts on child health [29–34, 13]. Methodological limitations of such distal outcome measures notwithstanding [35, 36], these findings are consistent with the interventions insufficiently interrupting environmental transmission and exposure to enteric pathogens [37]. Combined WASH interventions with high fidelity and adherence reduced stool-based detection of Giardia, hookworm, and possibly Ascaris among 30 month-old children (but not younger children) and enteric viruses in rural Bangladesh [10, 38, 39], Ascaris in rural Kenya [40], and the number of co-detected parasites in rural Zimbabwe [11], but not bacterial pathogens or those most associated with stunting in any setting [2]. The sanitation-only intervention arm in rural Bangladesh reduced Trichuris prevalence in child stool [38] and may have reduced pathogenic Escherichia coli on child hands in more crowded households [41], while a sanitation intervention in rural Cambodia did not impact any pathogens measured in child stool [13]. Viewed collectively, a clear picture emerges of pervasive childhood polymicrobial exposures that were not meaningfully prevented by low-cost WASH interventions [42–44].
The preceding trials were all conducted in rural settings, but rapid urbanization has led to exceptional growth of densely populated informal settlements that lack basic services and present unique health challenges [45, 46]. We have previously investigated whether an onsite sanitation intervention delivered in low-income neighborhoods of urban Maputo, Mozambique reduced enteric pathogen prevalence in child stool [47]. Although we similarly found no evidence of an effect on combined prevalence of pre-specified enteric pathogens in our primary analyses [34], additional evidence suggests that the sanitation intervention may have reduced exposures to enteric pathogens in the environment. The intervention was delivered with high fidelity and was widely used by intervention households after two years [48]. Exploratory sub-group analyses indicated that 24 months after the intervention, Shigella prevalence was halved (−51%; 95% confidence interval [CI]: −15%, −72%) and Trichuris prevalence reduced by three-quarters (−76%; 95% CI: −40%, −90%) among children born into the study compounds after the intervention was implemented, relative to children born into the control compounds after baseline [34].
Furthermore, overall pathogen prevalence, pathogen counts, E. coli gene copy density, and the individual prevalence of Ascaris and pathogenic E. coli were all significantly reduced in soil at the intervention latrine entrance [49, 50], suggesting the intervention effectively contained human excreta. While animals have been implicated as major sources of pathogen exposure in other settings [51], only companion animals were frequently present at study households and a locally validated indicator of poultry fecal contamination (the most commonly observed non-companion animal type) was rarely detected in household environments [52]. Conversely, indicators of human fecal contamination were widespread [50] and the two pathogens most impacted by the intervention—Shigella and Trichuris—are considered anthroponotic [53], suggesting that human excreta was a primary source of enteric pathogens in our dense, urban setting and that human-associated pathogens were impacted by sustained exposure to the intervention.
Enteric pathogen carriage is highly age-dependent [8, 12], but we were previously limited by the timing of our follow-up survey (24-months post-intervention) to children under two years of age when assessing impacts on those born into the intervention. Furthermore, it has been suggested that the typically short (1 – 2 year) follow-up periods often used in WASH evaluations may be inadequate for any longer-term benefits to manifest, potentially contributing to the limited observed effects [42]. Accordingly, we are conducting a cross-sectional follow-up study to better understand the long-term impacts of the sanitation intervention five years after it was implemented on child enteric pathogen exposures and shedding, diarrheal disease, and growth among children born into study households post-intervention.
Study Objectives
Primary objective
To measure the long-term effect of a shared, onsite urban sanitation intervention on the pooled prevalence of pre-specified enteric pathogen targets detected in children’s stool at least 60 months post intervention.
Hypotheses
H1. The risk of stool-based enteric pathogen detection among children 29 days – 60 months old is reduced for children born into households that previously received the sanitation intervention.
H2. Children born into households that previously received the sanitation intervention experience delayed exposure to enteric pathogens relative to comparably aged children from non-intervention households, reflected in a greater reduction in the risk of enteric pathogen detection among younger age groups and attenuated reduction in risk among older children.
Secondary objectives
To measure the effect of a shared, onsite urban sanitation intervention program on the individual prevalence and density of pre-specified enteric pathogen targets detected in children’s stool and environmental samples at least 60 months post intervention.
To measure the effect of a shared, onsite urban sanitation intervention program on child growth and the prevalence of caregiver-reported diarrhea in children at least 60 months post intervention.
Methods
Study setting
This household-based study is being conducted in 11 bairros (neighborhoods) in the Nhlamankulu District and 5 bairros in the KaMaxaquene District of Maputo city, Mozambique. Households in these densely populated, low-income neighborhoods typically cluster into self-defined compounds delineated by a wall or boundary that share outdoor living space and sanitation facilities. Study activities with participating households are conducted primarily in the compound shared outdoor space by staff from the Centro de Investigação e Treino em Saúde da Polana Caniço (Polana Caniço Health Research and Training Center; CISPOC) in Maputo, Mozambique, from which the study is implemented and managed.
Study design
Urban sanitation intervention
The nongovernmental organization (NGO) Water and Sanitation for the Urban Poor (WSUP) implemented a sanitation intervention in 2015-2016 at approximately 300 compounds (groups of households typically delineated by a wall or barrier, that shared sanitation and outdoor living space) in densely populated, low-income, informal urban neighborhoods of Maputo under a larger program led by the Water and Sanitation Programme (WSP) of the World Bank [47]. In these compounds, ranging between three and twenty-five households each, existing unhygienic shared latrines were replaced with pour-flush toilets and a septic tank with a soak-away pit for the liquid effluent. Two intervention designs employing the same sanitation technology were implemented, with approximately 50 communal sanitation blocks (CSBs) and 250 shared latrines (SLs) constructed across 11 neighborhoods that exhibited diversity across density and other key characteristics (susceptibility to flooding, relative poverty, access to water and sanitation infrastructure) [12, 52, 54]. SLs served compounds with fewer than 21 residents and provided a single cabin with the intervention toilet, while CSBs provided an additional cabin for every 20 compound residents and other amenities including rainwater harvesting, municipal water connections and storage, and washing, bathing and laundry facilities [34, 55].
Original controlled before-and-after study
To evaluate the impact of the WSUP sanitation intervention on child health, we conducted the Maputo Sanitation (MapSan) trial, a controlled before-and-after study of enteric pathogen detection in child stool (clinicaltrials.gov: NCT02362932) [47]. WSUP selected intervention compounds using the following criteria: (1) residents shared sanitation in poor condition as determined by an engineer; (2) the compound was located in the pre-defined implementation neighborhoods; (3) there were no fewer than 12 residents; (4) residents were willing to contribute financially to construction costs; (5) sufficient space was available for construction of the new facility; (6) the compound was accessible for transportation of construction materials and tank-emptying activities; (7) the compound had access to a legal piped water supply; and (8) the groundwater level was deep enough for construction of a septic tank. Control compounds were selected according to criteria 1, 3, 4, and 7 from the 11 neighborhoods where the intervention was implemented and 5 additional neighborhoods with comparable characteristics [12].
The MapSan trial recruited an open cohort at three time-points: baseline (pre-intervention), 12 months post-intervention, and 24 months post-intervention; children were eligible to participate if older than 1 month at the time of enrollment and under 48 months at baseline. We enrolled intervention and control compounds concurrently to limit any differential effects of seasonality or other secular trends on the outcomes. We found no evidence that the sanitation intervention reduced the combined prevalence of 12 bacterial and protozoan enteric pathogens (the pre-specified primary outcome), the individual prevalence of any single bacterial, protozoan, STH, or viral pathogen, or the period prevalence of caregiver-reported diarrhea 12 and 24 months post-intervention [34]. However, exploratory analyses indicated the intervention may have been protective against bacterial and soil-transmitted helminth infections among the cohort who were born into the intervention and may have reduced the spread of some pathogens into latrine entrance soils.
Long-term cross-sectional follow-up study
We are revisiting MapSan trial compounds at least five years after the sanitation intervention to conduct a cross-sectional survey of children who were born after the intervention was implemented. Due to substantial population turnover [34], both intervention and control compounds are being identified using previously collected geolocation data and their locations confirmed by study staff during site identification visits. Any compounds identified in the study neighborhoods during the course of these mapping efforts that have received the sanitation intervention (built by WSUP to the same specifications) or match the original enrollment criteria for MapSan control compounds are also invited to participate, irrespective of previous participation in the MapSan trial. We anticipate enrollment to continue for approximately one year and are again enrolling intervention and control compounds concurrently to limit any differential effects of seasonality and other secular trends across the enrollment period.
Eligibility criteria
We attempt to enroll all eligible children in each compound that either previously participated in the MapSan trial, has previously received an intervention latrine, or meets the original enrollment criteria for MapSan control compounds [34]. Participant inclusion criteria include:
Child aged 29 days – 60 months old
Born into and residing in a compound enrolled in the five-year follow-up study; in intervention compounds, the child must have been born following the delivery of the sanitation intervention
Has continuously resided in the study compound for the preceding 6 months, or since birth if under 6 months of age
Has a parent or guardian who is able to understand and complete the written informed consent process and allow their child to participate
Children are excluded if they have any caregiver-indicated medical condition or disability that precludes participation in the study.
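The inclusion and exclusion criteria above can be read as a simple screening rule. The following is a minimal sketch in Python, not the study's screening instrument: all field names are hypothetical, and converting months to days with an average month length is an assumption made only for this illustration.

```python
from dataclasses import dataclass

# Assumption: average month length used to convert months to days.
DAYS_PER_MONTH = 30.4375


@dataclass
class ChildRecord:
    # All field names are hypothetical, chosen only for this illustration.
    age_days: int
    born_in_enrolled_compound: bool
    intervention_compound: bool
    born_after_intervention: bool
    months_resided_continuously: float
    resided_since_birth: bool
    guardian_consented: bool
    precluding_condition: bool


def is_eligible(child: ChildRecord) -> bool:
    """Return True if every inclusion criterion is met and no exclusion applies."""
    # Age window: 29 days to 60 months.
    age_ok = 29 <= child.age_days <= 60 * DAYS_PER_MONTH
    # In intervention compounds the child must be born after the latrine was delivered.
    birth_ok = child.born_in_enrolled_compound and (
        not child.intervention_compound or child.born_after_intervention
    )
    # Six months of continuous residence, or since birth if under six months old.
    under_six_months = child.age_days < 6 * DAYS_PER_MONTH
    residence_ok = child.months_resided_continuously >= 6 or (
        under_six_months and child.resided_since_birth
    )
    return (
        age_ok
        and birth_ok
        and residence_ok
        and child.guardian_consented
        and not child.precluding_condition
    )
```

A record that fails any single check is screened out; for example, a child in an intervention compound who was born before the latrine was delivered is not eligible under this sketch.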
Participant enrollment
Enrollment is conducted by trained study staff in either Portuguese or Changana, according to the respondent’s preference, with written materials provided in Portuguese. We first obtain verbal consent from the compound leader to approach households in the compound for enrollment. We then seek written, informed consent to participate from the parent or guardian of each eligible child. Participation is entirely voluntary; guardians may decline their child’s participation for any reason (and are under no obligation to provide a reason) and can withdraw their children at any point during the study. We began enrolling participants on 28 March 2022.
Household visits and procedures
After initially locating study compounds, the household of a participating child is visited twice, typically on two consecutive days. On the first day, trained study staff conduct written consent procedures; administer compound, household, and child questionnaires; record child anthropometry measures; collect environmental samples (in a sub-set of households); and request the child’s caregiver to retain a sample of the child’s stool. The household is revisited the following day to collect a stool sample from the child and complete environmental sample collection, if necessary. If the stool sample is unavailable on the second day, the study staff coordinate an additional visit with the child’s caregiver to retrieve the stool. In the event that 7 or more days pass since the initial visit without collection of a stool sample, a qualified nurse visits the child to obtain a rectal swab, which is explained during the initial informed consent process along with all other household visit procedures.
After collection of the stool sample, deworming is offered to all household members >1 year old who have not been dewormed in the past year, with the exception of pregnant and breastfeeding women. Deworming consultation and medication provision are conducted by Ministry of Health staff following the national guidelines for deworming procedures. Deworming is offered in-kind to all household members and leverages the household interaction to provide an important public health service. Besides deworming, the study offers no direct benefit to participating children. No incentives are provided to study participants, but we provide 50 meticais (approximately US$1) of mobile phone credit on the caregiver’s preferred network for each child to compensate for the costs incurred in communicating with the study team to arrange household visits and retrieve the child’s stool. Households from which we collect food and/or large-volume water samples are reimbursed an additional 50 meticais to offset these expenses.
Environmental sample collection
We are sampling environmental compartments at a randomly selected subset of 100 intervention and 100 control compounds to represent compound- and household-level exposures [37]. At the entrance to the compound latrine we collect soil, flies, and a large-volume air sample, as well as fecal sludge from the latrine or septic tank and any animal feces observed in the shared outdoor space [49, 50, 56, 57]. One household is selected at random among the households in the compound with children enrolled in the child health study, from which we collect swabs of flooring at the household entrance, flies in the cooking area, the child’s prepared food, stored drinking water, and water from the household’s primary source [52, 58, 59].
We are also collecting environmental samples from the public domain, including wastewater, open surface water, and soils in the vicinity of solid and fecal waste disposal locations, as well as wastewater effluent from hospitals treating COVID-19 patients. These samples are anonymous, are not linked to specific individuals in any way, and are used to conduct environmental surveillance of pathogens circulating in the community [56, 60].
Study outcomes
Enteric pathogen detection in stool
Stool-based molecular detection is performed for 27 enteric pathogens commonly implicated in both symptomatic and asymptomatic childhood infections globally, including those identified at the Global Enteric Multicenter Study (GEMS) site in Manhiça, Mozambique [12, 36, 56, 61]. Reverse-transcription quantitative polymerase chain reaction (RT-qPCR) is conducted by custom TaqMan Array Card (TAC; Thermo Fisher, Carlsbad, CA, USA) to simultaneously quantify genetic targets corresponding to 13 bacterial pathogens (Aeromonas spp.; Campylobacter jejuni/coli; Clostridioides difficile; E. coli O157; enteroaggregative E. coli (EAEC); enteropathogenic E. coli (EPEC); enterotoxigenic E. coli (ETEC); Shiga toxin-producing E. coli (STEC); enteroinvasive E. coli (EIEC)/Shigella spp.; Helicobacter pylori; Plesiomonas shigelloides; Salmonella enterica; Vibrio cholerae), 4 protozoan parasites (Cryptosporidium spp.; Cyclospora cayetanensis; Entamoeba histolytica; Giardia spp.), 5 soil-transmitted helminths (Ascaris lumbricoides; Ancylostoma duodenale; Necator americanus; Strongyloides stercoralis; Trichuris trichiura), and 5 enteric viruses (adenovirus 40/41; astrovirus; norovirus GI/GII; rotavirus; sapovirus) [62]. We also include the respiratory virus SARS-CoV-2 on the custom TAC, which is shed in the stool of many infected individuals, enabling surveillance through fecal waste streams [63, 64].
Our primary outcome is the prevalence of a pre-specified subset of 22 enteric pathogens; as in the original MapSan study, we exclude enteric viruses and SARS-CoV-2 from the primary outcome due to the potential for direct contact transmission, which is unlikely to be impacted by the intervention [34, 47]. Because we anticipate a combined prevalence (detection of at least one of the 22 primary outcome pathogens in a given stool sample) near 100%, and the intervention could plausibly increase the prevalence of some pathogens while reducing others, we do not define a composite prevalence outcome [34, 65]. Rather, we use mixed effects models to analyze all pathogens concurrently, thereby estimating the pooled effect of the intervention on enteric pathogen prevalence across the outcome set [66–68].
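To illustrate why a composite outcome would be uninformative here, consider a simplified case in which detections of the K = 22 primary outcome pathogens are independent, with p_k the prevalence of pathogen k. Both the independence assumption and the notation are introduced only for this illustration; co-detection is likely correlated in practice.

```latex
\Pr(\text{at least one pathogen detected})
  = 1 - \prod_{k=1}^{K} \left(1 - p_k\right)
```

With a hypothetical uniform prevalence of p_k = 0.15 for all 22 pathogens, this gives 1 − 0.85^22 ≈ 0.97, so nearly every stool sample would count as positive in both arms and the composite would have little room to reflect an intervention effect.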
Secondary outcomes include the individual prevalence of all 27 pre-specified enteric pathogens, including Shigella spp. and T. trichiura, the two pathogens most impacted at 24 months among children born after the intervention was delivered [34]. A similar mixed effects model is used to analyze all pathogens concurrently to estimate separate treatment effects on the prevalence of each pathogen while accounting for correlations between outcomes, which alleviates concerns about post-hoc adjustments for multiple comparisons [10, 66]. Additional secondary outcomes include the continuous gene copy density in stool of the 22 primary outcome pathogens, for which a pooled treatment effect on the mean gene copy density is estimated, as well as the individual gene copy densities for all 27 enteric pathogens assessed on the TAC.
Anthropometry
Child weight and recumbent length (child age < 24 months) or standing height (24 – 60 months) are assessed according to standard World Health Organization (WHO) protocols and transformed to age-adjusted z-scores using WHO reference populations to obtain height-for-age (HAZ), weight-for-age (WAZ), and weight-for-height (WHZ) z-scores [69, 70]. Secondary outcomes include continuous HAZ, WAZ, and WHZ, as well as prevalence of binary growth outcomes stunting (HAZ < −2), underweight (WAZ < −2), and wasting (WHZ < −2) [13].
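The binary growth outcomes follow directly from the z-score cutoffs stated above. A minimal sketch in Python, assuming the z-scores have already been computed against the WHO reference populations (the function and key names are hypothetical):

```python
def classify_growth(haz: float, waz: float, whz: float, cutoff: float = -2.0) -> dict:
    """Derive binary growth outcomes from precomputed WHO z-scores.

    A child is classified as having the outcome when the z-score is
    strictly below the cutoff (-2 by default, as in the text).
    """
    return {
        "stunted": haz < cutoff,      # HAZ < -2
        "underweight": waz < cutoff,  # WAZ < -2
        "wasted": whz < cutoff,       # WHZ < -2
    }
```

For example, `classify_growth(-2.5, -1.0, -2.0)` marks the child as stunted only, because the comparison is strict and a z-score of exactly −2 does not meet the definition.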
Caregiver reported illness
Caregiver surveys are administered to ascertain child diarrheal disease, defined as the passage of three or more loose or watery stools in a 24-hour period, or any bloody stool, in the past 7 days [71]. We also assess two caregiver-reported negative control outcomes for each child: the 7-day period-prevalence of bruises, scrapes, or abrasions and the 7-day period-prevalence of toothache [72]. We do not expect the intervention to impact either child bruising or toothache prevalence, so significant differences in these outcomes by treatment arm would suggest possible bias in our caregiver-reported outcomes.
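The diarrhea case definition and the resulting period-prevalence can be expressed compactly. The sketch below is illustrative only; the argument names are hypothetical and are assumed to summarize the caregiver's responses for the 7-day recall window.

```python
def has_diarrhea(max_loose_stools_in_24h: int, any_bloody_stool: bool) -> bool:
    """Apply the case definition to one child's 7-day recall responses.

    max_loose_stools_in_24h: largest number of loose or watery stools
    passed within any 24-hour period during the past 7 days.
    """
    return max_loose_stools_in_24h >= 3 or any_bloody_stool


def period_prevalence(cases: list) -> float:
    """Proportion of children meeting the case definition."""
    if not cases:
        raise ValueError("no children to summarize")
    return sum(bool(c) for c in cases) / len(cases)
```

The same `period_prevalence` helper would apply unchanged to the negative control outcomes, since each is also a binary 7-day recall measure.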
Pathogen detection in environmental matrices
Molecular detection of selected enteric and non-enteric pathogens and fecal source tracking (FST) markers is performed for environmental samples from both the private (compound & household) and public domains using a second custom TAC [49, 56, 73, 74]. A subset of the enteric pathogens assessed in stool are included on the environmental TAC (adenovirus 40/41; astrovirus; norovirus GI/GII; rotavirus; sapovirus; Aeromonas spp.; C. jejuni/coli; C. difficile; E. coli O157; EAEC; EPEC; ETEC; STEC; EIEC/Shigella spp.; H. pylori; S. enterica; V. cholerae; Cryptosporidium spp.; E. histolytica; Giardia spp.; A. lumbricoides; A. duodenale; N. americanus; T. trichiura), as well as select environmental and zoonotic pathogens (Leptospira spp.; Toxocara spp.) [53, 75, 76], other human pathogens detectable in feces (SARS-CoV-2; Zika virus; HIV proviral DNA; Plasmodium spp.; Mycobacterium tuberculosis) [77–80], FST markers (human, poultry, and canine mitochondrial DNA; avian 16S rRNA) [81, 82], and general bacterial and anthropogenic pollution/antimicrobial resistance markers (bacterial 16S rRNA; class 1 integron-integrase gene intI1) [83, 84]. We also culture fecal indicator bacteria (total coliforms and E. coli) using the IDEXX Colilert-18 and Quanti-Tray 2000 system [82, 85]. We conduct ongoing genomic surveillance of SARS-CoV-2 viral lineages in public domain wastewater samples using amplicon-based Illumina next generation sequencing [86].
Statistical analysis
We pre-specified a statistical analysis plan that was deposited in a permanent online repository (https://osf.io/e7pvk/) and linked with the prospective trial registration (ISRCTN86084138) prior to commencing enrollment [68].
Enteric pathogen outcomes
We will use mixed effects models with varying slopes and intercepts to estimate the weighted-average intervention effect across all pathogens included in the model, in essence treating each pathogen as a separate study of the intervention effect on enteric pathogen detection. Additional random effects will be included to account for clustering by compound and child. The prevalence odds ratio (POR) for children living in intervention compounds, relative to children in control compounds, will be estimated as the measure of effect for binary enteric pathogen detection outcomes, including the primary outcome and secondary individual pathogen detection outcomes. Prevalence differences (PD) between children in intervention and control compounds will also be estimated from the posterior predictive distribution at representative values of other model covariates [87]. Mean differences in standard deviation-scaled gene copy density will be estimated as the measure of effect for continuous enteric pathogen outcomes, with non-detects considered true zeros and censoring used to create a zero class (as in Tobit regression) [88].
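One plausible way to write down a model of this kind for the binary detection outcomes is sketched below. The notation and distributional choices are assumptions made for illustration; the pre-specified analysis plan [68] remains the authoritative specification. Here y_{ijk} indicates detection of pathogen k in the stool of child i in compound j, T_j indicates whether compound j received the intervention, and x_i is the vector of adjustment covariates.

```latex
\begin{aligned}
\operatorname{logit} \Pr(y_{ijk} = 1) &= \alpha_k + \beta_k T_j + x_i^{\top}\gamma + u_j + v_i \\
(\alpha_k, \beta_k)^{\top} &\sim \mathcal{N}\!\left((\mu_\alpha, \mu_\beta)^{\top}, \Sigma\right) \\
u_j \sim \mathcal{N}(0, \sigma_u^2), &\qquad v_i \sim \mathcal{N}(0, \sigma_v^2) \\
\mathrm{POR}_{\text{pooled}} = \exp(\mu_\beta), &\qquad \mathrm{POR}_k = \exp(\beta_k)
\end{aligned}
```

Under this sketch, the pathogen-specific slopes β_k are partially pooled toward their mean μ_β, which plays the role of the weighted-average intervention effect, while u_j and v_i capture clustering by compound and by child.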
Effect estimates will be summarized using the mean to represent the expected effect size and the central 95% probability interval to capture the range of effect sizes compatible with the data (the 95% compatibility interval [CI]). Parameters with 95% CIs that exclude the null will be considered significant, although the magnitude and uncertainty of parameter estimates will also be considered holistically in evaluating the evidence for clinically meaningful effects [89].
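The summary rule described above can be illustrated with a short Python sketch that takes posterior draws of a parameter (for example, a log prevalence odds ratio) and returns the mean, the central 95% interval, and whether that interval excludes the null. The function name and the use of the standard library's inclusive quantile method are illustrative assumptions, not the study's analysis code.

```python
import statistics


def summarize_draws(draws, null_value=0.0):
    """Summarize posterior draws by their mean and central 95% interval."""
    # With n=40 the cut points fall at 2.5%, 5%, ..., 97.5%.
    cuts = statistics.quantiles(draws, n=40, method="inclusive")
    lower, upper = cuts[0], cuts[-1]
    return {
        "mean": statistics.fmean(draws),
        "lower": lower,
        "upper": upper,
        # The interval excludes the null when the null lies outside [lower, upper].
        "excludes_null": not (lower <= null_value <= upper),
    }
```

For parameters on the log scale the null value is 0; for ratios reported on the natural scale, `null_value=1.0` would be passed instead.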
Growth and caregiver reported outcomes
The effects of the intervention on mean HAZ, WAZ, and WHZ; the prevalence of stunting, underweight, and wasting; and the period-prevalence of caregiver-reported diarrhea and negative control outcomes (bruising, scrapes, and abrasions; toothache) will be analyzed separately as secondary outcomes using generalized estimating equations (GEE) and robust standard errors with exchangeable correlation structure and clustering by compound (the level at which the sanitation intervention was delivered) [13, 34, 90]. The estimated difference in age-adjusted z-scores by treatment assignment will be used as the measure of effect for continuous anthropometry outcomes. The prevalence ratio (PR) will be estimated by Poisson regression for binary growth status and caregiver-reported outcomes. We will not adjust for multiple comparisons [31, 91].
Covariates and effect measure modification
As a cross-sectional study of an existing intervention, all analyses will be adjusted for a set of covariates selected a priori as potential confounders of the sanitation-enteric pathogen carriage relationship [92]. The adjustment set will include child age and sex, caregiver’s education, and household wealth index [34, 93]. Additional covariates will be considered in exploratory adjusted analyses [10, 11]. The specific enteric pathogens detected are expected to be strongly related to child age [8, 34]. We will examine effect measure modification of the primary and secondary outcomes stratifying by age group (1-11 months, 12-23 months, and 24-60 months) [3, 94].
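The age stratification used for effect measure modification can be sketched as a simple mapping from age to stratum. This is an illustration only: the function name is hypothetical, and placing children younger than one month (eligible from 29 days) in the youngest stratum is an assumption, since the text does not specify how that boundary is handled.

```python
def age_stratum(age_months: float) -> str:
    """Map a child's age in months to one of the three pre-specified strata."""
    if age_months <= 0 or age_months > 60:
        raise ValueError("age outside the study range")
    if age_months < 12:
        # Assumption: children between 29 days and 1 month fall in this stratum.
        return "1-11 months"
    if age_months < 24:
        return "12-23 months"
    return "24-60 months"
```

Stratum-specific effect estimates would then be obtained by fitting the models within each stratum or by including a treatment-by-stratum interaction.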
Independently upgraded controls
We anticipate some of the control compounds may have independently upgraded their sanitation facilities to conditions comparable to the intervention. Control compounds with sanitation facilities observed to possess cleanable, intact hardscape slabs; pour-flush or water-sealed toilets; a functional ventilation pipe; and a fixed superstructure with sturdy walls and a secure door that ensure privacy during use are considered to have independently upgraded to conditions comparable to the intervention [55]. Children living in control compounds with independently upgraded latrines are enrolled but will be excluded from the main analyses of the intervention effects. Two sets of subgroup analyses will instead be conducted that include all participants: one in which children in independently upgraded controls are considered part of the control (non-intervention) arm, and another in which they are considered part of the intervention arm. We will compare parameter estimates from the three sets of analyses (independent upgrades excluded, independent upgrades as controls, and independent upgrades as interventions) to investigate whether the sanitation improvements independently available in the study communities, which may represent more accessible options for achieving greater coverage of high-quality sanitation infrastructure, are comparable to the full sanitation intervention package assessed in the MapSan trial in terms of child health impacts.
Eligible children residing in any compound that has received a WSUP intervention latrine will be considered part of the intervention arm in primary analyses. We are assessing the current conditions of intervention facilities but will not exclude or otherwise adjust for either upgraded or degraded sanitation facilities in intervention compounds in order to evaluate the long-term impacts of the intervention following extended use.
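The arm assignment across the three sets of analyses can be summarized as a small decision rule. The sketch below is illustrative; the function, argument, and scenario names are hypothetical labels for the three analysis sets described above.

```python
from typing import Optional

SCENARIOS = ("excluded", "as_control", "as_intervention")


def analysis_arm(
    received_wsup_latrine: bool, independently_upgraded: bool, scenario: str
) -> Optional[str]:
    """Return the analysis arm for a child, or None if excluded from that analysis."""
    if scenario not in SCENARIOS:
        raise ValueError(f"unknown scenario: {scenario}")
    # Compounds with a WSUP latrine are always analyzed as intervention,
    # regardless of the facility's current condition.
    if received_wsup_latrine:
        return "intervention"
    if not independently_upgraded:
        return "control"
    # Independently upgraded controls: handling depends on the analysis set.
    if scenario == "excluded":
        return None
    return "control" if scenario == "as_control" else "intervention"
```

Only children in independently upgraded control compounds change arm between scenarios, so differences among the three sets of estimates isolate the contribution of those compounds.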
Minimum detectable effect size
The number of participants will be constrained by the number of compounds in the study neighborhoods that have received the sanitation intervention or meet the eligibility requirements for MapSan control compounds, most of which were previously enrolled in the MapSan trial. At the 24 month follow-up, an average of 2.5 children per compound were enrolled from 408 compounds [34]. Compound-level intra-class correlation coefficients (ICCs) were generally less than 0.1 for individual pathogens, corresponding to cluster variances of ∼0.05. We calculate the minimum detectable effect size (MDES) on individual pathogen prevalence with 80% power, 5% significance level, and 0.05 compound cluster variance for a conservative scenario with 200 compounds per treatment arm and 2 children enrolled per compound (for 800 children total, 400 per arm), a moderate scenario of 220 compounds per treatment arm and 2.5 children enrolled per compound (550 children per arm, 1100 total), and a maximal scenario of 300 compounds per arm, 2.5 children per compound (750 children per arm, 1500 total) [95]. Across all scenarios, a minimum baseline (untreated) prevalence of 6-8% is required to reach 80% power for even the largest theoretical effect (nearly 100% reduction). The minimum detectable relative reduction in prevalence decreases (that is, smaller effect sizes are more readily detected) as pathogen prevalence increases towards 100% (Figure 1). The difference between the scenarios on the multiplicative scale is relatively minor, with the relative reduction MDES largely driven by pathogen baseline prevalence. A pathogen with baseline prevalence below 15% must have its prevalence halved (PR < 0.5) in order to attain 80% power, while a 25% reduction is detectable with 80% power for baseline prevalence of 34-46% under the maximal and conservative scenarios, respectively. 
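To give a rough sense of how baseline prevalence and sample size interact, the sketch below reproduces the per-arm sample sizes for the three scenarios and computes an approximate power for a two-arm comparison of proportions. It swaps in a standard design-effect approximation with an assumed intra-class correlation of 0.1 (the upper end of the ICCs reported above) in place of the cluster-variance approach cited in the text [95], so it is not expected to reproduce Figure 1 exactly. All function names are hypothetical.

```python
from statistics import NormalDist

# Scenarios from the text: (compounds per arm, children per compound).
SCENARIOS = {
    "conservative": (200, 2.0),
    "moderate": (220, 2.5),
    "maximal": (300, 2.5),
}


def children_per_arm(compounds: int, children_per_compound: float) -> int:
    """Expected number of children enrolled in each treatment arm."""
    return round(compounds * children_per_compound)


def approx_power(baseline_prev, prevalence_ratio, compounds,
                 children_per_compound, icc=0.1, alpha=0.05):
    """Normal-approximation power for comparing two proportions.

    Assumption: clustering is handled with the design effect
    1 + (m - 1) * ICC, which is NOT the method used in the text.
    baseline_prev must lie strictly between 0 and 1.
    """
    n = compounds * children_per_compound
    deff = 1 + (children_per_compound - 1) * icc
    p0 = baseline_prev
    p1 = baseline_prev * prevalence_ratio
    se = (deff * (p0 * (1 - p0) + p1 * (1 - p1)) / n) ** 0.5
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return NormalDist().cdf(abs(p0 - p1) / se - z_crit)


if __name__ == "__main__":
    for name, (k, m) in SCENARIOS.items():
        print(name, children_per_arm(k, m), round(approx_power(0.4, 0.75, k, m), 2))
```

Under this approximation, power for a fixed prevalence ratio rises with baseline prevalence, which is the qualitative pattern described in the text.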
We expect the simultaneous consideration of multiple pathogens to reduce the MDES for the primary outcome pooled intervention effect by effectively increasing the sample size. Because this pooled effect is dependent on the prevalence of each pathogen considered and the correlations between them, we will conduct simulation analyses to characterize plausible MDES ranges for the pooled primary outcome treatment effect [65, 95].
Figure 1. Range of minimum detectable effect sizes (shaded region) for the percent reduction in pathogen prevalence with 80% power, 5% significance level, and 0.05 cluster variance across three sample size scenarios. The upper edge of the shaded area represents a conservative scenario with 800 total participants (2 per compound, 200 compounds per arm) while the lower edge corresponds to a maximal scenario with 1500 total participants (2.5 per compound, 300 compounds per arm). A moderate scenario with 1100 total participants (2.5 per compound, 220 compounds per arm) is represented by the black curve within the shaded area. The vertical lines show the prevalence of a subset of pathogens assessed in control compound children during the 24-month follow-up in the original MapSan trial. Line color indicates the specific pathogen and line pattern reflects the pathogen class.
Discussion
Defining a primary trial outcome
Caregiver-reported diarrhea has persisted as a primary outcome for health impact evaluations of WASH interventions, no doubt in part because it provides a single, non-invasive, clinically relevant endpoint that can be assessed relatively easily by simple questionnaire. Similar to the fecal indicator paradigm for monitoring fecal pollution, the sheer number of potential pathogens that must be considered to comprehensively assess efforts to interrupt fecal-oral disease transmission demands a simple indicator of intervention impact [82]. Recent developments in molecular diagnostic platforms enabling simultaneous detection of multiple enteric pathogens in stool and environmental matrices have been rapidly adopted [10, 11, 13, 34, 36, 37, 49], and the inherent limitations of symptomatic diarrhea as a trial endpoint are increasingly recognized [14, 35]. In addition to providing objective measures of prior pathogen exposure, these efforts have already offered more granular insight into the behaviors and responses of particular organisms in different settings. For example, the reduction in enteric viruses attributable to a combined WASH intervention in rural Bangladesh aligns with the reduction in diarrhea but lack of impact on linear growth observed in the parent trial [10, 31], as enteric viruses drive diarrhea in this setting but are much less associated with stunting than other pathogens that were not affected by the intervention [2].
However, such granularity complicates the interpretation of the intervention’s overall effect on enteric pathogen exposure, yielding as many as several dozen separate (but not entirely independent) estimates of the intervention effect that can be highly heterogeneous. Although procedures to control the false discovery rate can alleviate concerns over multiple hypothesis testing [10, 11], the power to detect meaningful effects for each target depends on the background prevalence of each pathogen assessed, which can vary widely by setting and over time [3, 7, 8]. Because multiple comparison adjustments trade power to control false positives and studies are likely to be underpowered to detect individual effects on lower-prevalence pathogens, the potential for magnitude (“Type M”) and sign (“Type S”) errors may be uncomfortably high [96]. That is, the estimated effects on individual pathogens that are large enough to clear a statistical significance threshold are likely to be inflated in magnitude relative to the true value (Type M error) and may also appear to act in the opposite direction from the true effect (Type S error) [97, 98]. It is quite possible that seemingly strong protective effects against individual pathogens observed in one realization of a study could dissipate or even reverse direction in a later follow-up or replication in a comparable setting and study population.
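The scale of these errors can be made concrete with a small simulation. The Python sketch below repeatedly draws an effect estimate from a normal sampling distribution centered on an assumed true effect, keeps only the draws that clear the significance threshold, and reports how often those retained estimates have the wrong sign (Type S) and by how much they exaggerate the true magnitude on average (Type M). The true effect, the standard errors, and the function name are hypothetical values chosen for illustration and are not estimates from this study.

```python
import random
from statistics import NormalDist, mean


def type_s_m_errors(true_effect, std_error, alpha=0.05, n_sims=100_000, seed=1):
    """Simulate power, Type S error rate, and Type M exaggeration ratio.

    true_effect: assumed nonzero true effect (e.g., a log prevalence ratio).
    std_error:   assumed standard error of the effect estimate.
    """
    if true_effect == 0:
        raise ValueError("true_effect must be nonzero")
    rng = random.Random(seed)
    threshold = NormalDist().inv_cdf(1 - alpha / 2) * std_error
    # Keep only the simulated estimates that would be declared significant.
    significant = [
        estimate
        for estimate in (rng.gauss(true_effect, std_error) for _ in range(n_sims))
        if abs(estimate) > threshold
    ]
    if not significant:
        raise RuntimeError("no significant draws; increase n_sims")
    return {
        "power": len(significant) / n_sims,
        # Share of significant estimates with the opposite sign to the truth.
        "type_s": mean(1.0 if e * true_effect < 0 else 0.0 for e in significant),
        # Average magnitude of significant estimates relative to the truth.
        "type_m": mean(abs(e) for e in significant) / abs(true_effect),
    }


# Hypothetical low-prevalence (imprecise) and high-prevalence (precise) targets
# sharing the same assumed true log prevalence ratio.
for label, se in (("imprecise", 0.30), ("precise", 0.05)):
    result = type_s_m_errors(true_effect=-0.2, std_error=se)
    print(label, {key: round(value, 3) for key, value in result.items()})
```

Under these assumptions, the imprecisely estimated target yields significant estimates that substantially overstate the true effect, whereas the precisely estimated target does not.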
As an alternative to multiple individual effect estimates, many studies have defined composite outcomes such as the detection of one or more pathogens (possibly of a particular class) [34], the count of co-detected pathogens [11, 49], and the aggregate intensity of select pathogens of interest [10]. These metrics are accessible and relatively easily interpreted, but also assume homogeneity of the effect across the (equally weighted) component outcomes, with implications for statistical power and sensitivity to the selection of pathogens included in the composite metric [65].
The challenge of synthesizing evidence across heterogeneous effect estimates is common throughout the health, social, and environmental sciences, and is typically addressed using meta-analysis, for which a robust and active methodological literature exists [99]. We propose adapting the concept of estimating a pooled effect across multiple separate effect estimates to obtain a summary of an intervention’s impact that incorporates uncertainty and variability in the set of outcomes analyzed [68]. Rather than estimating a pooled effect across multiple studies of similar interventions on the same outcome, we estimate the pooled effect of a single intervention across a set of related outcomes, which may be thought of as the treatment effect on a “generic” enteric pathogen. Just as studies obtaining more precise effect estimates receive greater weight in meta-analyses, the pathogens for which individual effects can be more precisely estimated (such as those with higher background prevalence) contribute more to the pooled estimate. The pooled effect is recovered alongside the individual effects for each pathogen and importantly shares the same scale, allowing this summary metric to be directly interpreted in the context of its components, augmenting rather than replacing the individual estimates. This stands in contrast to composite outcomes, which construct new metrics that are related to, but fundamentally differ from, their component outcomes; the prevalence of any pathogen is conceptually distinct from the prevalence of a particular pathogen, for instance.
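The weighting logic borrowed from meta-analysis can be sketched in a few lines of Python. The example below pools pathogen-specific log odds ratios with a DerSimonian-Laird random-effects estimator, a standard meta-analytic technique used here only as an analogue: it is not necessarily the estimator used for the primary analysis, and it ignores the correlation between pathogens measured in the same children. The pathogen labels, effect estimates, and standard errors are hypothetical.

```python
import math


def pool_log_effects(log_effects, std_errors):
    """DerSimonian-Laird random-effects pooling of log-scale effect estimates.

    Returns the pooled log effect, its standard error, the between-outcome
    variance (tau squared), and each outcome's relative weight.
    """
    k = len(log_effects)
    fixed_weights = [1 / se ** 2 for se in std_errors]
    total = sum(fixed_weights)
    fixed_mean = sum(w * y for w, y in zip(fixed_weights, log_effects)) / total
    # Cochran's Q measures heterogeneity around the fixed-effect mean.
    q = sum(w * (y - fixed_mean) ** 2 for w, y in zip(fixed_weights, log_effects))
    c = total - sum(w ** 2 for w in fixed_weights) / total
    tau_sq = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    weights = [1 / (se ** 2 + tau_sq) for se in std_errors]
    weight_sum = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, log_effects)) / weight_sum
    pooled_se = math.sqrt(1 / weight_sum)
    relative_weights = [w / weight_sum for w in weights]
    return pooled, pooled_se, tau_sq, relative_weights


# Hypothetical pathogen-specific odds ratios and standard errors (log scale).
labels = ["pathogen A", "pathogen B", "pathogen C", "pathogen D"]
odds_ratios = [0.50, 0.80, 1.10, 0.70]
std_errors = [0.40, 0.15, 0.25, 0.60]

pooled, pooled_se, tau_sq, rel_weights = pool_log_effects(
    [math.log(odds) for odds in odds_ratios], std_errors
)
lower = math.exp(pooled - 1.96 * pooled_se)
upper = math.exp(pooled + 1.96 * pooled_se)
print(f"pooled OR {math.exp(pooled):.2f} (95% CI {lower:.2f}, {upper:.2f})")
for label, weight in zip(labels, rel_weights):
    print(f"{label}: relative weight {weight:.2f}")
```

The most precisely estimated outcome receives the largest weight, and the pooled value stays on the same odds ratio scale as its components, which is the property that distinguishes it from a composite outcome.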
Although pooled effect estimates offer several advantages over composite outcomes as a summary of multiple related outcomes, both approaches are sensitive to the selection of the set of outcomes to be analyzed. Including many pathogens with little chance of being affected by the intervention in our analysis will bias the summary effect towards the null, in an inversion of the familiar meta-analysis challenge of publication bias against reporting null results leading to inflated meta-analytic effect estimates [100]. Conversely, we wish to avoid the undue influence (and the limited opportunities for discovery and learning) that would result from defining the outcome set too narrowly, to include only those targets expected to provide a desired result [101]. Striking the appropriate balance between these considerations will likely remain an exercise in scientific judgement, predicated on the careful consideration of subject matter knowledge and validated to the extent possible through sensitivity and bias analyses [102–104].
For our present study, we choose to exclude enteric viruses from the primary outcome set because we believe the possibility of direct contact transmission limits the potential for the sanitation intervention to prevent exposure [105]. A combined WASH intervention reduced several enteric viruses (and no other pathogens) in younger children in rural Bangladesh [10]; because that intervention included hand hygiene and other components that could feasibly prevent direct transmission in addition to sanitation infrastructure, we argue it remains defensible to exclude enteric viruses from the primary analysis of our onsite sanitation intervention. However, we will assess the individual and pooled effects on enteric viruses and all measured enteric pathogens collectively in secondary analyses. Otherwise, we have chosen to be relatively expansive with the outcome set for the primary pooled intervention effect analysis, including all bacteria, protozoa, and STHs that are being measured in child stool. Nevertheless, we have reservations about including a particular parasite, Giardia spp., in the outcome set, as discussed below.
Should Giardia have been excluded as a trial outcome?
Giardia is one of the most commonly detected pathogens in child stool in low- and middle-income countries, including in the original MapSan study cohort and elsewhere in southern Mozambique, and it was included as one of a panel of enteric pathogens in the original MapSan trial [12, 34, 61, 106]. The high prevalence of Giardia, which has been found to increase rapidly with age, demonstrates a failure to prevent exposure [107]. However, persistent shedding of Giardia has also been observed in endemic areas, to an extent that it has been suggested to function as something of a gut commensal in such settings, even potentially protecting against diarrheal illness [108]. Whether arising from persistent infection or rapid re-infection [106, 109], this extended shedding suggests that detection of Giardia may not serve as a meaningful indicator of recent exposure in the context of household sanitation infrastructure. Although household finished flooring was associated with reduced Giardia prevalence in both Bangladesh and Kenya and a combined WASH intervention also reduced Giardia prevalence among older children in Bangladesh, onsite sanitation interventions were not associated with Giardia prevalence in rural Bangladesh or Zimbabwe, nor in urban Mozambique [10, 11, 34, 110]. Recognizing that its unique and insufficiently understood epidemiology may limit interpretation, we will repeat the primary outcome analysis with Giardia excluded from the outcome set, using the remaining 21 non-viral pathogens to estimate the pooled intervention effect on pathogen prevalence.
Limitations
As an observational, cross-sectional evaluation of an existing intervention, this study faces a number of limitations that may impact the validity and generalizability of our findings. By design, all of the study participants will have been born after the intervention was implemented and it will not be possible to account for pre-intervention characteristics. However, the opportunity to evaluate intervention impacts among individuals across a range of ages who have been exposed to the intervention for their entire lives (and potentially also in utero for many, although this is neither an eligibility criterion nor assessed) is sufficiently valuable to warrant the approach.
Relatedly, there is potential for confounding bias in our estimates of the intervention impacts on child health, particularly concerning socioeconomic factors that may be associated with increased pathogen exposures. While non-randomized, the criteria for intervention and control compounds differed only by engineering considerations that we believe to be independent of the outcome. However, in the years since the intervention was implemented, the presence of the intervention itself may have influenced the desirability of the intervention sites and thus the socioeconomic status (SES) of their residents, particularly in light of the high population turnover previously observed after only 1 – 2 years. All analyses will be adjusted for a location-specific wealth index to account for potential differential SES between treatment arms [93], in addition to other pre-specified covariates associated with enteric pathogen exposure that may plausibly be related to treatment status, but the possibility of unmeasured and insufficiently controlled confounding remains.
Because childhood diarrhea is a leading cause of child mortality, selection bias arising from survivor effects may be present in our sample, particularly among older age groups [111]. However, loss to follow-up due to mortality was exceptionally rare in the previous assessment, and far less common than emigration, which was not previously differential by treatment arm [34]. Our use of stool-based enteric pathogen detection as an objective primary outcome mitigates the potential for measurement bias [36] and we are assessing multiple negative control outcomes to account for potential response bias in our secondary caregiver-reported outcomes [72]. As a long-term evaluation of an existing intervention, the potential for exposure measurement error is low; interventions that have degraded substantially may shift results towards the null, but in so doing would represent a valid assessment of intervention sustainability [112]. The potential for exposure misclassification among controls is higher, in that they may have independently upgraded to sanitation infrastructure comparable to the intervention. We are actively monitoring this possibility at all study sites, have pre-specified criteria for identifying sites achieving such conditions, and will conduct sensitivity analyses with these sites excluded, considered as controls, and considered as equivalent to the intervention to characterize the impact of any such potential exposure misclassification among controls.
As discussed previously, a key challenge is identifying an appropriate set of outcomes against which to evaluate the intervention; in this regard, we preferred inclusivity given the pathogens likely to be observed in our study setting, enteric viruses notwithstanding, at the risk of unduly shifting our eventual results towards the null. This challenge persists, regardless of study design, so long as the study aims include assessing the effect of some condition or intervention on exposure to, or infection by, multiple pathogens that share a common route of transmission [103].
Contribution
Although observational, this study is unique in evaluating sanitation intervention effects at least five years after the intervention was implemented and up to five years after participants were born into the intervention conditions. Previous studies have focused on WASH intervention impacts up to 2 – 3 years after delivery [13, 31–34], which may be insufficient time to realize impacts [42]. Furthermore, most previous studies have been conducted in rural settings, while we are investigating the long-term effects of an urban sanitation intervention that is broadly representative of the types of infrastructure improvements likely to be available in the rapidly growing urban informal settlements in coming years. Finally, we use stool-based detection of multiple enteric pathogens as an objective outcome and propose a novel pooled estimate of the treatment effect across a pre-specified outcome set to summarize the overall impact of the intervention on enteric pathogen exposure as the primary trial outcome.
Data Availability
This is the protocol for an ongoing study. Upon publication of study results, the underlying individual participant data will be fully de-identified according to the Safe Harbor method and made freely available in a permanent online repository (https://osf.io/e7pvk/) in accordance with the funder’s open data policies.
Declarations
Ethics approval and consent to participate
This study was approved by Comité Nacional de Bioética para a Saúde, Ministério da Saúde de Moçambique (FWA#: 00003139, IRB00002657, 326/CNBS/21; approved: 15 June 2021) and University of North Carolina at Chapel Hill Ethics Committee (IRB#: 21-1119; approved 19 August 2021).
Consent for publication
Not applicable.
Competing interests
The authors declare that they have no competing interests.
Funding
This study is funded by the Bill & Melinda Gates Foundation (OPP1137224) with additional support from a National Institute of Environmental Health Sciences training grant (T32ES007018). The funders had no role in the study design; data collection, analysis, and interpretation; or decision to publish.
Authors’ contributions
JB, EV, OC, RN, and JK designed the study, secured funding, and provided ongoing supervision. VM and DC drafted the initial study protocol and obtained ethical approvals. DH developed the statistical analysis plan and prepared the prospective trial registration and protocol manuscript. EK, GR, EM, JR, VA, AL, and YL developed and implemented the study procedures, which were overseen by VC, VM, MC, and NI. All authors contributed to and approved the final manuscript.
Acknowledgements
We thank all the participants, their families, and neighbors for graciously welcoming us into their communities, and our implementing partner, Water and Sanitation for the Urban Poor, for their continued support. We also gratefully acknowledge the hard work of the CISPOC survey team, including Jorge Binguane, Noémia Come, Anelsa Dunhe, Alice Fumo, Antônio Johane, Evelin Matos, Eloisa Mula, Mariza Rachid, and Líria Sambo; and of Filipe Fazenda, Claúdia Machume, and Alfredo Muchanga in the CISPOC laboratory.
List of Abbreviations
- CISPOC: Centro de Investigação e Treino em Saúde da Polana Caniço (Polana Caniço Health Research and Training Center)
- CSB: communal sanitation block
- EAEC: enteroaggregative E. coli
- EED: environmental enteric dysfunction
- EIEC: enteroinvasive E. coli
- EMM: effect measure modification
- EPEC: enteropathogenic E. coli
- ETEC: enterotoxigenic E. coli
- FST: fecal source tracking
- GEE: generalized estimating equations
- GEMS: Global Enteric Multicenter Study
- HAZ: height-for-age z-score
- ICC: intra-class correlation coefficient
- INS: Instituto Nacional de Saúde (National Institute of Health, Republic of Mozambique)
- LMIC: low- and middle-income countries
- LSHTM: London School of Hygiene and Tropical Medicine
- MapSan: Maputo Sanitation trial
- MDES: minimum detectable effect size
- NGO: nongovernmental organization
- PD: prevalence difference
- POR: prevalence odds ratio
- PR: prevalence ratio
- RT-qPCR: reverse-transcription quantitative [real-time] polymerase chain reaction
- SES: socioeconomic status
- SL: shared latrine
- STEC: Shiga toxin-producing E. coli
- STH: soil transmitted helminth
- TAC: TaqMan Array Card
- UNC: University of North Carolina at Chapel Hill
- WASH: water, sanitation, and hygiene
- WAZ: weight-for-age z-score
- WHO: World Health Organization
- WHZ: weight-for-height z-score
- WSP: Water and Sanitation Programme
- WSUP: Water and Sanitation for the Urban Poor