Exposure Contrasts of Pregnant Women during the Household Air Pollution Intervention Network Randomized Controlled Trial

Background: Exposure to PM2.5 arising from solid fuel combustion is estimated to result in ∼2.3 million premature deaths and 91 million lost disability-adjusted life years annually. Interventions attempting to mitigate this burden have had limited success in reducing exposures to levels thought to provide substantive health benefits. Objectives: This paper reports exposure reductions achieved by a liquefied petroleum gas (LPG) stove and fuel intervention for pregnant mothers in the Household Air Pollution Intervention Network (HAPIN) randomized controlled trial. Methods: The HAPIN trial included 3,195 households primarily using biomass for cooking in Guatemala, India, Peru, and Rwanda. Twenty-four-hour exposures to PM2.5, carbon monoxide (CO), and black carbon (BC) were measured for pregnant women once before randomization into control (n=1,605) and LPG (n=1,590) arms and twice thereafter (aligned with trimester). Changes in exposure were estimated by directly comparing exposures between intervention and control arms and by using linear mixed-effect models to estimate the impact of the intervention on exposure levels. Results: Median postrandomization exposures of particulate matter (PM) with aerodynamic diameter ≤2.5 μm (PM2.5) in the intervention arm were lower by 66% at the first (71.5 vs. 24.1 μg/m3) and second (69.5 vs. 23.7 μg/m3) follow-up visits compared to controls. BC exposures were lower in the intervention arm by 72% (9.7 vs. 2.7 μg/m3) and 70% (9.6 vs. 2.8 μg/m3) at the first and second follow-up visits, respectively, and carbon monoxide exposure was 82% lower at both visits (1.1 vs. 0.2 ppm) in comparison with controls. Exposure reductions were consistent over time and were similar across research locations.
Discussion: Postintervention PM2.5 exposures in the intervention arm were at the lower end of what has been reported for LPG and other clean fuel interventions, with 69% of PM2.5 samples falling below the World Health Organization Annual Interim Target 1 of 35 μg/m3. This study indicates that an LPG intervention can reduce PM2.5 exposures to levels at or below WHO targets. https://doi.org/10.1289/EHP10295


Introduction
Household air pollution (HAP) from the incomplete combustion of solid fuels, including wood, dung, and crop residues, results in exposure to particulate matter (PM) with an aerodynamic diameter ≤2.5 μm (PM2.5), carbon monoxide (CO), black carbon (BC), and other emissions that are hazardous to human health. 1 In 2019, exposure to PM2.5 arising from solid fuel combustion was estimated to result in approximately 2.3 million premature deaths and 91 million lost disability-adjusted life years annually. 2 Approximately half of the world's population relies on these fuels for cooking, 3 predominantly in low- and middle-income countries.
Many studies have documented associations between HAP exposure and increased risk for adverse health end points, including cardiopulmonary outcomes, cancer, and pneumonia. 4 Most intervention studies to reduce HAP exposure and improve health have sought to replace traditional cookstoves with more fuel-efficient stoves, yet these still typically required the continued burning of solid fuels such as wood. 5,6 Trials and other intervention studies involving wood-burning cookstoves have largely failed to sufficiently measure 7,8 or reduce exposure to levels expected to yield meaningful health benefits, [9][10][11][12] such as the World Health Organization (WHO) annual PM2.5 Interim Target 1 (WHO-IT1) guideline of 35 μg/m3. 13 A few recent trials in Nigeria, 14 Nepal, 15 Ghana, 16 and Peru 17 have included cleaner cooking interventions, such as LPG or ethanol stoves. However, these studies either had small sample sizes 17 or achieved only attenuated exposure reductions because of the continued use of traditional stoves and elevated ambient air pollution concentrations. [14][15][16] Better characterization of the multipollutant exposure contrasts achieved by clean fuel interventions in multiple countries, especially interventions with higher likelihoods of achieving exposure reductions, is important for understanding their potential to improve health as well as for characterizing exposure-response relationships.
As part of the multicountry Household Air Pollution Intervention Network (HAPIN) study, we undertook extensive personal air pollution exposure assessment at baseline (prior to randomization) and at multiple time points during and after pregnancy. Here we report the impact of HAPIN's LPG intervention on personal PM2.5, CO, and BC exposures among pregnant participants.

Study Setting and HAPIN Trial Overview
The HAPIN study is a randomized controlled liquefied petroleum gas (LPG) fuel and stove intervention trial underway in four international research centers (IRCs) in Guatemala, India, Peru, and Rwanda. The study approach and site descriptions have been described in detail previously. [18][19][20][21] Briefly, IRCs were located in relatively low-density rural areas that had low background air pollution concentrations according to 24- to 48-h ambient outdoor PM2.5 measurements, as well as minimal observation of other pollution sources. 17,22,23 Study locations were selected in areas where most households typically used traditional biomass stoves to fulfill their cooking and energy needs. Within each IRC, specific study sites were chosen during a 12-month period of formative research. Numerous factors drove site selection, including population density, fuel use, household characteristics, socioeconomic status, and other potential sources of ambient and/or household air pollution that could affect the ability to understand and describe exposure reductions potentially attributable to the HAPIN intervention package.
In India, sites were chosen in Villupuram and Nagapattinam districts in the state of Tamil Nadu, where traditional mud and clay stoves fueled with wood were used predominantly indoors. In Guatemala, participants were recruited in Jalapa municipality, where indoor wood fuel use in chimney stoves and open fires was common. In Peru, activities occurred in the Department of Puno, where households burned wood and dung in built-in open or chimney stoves for cooking. In Rwanda, households that used three-stone fires or simple open stoves (ronderezas), predominantly indoors with wood or portable charcoal-burning stoves (imbabura), were recruited from Kayonza District in the Eastern Province.
Using a common protocol across each site, the trial aimed to assess the health effects of the intervention among pregnant women (n = 3,200), their resulting newborn children (n = 3,200), and nonpregnant adult women living in the same household (n = 444), all split evenly between intervention and control arms. This paper presents exposure results from pregnant women; these represent the first set of measurements available for analysis and are aligned with some of the HAPIN trial's main health outcomes, which include birth weight, incidence of severe pediatric pneumonia, stunted growth in the children, and blood pressure in the nonpregnant adult women. Sample sizes were informed by power calculations for minimal detectable differences in mean birth weight and blood pressure, and mean relative risk for stunting and pneumonia, which are fully described in Clasen et al. 18 The study protocol was reviewed and approved by institutional review boards (IRBs) and Ethics Committees at Emory

Recruitment and Intervention Design
Recruitment. Candidate pregnant women were identified and enrolled through local partnerships with health clinics and community health workers. To be eligible to participate, women were required to meet the following criteria: age 18-35 y, gestation of 9 to <20 wk with a viable singleton pregnancy (confirmed by ultrasound), primary use of biomass fuel for cooking, and agreement to participate via informed consent. Exclusion criteria included tobacco use, plans to move outside the study area, and plans to switch to clean fuels.
Intervention design. Following a baseline survey, participant households were randomly assigned on a 1:1 basis to receive an LPG stove, continuous fuel delivery, and regular behavioral messaging vs. continued use of a biomass-burning stove. Additional stratified randomization was used in India and Peru to ensure balance between distinct geographical regions within the respective study areas. In Rwanda and Guatemala, the study areas were deemed homogeneous and no further stratification was necessary. Intervention components were informed by formative research and have been described in detail previously. 21,22 The stoves and fuel cylinders were procured separately at each IRC following formative research on local cooking practices and equipment availability. Stove design varied by site, but all stoves had at least two burners and included additional components for preparation of traditional foods (e.g., a flat griddle for cooking tortillas in Guatemala and a roasting grill in Rwanda). Intervention households received the stove and continuous fuel supply at no cost throughout follow-up. In Guatemala, Peru, and Rwanda, field staff completed the initial stove installation and delivered LPG cylinders to participating homes. In India, per local regulations, a contracted local LPG distribution company conducted the stove installation and LPG cylinder deliveries.
Behavioral support included a pledge by intervention homes to use the LPG stove for all cooking throughout the trial, safety training, tailored messaging to encourage the exclusive use of LPG and discourage use of traditional stoves, and behavioral reinforcements on detection of traditional stove use.

Air Pollutant Sampling Instrumentation
We used the RTI Enhanced Children's MicroPEM (ECM, RTI International) to measure exposure to PM2.5. 24 The ECM uses a 2.5-μm size-cut impactor at a flow rate of 0.3 L/min and measures continuous PM2.5 concentrations using a nephelometer. It simultaneously collects integrated gravimetric samples on 15 mm polytetrafluoroethylene filters (Measurement Technology Laboratories). The ECM is light in weight (approximately 150 g), small (2.5 × 6.5 × 12.5 cm), and nearly silent during use. It logs temperature, relative humidity, pressure drop across the filter, and triaxial accelerometry. BC was estimated during postsampling processing via transmissometry (see below).
We logged 1-minute CO concentrations using the Lascar EL-USB-300 (Lascar Electronics), which is the size of a large pen (125 × 26.4 × 26.4 mm, 42 g), runs on a 1/2 AA battery, and has a sensing range between 0 and 300 ppm. The Lascar CO device has been used extensively in HAP assessment. 16,25,26

Sampling Strategy
Personal exposures of the pregnant women to PM2.5, BC, and CO reported in this manuscript were based on 24-h measurements at three visits at each HAPIN IRC; exposure results from other participants will be reported separately. Baseline measurements were made at >9 and <20 wk of gestation, prior to randomization. Follow-up, postrandomization measurements were made at 24-28 wk of gestation and 32-36 wk of gestation. At each monitoring period, pregnant participants were asked to wear a customized garment 19 with instrumentation situated in the breathing zone. 27,28 During activities that might damage the equipment through excessive impact or exposure to water (e.g., sleeping, bathing, heavy washing, or work that soaks the participant), they were asked to take the garment off but keep the instrumentation nearby (within 1-2 m).
Measurements. Determining PM2.5 mass concentrations. At each visit, 24-h gravimetric filter-based and concurrent nephelometric samples were collected for each participant. Changes in filter mass pre- and postsampling were assessed using 1 μg resolution microbalances (Sartorius Cubis, MSA6.6s-000-DF) at the University of Georgia (filters for Guatemala, Rwanda, and Peru) and at the Sri Ramachandra Institute for Higher Education and Research (for India).
We assessed gravimetric data validity using a three-stage process: a) field technicians evaluated pre- and postsample flow rates with a primary flowmeter at the field office, enabling implementation of validation criteria to flag and remove samples outside of expected ranges; b) laboratory technicians invalidated samples with damaged filters; c) data analysts removed data that did not meet criteria for sample duration (24 ± 4 h), flow rate (300 ± 100 mL/min, measured by the internal flow sensor), and inlet pressure (95th percentile <5 inches H2O). Additional details are presented in the Supplemental Material section titled "QA/QC: PM2.5 Sampling" and in Tables S1 and S2.
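The analyst-stage screening in step c can be illustrated with a minimal Python sketch. It is not the study's code: the `Sample` record and its field names are hypothetical, and treating the ± bounds as inclusive is an assumption, because the text does not say how boundary values were handled.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """Hypothetical record for one 24-h gravimetric sample."""
    duration_h: float          # sampling duration, hours
    flow_ml_min: float         # flow from the internal flow sensor, mL/min
    inlet_p95_in_h2o: float    # 95th percentile inlet pressure, inches H2O
    filter_damaged: bool = False


def is_valid(s: Sample) -> bool:
    """Apply the stated criteria; inclusive +/- bounds are an assumption."""
    return (
        not s.filter_damaged
        and abs(s.duration_h - 24.0) <= 4.0      # 24 +/- 4 h
        and abs(s.flow_ml_min - 300.0) <= 100.0  # 300 +/- 100 mL/min
        and s.inlet_p95_in_h2o < 5.0             # 95th percentile < 5 in H2O
    )


samples = [
    Sample(24.5, 310.0, 2.0),          # passes all checks
    Sample(19.0, 300.0, 2.0),          # too short
    Sample(24.0, 450.0, 2.0),          # flow out of range
    Sample(24.0, 300.0, 6.0),          # inlet pressure too high
    Sample(24.0, 300.0, 2.0, True),    # damaged filter
]
valid_flags = [is_valid(s) for s in samples]
```

Samples that fail here would be candidates for the nephelometer-based estimate rather than being dropped outright.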
For cases in which the gravimetric sample was invalidated (e.g., due to a missing or damaged filter or flow faults), the nephelometer data from the instrument were used to estimate personal exposure, requiring additional data validity checks. We evaluated the relationship between the nephelometer and gravimetric samples for each ECM monitor (n = 431) used in the study, deeming acceptable performance as follows: R2 values >0.65; ≥3 available pairs of valid filter and nephelometric samples; and slopes between 0.5 and 2.5. For the ECM samplers that met these criteria (57.8%), regression models were applied to the adjusted 24-h average nephelometer values for those samples with missing or invalid gravimetric samples, resulting in instrument-specific nephelometric PM2.5 concentrations normalized to field-based filter samples. Additional details are available in Supplemental Material, "QA/QC: PM2.5 Sampling."
Quality control and assurance. Field blanks were collected at a rate of 4 per 100 sample filters. In total, 393 field blanks were collected (98 samples on average per IRC, standard deviation 38). Median blank corrections were performed by IRC. The limit of detection (LOD) was calculated separately for each IRC as three times the standard deviation of the blank mass depositions, after removing five blanks that were determined to have mislabeling or storage errors (e.g., placed in the wrong storage cassette). Sample depositions below the LOD were replaced with LOD/√2. 29 Duplicate ECMs were deployed on a subset of samples (n = 253) to assess between-monitor performance (Table S2 and Figure S1).
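The per-monitor acceptance rule for nephelometer data (R2 >0.65, at least 3 valid pairs, slope between 0.5 and 2.5) can be sketched as follows. This is an illustrative sketch only: regressing gravimetric on nephelometric concentrations by ordinary least squares, treating the slope bounds as inclusive, and the function names are all assumptions.

```python
def fit_line(x, y):
    """Ordinary least squares of y on x; returns (slope, intercept, r2)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("no variation in x or y; cannot fit a line")
    slope = sxy / sxx
    return slope, my - slope * mx, sxy ** 2 / (sxx * syy)


def monitor_acceptable(neph, grav):
    """Check the three stated criteria for one monitor's paired samples."""
    if len(neph) != len(grav) or len(neph) < 3:
        return False
    slope, _, r2 = fit_line(neph, grav)
    return r2 > 0.65 and 0.5 <= slope <= 2.5


def neph_to_gravimetric(neph_value, neph, grav):
    """Estimate a filter-equivalent concentration for an accepted monitor."""
    slope, intercept, _ = fit_line(neph, grav)
    return intercept + slope * neph_value


# Hypothetical paired 24-h averages (ug/m3) for one monitor.
neph_pairs = [10.0, 20.0, 30.0, 40.0]
grav_pairs = [15.0, 30.0, 45.0, 60.0]
accepted = monitor_acceptable(neph_pairs, grav_pairs)
estimate = neph_to_gravimetric(50.0, neph_pairs, grav_pairs)
```

Fitting one model per instrument, rather than one pooled model, mirrors the instrument-specific normalization described above.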
Wearing compliance. Compliance, as measured by the ECM's accelerometer, was not used for data exclusion due to differences in wearing patterns by country and the difficulty in discerning whether stillness of the monitor was truly indicative of noncompliance or whether the pregnant women were actually adjacent to the ECMs. Additional details regarding compliance measurement, how it was calculated, and results (e.g., summary statistics and distributions) are presented in the Supplemental Material, "QA/QC: PM2.5 Sampling Compliance" section and in Table S3 and Figure S2.
BC. BC concentrations were estimated for PM2.5 filter samples using SootScan Model OT-21 Optical Transmissometers (Magee Scientific), either at the University of Georgia (UGA, Athens, Georgia, USA) for samples collected in Guatemala, Peru, and Rwanda or at Sri Ramachandra Institute for Higher Education and Research (SRIHER, Chennai, India) for samples collected in India. BC depositions were estimated per Garland et al. 30 using the BC attenuation cross-section values for similar Teflon filters (σATN = 13.7 μg/cm2) collected from similar source types. Most filters collected for the Guatemala, Peru, and Rwanda samples used both a pre- and postscan [2,672 (99.2%), 2,232 (97.1%), and 2,181 (98.9%), respectively], whereas India had 2,443 (100%) without prescans due to equipment unavailability. For India, the average of blank filter postscan values was substituted for prescan values. LOD was calculated as it was for gravimetric mass (three times the blank standard deviation). Values below the LOD were replaced with LOD/√2. Additional details are available in Supplemental Material, "QA/QC: Black Carbon," Table S4, and Figure S3.
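The LOD rule used for both gravimetric mass and BC (three times the standard deviation of blank depositions, with values below the LOD replaced by LOD/√2) can be sketched in a few lines of Python. The function names are hypothetical, and using the sample (n − 1) standard deviation is an assumption, because the text does not say which estimator was used.

```python
import math
import statistics


def limit_of_detection(blank_depositions):
    """LOD = 3 x SD of blank depositions (sample SD assumed)."""
    return 3.0 * statistics.stdev(blank_depositions)


def substitute_below_lod(depositions, lod):
    """Replace values strictly below the LOD with LOD / sqrt(2)."""
    return [d if d >= lod else lod / math.sqrt(2) for d in depositions]


# Hypothetical blank and sample depositions (ug) for one IRC.
blanks = [1.0, 2.0, 3.0, 4.0, 5.0]
lod = limit_of_detection(blanks)
cleaned = substitute_below_lod([10.0, 2.0], lod)
```

In the study the LOD was computed separately for each IRC, so a function like this would be applied once per site.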
CO. CO data quality assurance procedures included calibrations with zero air and CO span gas (ranging between 40 and 80 ppm by IRC); automated, server-based quality assurance checks at regular intervals; and a visual rating system similar to that applied in the Ghana Randomized Air Pollution and Health Study. 16 CO loggers were to be calibrated every 1-3 months, as described in Johnson et al. 19 CO measurements were adjusted using the temporally closest calibration coefficient. Data were then checked for sampling duration (24 ± 4 h) and visually rated to remove files that displayed response artifacts (3.4%). Duplicate monitors were deployed for a subset of samples to assess monitor performance. Additional details are available in Supplemental Material, "QA/QC: CO," Tables S5-S6, and Figure S4.
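Selecting the temporally closest calibration for a given CO sample can be sketched as below. This is a hypothetical illustration: the data layout (a list of date and coefficient pairs per logger) and the multiplicative application of the coefficient are assumptions, not details taken from the study protocol.

```python
from datetime import date


def closest_calibration(sample_date, calibrations):
    """Return the coefficient whose calibration date is nearest the sample.

    `calibrations` is a list of (date, coefficient) pairs for one logger.
    Ties go to the earliest-listed calibration.
    """
    if not calibrations:
        raise ValueError("logger has no calibration records")
    nearest = min(calibrations, key=lambda c: abs((c[0] - sample_date).days))
    return nearest[1]


def calibrate(raw_ppm, coefficient):
    """Apply a multiplicative correction (an assumed form of adjustment)."""
    return [v * coefficient for v in raw_ppm]


# Hypothetical calibration history for one logger.
history = [
    (date(2019, 1, 10), 1.05),
    (date(2019, 3, 15), 0.98),
    (date(2019, 6, 1), 1.10),
]
coef = closest_calibration(date(2019, 3, 1), history)
adjusted = calibrate([0.0, 2.0, 10.0], coef)
```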

Statistical Analyses
All analyses were performed in R (versions 3.6 and 4.0; R Foundation for Statistical Computing). We first calculated pollutant-specific descriptive statistics for valid measurements in control and intervention groups by study phase (baseline vs. postintervention rounds) and IRC. We evaluated Spearman correlations between measurements for the same pollutants collected at baseline and postintervention and correlations between pollutants at each measurement point. These were evaluated overall and stratified by assigned stove/fuel type. Differences in pollutant levels between control and intervention groups by period (i.e., at baseline, postintervention visits 1 and 2) were evaluated using nonparametric tests (Wilcoxon Rank Sum, Kruskal-Wallis, and Dunn's tests). We also evaluated the proportion of samples that were less than or equal to WHO guidelines and targets. For PM2.5, we compare our measurements with the Annual Interim Target 1 (WHO-IT1) guideline value of 35 μg/m3. 31 We focus on this interim target value because it represents a potentially attainable milestone on the pathway toward achieving the current and ambitious final guideline value of 5 μg/m3. For carbon monoxide, we compare our measurements with the WHO 24-h guideline value of 4 mg/m3 (∼3.5 ppm) because no annual guideline or target is provided. 13
Following approaches described in McCracken et al. 32 and Chillrud et al., 16 we used statistical methods that leverage our study design and repeat measurements to assess the impact of intervening with LPG on exposure to PM2.5, CO, and BC during gestation. Given the right-skewed distribution of measured data, pollutant concentrations were natural log-transformed prior to regression analyses.
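The correspondence between the CO guideline in mass units and in ppm follows from the ideal gas law. The sketch below reproduces it under an assumed reference condition of 25 °C and 1 atm (molar volume 24.45 L/mol); the text gives only the rounded result, so the reference condition and the function name are assumptions.

```python
MOLAR_VOLUME_L_PER_MOL = 24.45   # ideal gas at 25 degrees C and 1 atm (assumed)
CO_MOLAR_MASS_G_PER_MOL = 28.01  # carbon monoxide


def mg_m3_to_ppm(conc_mg_m3, molar_mass_g_per_mol):
    """Convert a gas concentration from mg/m3 to ppm by volume."""
    return conc_mg_m3 * MOLAR_VOLUME_L_PER_MOL / molar_mass_g_per_mol


guideline_ppm = mg_m3_to_ppm(4.0, CO_MOLAR_MASS_G_PER_MOL)  # about 3.5 ppm
```

Under a colder reference temperature the molar volume is smaller and the converted value drops slightly, which is why the ppm figure is stated as approximate.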
We used linear mixed-effects models to assess the impact of the intervention on log-transformed personal exposures and included a random intercept to account for correlation among repeated measurements made on the same participants (i.e., at baseline and postintervention visits 1 and 2). We also evaluated nontransformed models to estimate the absolute change in exposures. Finally, we used mixed-effects models with no covariates and a random effect for participant ID to partition variance and estimate the intraclass correlation coefficient (ICC), enabling evaluation of within- and between-participant variability.
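For a random-intercept model with no covariates, the ICC is the share of total variance attributable to differences between participants. A minimal sketch, assuming the two variance components have already been extracted from a fitted model (the input values below are hypothetical, not study estimates):

```python
def intraclass_correlation(var_between, var_within):
    """ICC = between-participant variance / total variance."""
    if var_between < 0 or var_within < 0:
        raise ValueError("variance components must be non-negative")
    total = var_between + var_within
    if total == 0:
        raise ValueError("total variance is zero; ICC is undefined")
    return var_between / total


# Hypothetical variance components on the log scale.
icc = intraclass_correlation(var_between=0.50, var_within=0.50)
```

An ICC near 1 means repeated measurements on the same participant are very similar; an ICC near 0 means most variability is within participants across visits.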
We fit four models, offering distinct comparisons for the association between the LPG intervention and exposures. A summary of models including equations, data used, and rationale is presented in Table S7. Model 1 estimates the effect of the intervention on pollutant exposures (i.e., PM2.5, CO, and BC) by comparing exposures in the treatment arms during the postintervention period ("between groups"). The main parameter of interest from this model is the fixed effect for the treatment arm. Model 2 estimates the difference in exposure between postintervention and baseline periods ("before and after") separately for each treatment arm. In these models, the parameter of interest is the difference between baseline and postbaseline measurements for the intervention arm, and the same difference for the control arm. Model 3 estimates changes in exposure in the intervention arm, pre- vs. postintervention, relative to any changes experienced in the control arm over the same period ("comparison-of-changes"). In this model, the parameter of interest is the "treatment arm × period" interaction term, where period is either pre- or postintervention, and which controls for potential differences at baseline. Model 4 estimates comparison of changes by study visit (the same as model 3, treating each postintervention visit as its own time point). The parameter of interest is the "treatment arm × visit" interaction term. This approach enables evaluation of the stability of changes in exposure over time. Parameters of interest were exponentiated, subtracted from 1, and multiplied by 100 to estimate the percent reduction in personal exposure due to the intervention. Models were run for the entire data set and separately for each IRC.
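The back-transformation in the preceding paragraph can be written out directly. With a natural-log outcome, a coefficient β corresponds to a multiplicative effect of exp(β), so the percent reduction is (1 − exp(β)) × 100. The sketch below uses hypothetical coefficient values; the function names are illustrative.

```python
import math


def percent_reduction(beta):
    """Percent reduction implied by a coefficient on the natural-log scale."""
    return (1.0 - math.exp(beta)) * 100.0


def percent_reduction_ci(beta_lower, beta_upper):
    """Transform a CI for beta; the bounds swap because the map is decreasing."""
    return percent_reduction(beta_upper), percent_reduction(beta_lower)


# Hypothetical coefficient: exposure ratio of 0.39 (intervention vs. control).
beta = math.log(0.39)
reduction = percent_reduction(beta)  # approximately 61
ci_low, ci_high = percent_reduction_ci(math.log(0.37), math.log(0.41))
```

Because the transformation is monotonically decreasing in β, the upper confidence limit for β gives the lower limit for the percent reduction, and vice versa.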
Additionally, as sensitivity analyses, we estimated the distribution of missing PM2.5 data throughout the study, and summarized characteristics of participants by measurement round and study arm for those with and without missing data. For participants with missing data, we used predictive mean-matching implemented in the mice package (multivariate imputation by chained equations) 33 in R to impute missing values. We compared the observed data set to 10 data sets with imputed values, repeated our modeling approach for each of these 10 data sets, and pooled model estimates following Rubin's rules. 34 We also performed additional analyses using generalized estimating equations (binomial, logit link function) to evaluate whether maternal characteristics influenced the probability of an observation being missing.
Table 1 shows household and participant characteristics for control and intervention arms at the four IRCs. Additional covariates are in Tables S8 and S9, and a comparison of a subset of characteristics for participants with and without missing exposure data is in Table S10. Balance was evident as expected between the arms within each IRC for the age of the pregnant women, as well as educational attainment and occupational status. Households typically cooked indoors. There was heterogeneity in fuel types between countries, with wood dominant in Guatemala and India, dung dominant in Peru, and wood and charcoal in use in Rwanda. Participants with and without measurements were largely similar across categories, and differences in characteristics between arms were not strong predictors of missingness (Table S11). For participants with missing data, we found no differences in maternal age, body mass index, food insecurity, or diet diversity, but observed differences for gestational age in weeks (approximately 7%) and for education (∼20%) (Supplemental Material, "Predicting Missingness" and Table S11).
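The pooling step of the multiple-imputation sensitivity analysis described earlier in this section can be illustrated with the standard form of Rubin's rules: the pooled estimate is the mean of the per-data-set estimates, and the total variance is the mean within-imputation variance plus (1 + 1/m) times the between-imputation variance. The sketch below is not the study's code (which used the mice package in R), and the input values are hypothetical.

```python
import math
import statistics


def pool_rubin(estimates, variances):
    """Pool m estimates and their squared standard errors with Rubin's rules.

    Returns (pooled_estimate, pooled_standard_error).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need matching lists with at least two imputations")
    pooled = statistics.fmean(estimates)
    within = statistics.fmean(variances)        # mean within-imputation variance
    between = statistics.variance(estimates)    # between-imputation variance
    total = within + (1.0 + 1.0 / m) * between
    return pooled, math.sqrt(total)


# Hypothetical coefficients and variances from three imputed data sets.
pooled_beta, pooled_se = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

The (1 + 1/m) factor inflates the between-imputation variance to account for using a finite number of imputed data sets.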

Exposure Measurements, Data Completeness, Compliance, and Quality Assurance and Control
Across the baseline visit and first two postintervention visits, HAPIN field staff made more than 9,000 exposure monitoring visits for 3,195 pregnant women (1,605 in the control arm and 1,590 in the intervention arm; country-specific values are in Table 1). Eighty-two percent of the pregnant women had a valid baseline PM2.5 sample and at least one valid PM2.5 postintervention sample. We observed a relatively high ICC of 0.61 for PM2.5 measurements, suggesting that missing data for a given participant are likely similar to observed data for that participant. Approximately 14% of invalid gravimetric samples were replaced with ECM-specific, gravimetric-adjusted nephelometer values. For missing PM2.5 values, we imputed values by study arm. Imputed values were similar to observed values (Tables S12 and S13; Figure S5).
Mean percentage of daytime hours when instrumentation motion was detected ranged from 31% in India to 76% in Rwanda (see Table S3 and Figure S2). No samples were excluded due to low percentages of time that motion was detected because participants were asked to keep the instrumentation nearby when they were unable to wear the sampling devices.
For CO, 84% had a valid baseline sample and at least one valid postintervention sample. For BC, 73% had a valid baseline measurement and at least one valid postintervention measurement. The percentage of samples successfully collected-by treatment arm, measurement visit, and IRC-is displayed in  Figure S4; Table S6).

Exposure Summary
We summarized personal exposures to PM2.5, BC, and CO for pregnant women by IRC and visit in Table 2. Exposure distributions are displayed graphically in Figure 1 (tabular data for this plot are in Table 2; IRC-specific plots are in Figure S6). HAPIN-wide, there was no significant difference between
Table 1. Household and maternal characteristics at HAPIN baseline, by study site and intervention arm.
During the baseline period, ∼17% of measurements in both the control and intervention arms had PM2.5 exposures less than or equal to the annual WHO-IT1. During the postintervention period, 23% of control exposures were at or below the annual WHO-IT1; 69% of intervention exposures fell below the target. In addition, 83% and 80% of 24-h exposures to CO in the control and intervention arms, respectively, were below the WHO 24-h guideline value for CO (3.5 ppm) at baseline. Postintervention, 84% of control exposures were below the guideline value, whereas 96% of intervention exposures were less than the guideline.
Exposures over time. Trial-wide and within IRCs, we observed changes in PM2.5 exposures between baseline and postintervention rounds. The magnitude, consistency, and significance of these changes varied by study site and arm.
Study-wide. We plotted personal exposures to PM2.5 over time after randomization trial-wide (Figure 2; Table S14), highlighting the relative overlap in exposures during the baseline period and the distinct separation of exposures between control and intervention groups after intervention. Baseline exposures were not significantly different (p ≈ 0.5) between control and intervention households. Although the magnitude of the exposures and the exposure contrast vary between sites postrandomization, we note the relative stability of exposures across control and intervention arms. Additional details by IRC and pollutant are in the Supplemental Material, "IRC and pollutant-specific findings."
Intervention households. There was a significant and large decrease in pollutant exposures between baseline and postintervention measurements. For PM2.5, mean exposures decreased from 120.1 μg/m3 at baseline to 33.8 μg/m3 and 35.8 μg/m3 at postintervention visits 1 and 2. For BC, the average baseline exposure was 12.6 μg/m3, and postintervention visit 1 and 2 exposures were 4.0 μg/m3 and 4.3 μg/m3, respectively. For CO, the baseline average was 2.7 ppm; at both follow-up visits, the average exposure was 0.7 ppm.
Additional information on pollutant levels by round, country, and study arm is in Table 2. Statistical tests comparing values between study measurement rounds are in Table S15.
Correlations between measurement rounds. Correlations (Spearman's ρ) between measurement rounds were moderate. For PM2.5 in the control arm, correlations between baseline and postintervention round 1, baseline and postintervention round 2, and postintervention rounds 1 and 2 were 0.42, 0.40, and 0.51, respectively. Correlations for BC in the control arm were weaker (0.29, 0.33, and 0.47, respectively), as were correlations for CO (0.29, 0.26, and 0.29, respectively).
Among intervention households, correlations between baseline and postintervention round 1, baseline and postintervention round 2, and postintervention visits 1 and 2 were 0.21, 0.18, and 0.39, respectively. BC values followed a similar trend (0.18, 0.11, and 0.56 for the same comparisons) as did CO (0.14, 0.12, and 0.30, respectively). Weak correlations between baseline and postintervention rounds among intervention households were expected, as the intervention was placed and in use after baseline but prior to postintervention measurements.
Correlations between pollutants. The relationship between PM2.5 and CO among biomass-using households (intervention arm at baseline; control arm at baseline and postintervention rounds) was moderate (Spearman ρ ≈ 0.5), though much stronger than in LPG-using households (Spearman ρ ≈ 0.06; Supplemental Material, "Relationships between pollutants" and Figures S7 and S8). Figure S7, Panel B, shows clear and consistent correlations between PM2.5 and CO for sample groups using biomass, though the relationship varies by IRC. For traditional stoves, the Spearman ρ values were 0.70, 0.62, 0.54, and 0.25 for Guatemala, India, Peru, and Rwanda, respectively. The overall, HAPIN-wide Spearman ρ was 0.51. Among postintervention exposure samples of LPG users, relationships were much weaker (Spearman ρ = 0.05-0.16), which was expected given the lack of a dominant biomass smoke source in the homes. The relationship between PM2.5 and BC was stronger. There was some heterogeneity between countries (for biomass users, between 0.68 and 0.87; for LPG users, between 0.46 and 0.73). As BC is a constituent of PM2.5, the stronger relationship of PM2.5 with BC than with CO is not surprising.

Modeling Results
The effect of the LPG stove and fuel intervention on personal exposures. All models of the impact of the HAPIN LPG fuel and stove intervention indicated significant reductions in all measured pollutants. Figure 3 reports results from the between groups, before-and-after, and comparison-of-changes modeling approaches for PM2.5. Estimates of the percent reduction in PM2.5 exposure due to the intervention were similar across models: 61% [95% confidence interval (CI): 59%, 63%] for the "between groups" approach; 68% (95% CI: 66%, 69%) for the before-and-after approach; and 62% (95% CI: 59%, 64%) for the comparison-of-changes approach (Table 3). Results were similar for BC (Table 3; Figure S9) and more pronounced for CO (Table 3; Figure S10). Results for imputed PM2.5 (Figure S11 and Table S13) were similar.
indicating little to no change in intervention effectiveness over time. Models are presented separately for each IRC in the Supplemental Material (Tables S16-S18; Figures S12-S14); trends are consistent between IRCs, though the magnitude of reductions varies by IRC.

Exposure Comparisons with Previous Studies
The HAPIN intervention of a free LPG stove and fuel supply, along with behavior change efforts, resulted in substantial and significant personal exposure reductions for pregnant women receiving the intervention when compared to the control arm for all pollutants and in all countries (IRC- and pollutant-specific findings are in the supporting information). Median overall PM2.5 postintervention exposure measurements, approximately 3 months apart, varied by 2 μg/m3 or less (Figure 1; Table 2), suggesting the intervention had a stable effect through pregnancy. In total, these findings indicate consistent exposure reductions to near or below the annual WHO-IT1 target value of 35 μg/m3 for PM2.5, with 69% of all 24-h postintervention PM2.5 samples less than the target. CO exposures were also reduced, although even control group participants were largely below the WHO 24-h guideline of 3.5 ppm, with overall median exposures for the control and intervention groups ranging from 0.2 to 1.1 ppm.
Table 2. Note: BC, black carbon; CO, carbon monoxide; PM, particulate matter; WHO, World Health Organization.
Although reporting of different summary metrics makes uniform comparisons less straightforward, the exposure concentrations in the intervention arm were at the lower end of what has been reported for LPG or other clean fuel interventions. Table S19 in the Supplemental Material provides a summary of the comparison studies' exposure estimates for the available average metrics. A systematic review by Pope et al. 6 reported a pooled mean of 58 μg/m3 for the six LPG studies included in the analysis, in comparison with HAPIN-wide means of 33.8 and 35.8 μg/m3 for the first and second postintervention visits, respectively (see Table 2). CO exposures were similar [0.7 ppm for both postintervention visits (see Table 2)] to the single LPG study reported in the Pope et al. 6
Table 2; data for the postrandomization period can be found in Table S14 in the Supplemental Material. Note: IQR, interquartile range; IRC, international research center.
Figure 3. Estimated impacts of the HAPIN LPG intervention on PM2.5 exposure. All linear mixed-effects models had log-transformed PM2.5 as the dependent variable. Whiskers are 95% confidence intervals. The first panel ("Before and After") uses data from both the control and intervention arms and compares the intervention period to the baseline period. The second panel ("Between Groups") uses only data from the intervention period and contrasts the intervention arm with the control arm. The third panel ("Comparison-of-Changes") uses all data from both study arms and both study periods; the model term of interest is the interaction between study arm and period, after controlling for each variable separately in the model. The "Overall" points consider an average postintervention exposure; the Visit-specific points consider each postrandomization visit separately. Numeric values corresponding to this figure are found in Table 3. Note: HAPIN, Household Air Pollution Intervention Network; LPG, liquefied petroleum gas.
A few recent HAP studies are of special interest given their scope and sample size, even without the same focus HAPIN placed on near-exclusive use of LPG. The Prospective Urban and Rural Epidemiological (PURE) study conducted observational PM2.5 and BC exposure measurements across 120 different communities (∼2,500 homes) in Bangladesh, Chile, China, Colombia, India, Pakistan, Tanzania, and Zimbabwe. 35 The reported PM2.5 geometric means for women's exposure among wood users were 89, 39, and 153 μg/m3 in India, South America, and Africa, respectively; among LPG users, exposures were 70, 32, and 146 μg/m3, respectively. 35 This study also offers the best comparison for BC exposures (as elemental carbon), with women's geometric mean exposures in primarily wood-using homes reported at 2.5-8.8 μg/m3 and 2.0-7.0 μg/m3 for the same user/region groupings. We estimated median BC exposures at 4-12 μg/m3 for all wood-using households (control and intervention groups at baseline and controls post intervention) and 2-10 μg/m3 post intervention. The trend of higher PM2.5 exposures in Africa and lower exposures in Latin America is similar to what was observed in HAPIN, although the postintervention exposures in HAPIN were substantially lower than those reported in PURE, assuming that the geometric means and medians are estimating similar central tendencies.
Similarly, our postintervention PM2.5 exposures in the intervention arm were lower than those reported for the Ghana Randomized Air Pollution and Health Study (GRAPHS), which included an LPG arm of 361 pregnant women. 16 Comparisons with exposure estimates from studies in similar regions also suggest the HAPIN intervention performed well in terms of exposure reductions. In Peru, the Cardiopulmonary outcomes and Health and Air Pollution (CHAP) trial reported a mean PM2.5 exposure of 98 μg/m3 for the primary biomass-using control arm, whereas the mean LPG-arm exposure was reported at 30 μg/m3 (in comparison with means of 25-31 μg/m3 in the control arm and 15 μg/m3 in the intervention arm for the HAPIN Peru site). 17 In Rwanda, a trial of rocket-style cookstoves and water filters reported median exposures of 146 and 158 μg/m3 in the control and intervention arms, respectively, for the primary cook (in comparison with 80 μg/m3 in the control arm and 28-34 μg/m3 post intervention for the HAPIN site in Rwanda). 10 A study of pregnant women in Guatemala reported median exposures of 148 μg/m3 for open fire users and 55 μg/m3 for those using LPG (in comparison with medians of 94-98 μg/m3 in the control arm and 23-24 μg/m3 in the intervention arm for the HAPIN site in Guatemala), 36 whereas in India the Tamil Nadu Air Pollution and Health Effects (TAPHE) cohort study of pregnant women estimated median PM2.5 exposures of 75 μg/m3 for biomass stove users and 46 μg/m3 for those using primarily LPG (in comparison with 67-68 and 25-29 μg/m3 in the control and intervention arms, respectively, for the HAPIN site in Tamil Nadu, India). 27

There are several potential reasons for the differences in the reported exposures, especially for PM2.5, between these studies and HAPIN. Perhaps most important is that, as an efficacy trial, HAPIN has a strong emphasis on supporting exclusive LPG use, with free provision and delivery of stoves and fuel supply.
Consistent usage was supported by behavior change strategies and stove repair or maintenance. Continuous biomass stove use monitoring was conducted in all intervention households, with reinforcement of exclusive LPG use provided when any biomass stove use was detected in a participant's home. An analysis by Quinn et al. 37 of the HAPIN intervention fidelity and adherence found near-exclusive gas stove use through pregnancy in intervention households, with 86% of intervention homes reporting less than one biomass stove use per month.
Other contextual factors are also important. In the GRAPHS, for example, there was a high proportion of households cooking outdoors, which could imply lower baseline exposures, and homes were close together, which may have mitigated potential exposure contrasts due to "neighborhood" effects. 16 PURE was an observational study; although the groups provided a basis for comparison, there was no intervention effect to measure, and perhaps most important, stove use in the groupings was likely mixed, which could explain the higher exposures for the LPG users.
Finally, we note that our exposures for pregnant women in biomass-using homes (at baseline and post randomization in the control group) were also somewhat lower than typically reported (overall means ranging from 103 to 120 μg/m3 across the different visits). The Pope et al. 6 review reported a pooled mean of 220 μg/m3 for the baseline personal (biomass-using) exposures in the six LPG intervention studies; other reviews of HAP exposure have reported similar estimates. 5,38,39 It is possible that our field sites are contextually different from previous studies given prior formative work to identify locations with low background concentrations and relatively low-density housing. 18,19,22,23 Secular changes and/or differences in measurement approaches may be contributing to these differences, although it is unclear which specific factors would produce them, or how.

Multipollutant Relationships
Correlations between co-emitted pollutants have been used to justify measurement of HAP exposure proxies, most commonly CO as a surrogate for PM2.5. 8,40,41 CO is of interest given its relative ease of measurement in comparison with PM2.5, although a systematic review of this approach by Carter et al. 42 found that the PM2.5-CO exposure correlations varied widely (Pearson's R range 0.22-0.97). This broad range in the strength of the relationship is likely due to variability in combustion (including predominant fuel types and mixes contributing to HAP) and subsequent exposures, as well as the reliability of measurements. We present a data set of four unique settings where transitions from biomass (primarily wood) to LPG allow for a clear comparison between locations and fuel use types (Figures S7 and S8). Our PM2.5-CO correlations among biomass-using households fall into the middle of the range described by Carter et al. 42 and were stronger among biomass-using households (Spearman ρ = ∼0.5) than among LPG users (Spearman ρ = ∼0.06). Few analyses characterize HAP exposure relationships between PM2.5 and BC, with the largest coming from the PURE study. They reported Spearman ρ correlations of 0.65-0.9, 35 which are similar to our findings.
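For readers less familiar with the rank-based statistic used above, the following is a minimal sketch of how a Spearman correlation between paired PM2.5 and CO measurements can be computed: each series is converted to ranks (ties receive their average rank), and the Pearson correlation of the ranks is taken. The paired values are hypothetical and do not come from HAPIN, and the helper names are illustrative assumptions.

```python
from statistics import mean

def average_ranks(values):
    """Rank values from 1..n, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical paired 24-h exposures for five participants (illustration only).
pm25 = [20.0, 35.0, 50.0, 90.0, 150.0]  # ug/m3
co = [0.1, 0.3, 0.2, 0.9, 1.5]          # ppm

rho = spearman(pm25, co)
print(f"Spearman rho = {rho:.2f}")
```

Because only the ordering of the measurements matters, the statistic is insensitive to the strongly right-skewed distributions typical of personal exposure data, which is one reason a rank correlation is often preferred over Pearson's R in this setting.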

Study Limitations
Although this study represents one of the largest efforts to characterize the impact of a household energy intervention on personal exposures, there are still several considerations for interpreting our findings. First, HAPIN is an efficacy trial, in which the stove, fuel, and support services were provided for free, resulting in high intervention fidelity and minimal stove stacking with biomass through pregnancy. 37 It is unclear whether exposure reductions with an LPG intervention, as reported here, could be achieved in most contexts without similar support. The field sites were also specifically vetted for their likelihood to have low background air pollution levels. 18,19 Although it is hoped that comprehensive, community-scale interventions may reduce HAP's contribution to ambient air quality, and thus further reduce exposures, this is a largely untested hypothesis. Further, for many areas background concentrations are high due to emissions from a variety of sources, 43,44 limiting potential exposure reductions for even the cleanest household energy interventions.
Although ∼9,000 exposure samples for each pollutant were collected, analyzed, and reported here, they represent only three snapshots of exposure over several months for households in diverse settings. Behavioral and environmental factors change over time, resulting in some risk of exposure misclassification. Measured compliance with wearing the exposure instruments varied substantially between field research sites. It is, however, difficult to interpret precisely what the compliance metric means in terms of behavior and, as a result, whether the estimated exposure is indicative of a participant's "true" exposure. Still, the high intervention fidelity and relatively stable exposures evident in Figures 2 and 3 suggest that our limited measurements provided reasonable exposure estimates over the pregnancy period. A subset of more intensive measurements (twice the number of measurements in a random 10% of the study population reported here) is being conducted and will characterize how well our standard sampling protocol performs in predicting the longer and more intense exposure monitoring of the subset.
With the large number of samples being collected, some sample loss was inevitable. Approximately 19%, 16%, and 24% of the PM2.5, CO, and BC samples, respectively, were invalid because they were missing or because of equipment failure, damaged or misplaced filters, or failure to meet quality assurance criteria. The missingness described here (of data that should have been successfully collected during planned household visits that took place) is distinct from households leaving the study (presented in Table 2). This level of missing data is not unexpected, given the large scale of the assessment, which was conducted across our four diverse international research sites. The PURE and GRAPHS studies, for example, both reported over 80% (exact figures were not provided) of their PM2.5 samples as valid, with the GRAPHS also reporting between 47% and 70% of the CO deployments as valid across the various sampling sessions. 16,35 Our imputation analysis indicated that this missingness did not appreciably impact summary statistics or affect exposure estimates.
Finally, our study population for this analysis was pregnant women, a subgroup that has different behavioral considerations in comparison with others in the home. These exposures are clearly relevant for birth weight and other maternal and child health outcomes, but generalizability to other populations, or even to the same women post pregnancy, may be limited due to differences in behavior during pregnancy that may impact HAP exposure (e.g., cooking, occupational, domestic, childcare, and other tasks). 8

Conclusions
The results presented here suggest that an LPG intervention can substantially reduce pregnant women's exposures to health-damaging pollutants. These exposure reductions represent, to our knowledge, some of the largest for a household energy intervention. Although HAPIN is an efficacy trial with specific contextual considerations that limit the generalizability of the results, our findings demonstrate that, in four geographic regions with different behavioral, sociocultural, and environmental contexts, it is possible for a clean fuel intervention to reduce personal PM2.5 exposures to levels below the annual WHO-IT1 target.
These exposure reductions also suggest the potential for similar exposure contrasts throughout HAPIN for other participants, including the child born during the trial and nonpregnant adult women participants (ages 40-79 y) living in the same household as the pregnant women. Air pollution exposure for nonpregnant adult women was measured six times over the course of the study, with corresponding measures of blood pressure and collection of samples for biomarker analyses. Children resulting from the pregnancies were also measured for exposure three times over their first year of life, with additional measurements related to health (acute lower respiratory infection, anthropometry, and cognitive development) and collection of samples for biomarker analyses. Should the findings for other adult women and children be similar to those observed for pregnant women, this would suggest that the HAPIN intervention can achieve substantial exposure reductions throughout the household.