First trimester gut microbiome induces Inflammation-dependent gestational diabetes phenotype in mice
====================================================================================================

* Yishay Pinto
* Sigal Frishman
* Sondra Turjeman
* Adi Eshel
* Meital Nuriel-Ohayon
* Oren Ziv
* William Walters
* Julie Parsonnet
* Catherine Ley
* Elizabeth L. Johnson
* Ron Schweitzer
* Soliman Khatib
* Faiga Magzal
* Snait Tamir
* Kinneret Tenenbaum Gavish
* Samuli Rautava
* Seppo Salminen
* Erika Isolauri
* Or Yariv
* Yoav Peled
* Eran Poran
* Joseph Pardo
* Rony Chen
* Moshe Hod
* Ruth E. Ley
* Betty Schwartz
* Eran Hadar
* Yoram Louzoun
* Omry Koren

## Abstract

Gestational diabetes mellitus (GDM) is a condition in which non-diabetic women are diagnosed with glucose intolerance during pregnancy, typically in the second trimester. GDM can lead to a wide range of obstetrical and metabolic complications for both mother and neonate1. Early identification of GDM risk, along with a better understanding of its pathophysiology during the first trimester of pregnancy, may be effective in reducing GDM incidence, as well as its associated short and long term morbidities2. Here, we comprehensively profiled the gut microbiome, metabolome, inflammatory cytokines, nutrition and clinical records of 394 women during the first trimester of pregnancy. We found elevated levels of proinflammatory serum cytokines in those who later developed GDM. The women’s stool samples were also characterized by decreased levels of several fecal short-chain fatty acids and altered microbiome. We next tested the hypothesis that differences in GDM-associated microbial composition during the first trimester drove inflammation and insulin-resistance.
Stool samples collected early in pregnancy from women in three populations who did and did not later develop GDM were transplanted to germ-free mice, confirming that both inflammation and insulin resistance are induced by the microbiome of pregnant women more than 10 weeks prior to GDM diagnosis. Following these observations, we used a machine-learning approach to predict GDM based on first trimester clinical, microbial and inflammatory markers. Our model showed high predictive accuracy. Overall, our results suggest that the gut microbiome of women in the first trimester plays a remarkable role in inflammation-induced GDM pathogenesis and point to dozens of GDM markers during the first trimester of pregnancy, some of which may be targets for therapeutic intervention.

Keywords

* Microbiome
* gestational diabetes mellitus
* pregnancy
* germ-free mice

## Introduction

Gestational diabetes mellitus (GDM), or glucose intolerance in non-diabetic women during pregnancy, occurs when the pancreas cannot produce enough insulin to balance the insulin-inhibiting effects of placental hormones (viz. estrogen, cortisol, and human placental lactogen)3. Approximately 10% of pregnant women worldwide are diagnosed with GDM. Risk factors include ethnicity, maternal age, obesity, family history of diabetes and history of giving birth to large infants. Consequences of GDM include a wide range of obstetrical and metabolic complications for the mother (e.g. pre-eclampsia, type 2 diabetes, and cardiovascular diseases) and the neonate (mainly macrosomia and hypoglycemia)4. Fortunately, most of these complications are preventable when GDM is appropriately detected and managed and good glycemic control is achieved by nutrition, exercise and possibly insulin administration, along with heightened monitoring during labor and delivery5. As GDM incidence is increasing6–8, it is important to expand early-prediction efforts towards reducing negative consequences.
To date, few studies have examined biomarkers of GDM in the first trimester (T1)9,10. We hypothesized that precursors to GDM can be found in the gut microbiota of pregnant women. Indeed, previous studies have associated gut microbial dysbiosis with diabetes10,11 and a recent study has also associated gut dysbiosis with GDM in the third trimester12. Here, using a combination of omics tools, we find that women in T1, who later develop GDM, exhibit gut microbiota dysbiosis as well as increased proinflammatory serum cytokines and lower levels of fecal short-chain fatty acids (SCFAs). These markers can be used as parameters to predict GDM with an area under the curve value of 0.83. We anticipate that application of this algorithm will allow prediction, and possible prevention, of GDM with the ultimate goal of lowering its incidence. We also found that the T1 microbiome is responsible for some of the GDM phenotype features (insulin resistance and low grade inflammation) using a germ-free mouse model.

## Results & Discussion

### Cohort description and study design

We recruited 394 women during T1 (Fig. 1A; gestational age [weeks + days]: 11+0-13+6), and collected blood and stool samples, as well as relevant clinical information and routine pregnancy follow-up data. Stool samples were used to profile the gut microbiome composition, short-chain fatty acids and metabolome. Plasma cytokines and hormones were profiled using blood samples. Additionally, participants were interviewed by a dietitian to assess eating habits and were asked to respond to lifestyle and stress questionnaires. We quantified the explained variance between these different measurements using a Mantel test. The T1 gut microbiome significantly explained the variance of most measurements and was most tightly correlated with the fecal metabolomic profile (Fig. 1B).
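As a rough illustration of the Mantel test mentioned above, the sketch below correlates the upper triangles of two distance matrices and estimates a p-value by jointly permuting the rows and columns of one matrix. It is not the authors' code: the function names, the toy matrices, the permutation count and the `(hits + 1) / (n_perm + 1)` p-value convention are all assumptions made for the example. The explained variance is taken as the square of the Mantel statistic, as in the caption of Fig. 1B.

```python
import math
import random


def upper_triangle(matrix):
    """Flatten the strict upper triangle of a square distance matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def mantel_r2(dist_a, dist_b, n_perm=999, seed=0):
    """Return (explained variance r^2, permutation p-value)."""
    rng = random.Random(seed)
    n = len(dist_a)
    flat_a = upper_triangle(dist_a)
    observed = pearson(flat_a, upper_triangle(dist_b)) ** 2
    hits = 0
    for _ in range(n_perm):
        order = list(range(n))
        rng.shuffle(order)
        # Permute sample labels: rows and columns are reordered together.
        permuted = [[dist_b[order[i]][order[j]] for j in range(n)] for i in range(n)]
        if pearson(flat_a, upper_triangle(permuted)) ** 2 >= observed:
            hits += 1
    # Assumed convention: count the observed arrangement as one permutation.
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical toy data: six samples placed on a line, and a second
# "data type" whose distances are exactly twice the first.
coords = [0.0, 1.0, 3.0, 6.0, 10.0, 15.0]
dist_a = [[abs(a - b) for b in coords] for a in coords]
dist_b = [[2.0 * d for d in row] for row in dist_a]
r2, p_value = mantel_r2(dist_a, dist_b)
```

In the study, one matrix would hold unweighted UniFrac distances and the other Euclidean distances on another data type. The toy matrices here only demonstrate the mechanics.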
We also recorded participants’ glucose tolerance test (GTT) scores from the late second (T2) and early third (T3) trimesters of pregnancy, which were used for GDM diagnosis. Predictably, 44 women (11%) were diagnosed with GDM (hereafter “GDM group”); the other 350 women, with normal GTT results, served as the control group. We found that women diagnosed with GDM exhibited common risk factors (Table 1) such as higher maternal age and prepregnancy body mass index (BMI).

![Figure 1.](https://www.medrxiv.org/content/medrxiv/early/2021/09/27/2021.09.17.21262268/F1.medium.gif)

[Figure 1.](http://medrxiv.org/content/early/2021/09/27/2021.09.17.21262268/F1) First trimester blood and fecal biomarkers in women later diagnosed with GDM. (A) Sampling strategy and study design. Samples were collected in the first trimester of pregnancy. Stool was collected to profile gut microbiome, metabolome and SCFA and to validate results when transplanted into germ-free mice. Blood samples were used to profile cytokines and hormones. Life-style surveys and medical records were collected as well. (B) Variance explained (square of the Mantel statistic) between all pairs of data types (Mantel test). (C) Serum levels of cytokines and hormones for GDM and control women (FDR corrected Mann-Whitney U test). (D) Concentration of fecal short-chain fatty acids (FDR corrected Mann-Whitney U test). o p<0.1, \*p<0.05, \*\*p<0.01, \*\*\*p<0.001.

[Table 1:](http://medrxiv.org/content/early/2021/09/27/2021.09.17.21262268/T1) Cohort description

### GDM women exhibit elevated levels of serum inflammatory cytokines and low levels of SCFA in T1

Following the evidence of elevated inflammatory biomarkers in women diagnosed with GDM13, we profiled 10 plasma cytokines, chemokines and hormones in both the GDM and control groups. We found elevated levels of proinflammatory cytokines (IL-4, IL-6, IL-8, GM-CSF and TNF-α) among the GDM group (Fig.
1C; p<0.05, FDR corrected Mann-Whitney U test). Insulin resistance has been associated with elevated secretion of proinflammatory cytokines14 and indeed several studies demonstrated elevated levels of proinflammatory cytokines during T2 and T315–18. These altered cytokine profiles in GDM women several months prior to their diagnosis suggest that inflammation is associated with the pathogenesis of GDM. Another potential early biomarker for GDM is short-chain fatty acids (SCFAs), which contribute to the maintenance of glucose homeostasis and suppression of the inflammatory response. Hence, SCFAs are thought to play a role in obesity-induced inflammation leading to attenuation of insulin signaling and GDM19. We measured the concentration of 6 SCFAs in the stool of both GDM and control women using gas chromatography. We found a significant reduction of 2 branched SCFAs, isovalerate and isobutyrate, in stools from the GDM group (Fig. 1D; p<0.05 FDR-corrected Mann-Whitney U test) and a similar trend for valerate (p=0.09). Branched short-chain fatty acids (BSCFAs) are a product of bacterial fermentation of branched amino acids generated from undigested protein reaching the colon. BSCFAs, proposed markers for protein fermentation, were found to improve insulin sensitivity20,21 and reduce inflammation22. These findings, in line with several studies of later-stage pregnancy23 (but see24), suggest that fecal BSCFAs are a potential biomarker for GDM in early stages of pregnancy.

### Gut microbiome plays a causal role in GDM pathogenesis

Several studies have found altered gut microbiome composition in GDM women; most were based on samples collected post diagnosis25,26. We did not find differences in T1 α-diversity between the groups following microbiome characterization using 16S rRNA gene sequencing.
Principal coordinate analysis (PCoA) of unweighted UniFrac distances demonstrated that the microbial communities of healthy and GDM women trend toward significant differences (Fig 2A; p=0.06 PERMANOVA), supported by results of differential abundance analyses. We identified one bacterial species over-represented and 16 under-represented in the GDM group. As higher BMI and age are major risk factors for GDM and are widely associated with microbiome composition27,28, we repeated this analysis while controlling for these two confounding factors and found 15 under-represented species in the GDM group, only 6 of which intersect with the prior analysis (Fig 2B). Controlling for confounding variables allowed us to distinguish microbial species associated with GDM from those associated with its main risk factors. As an illustrative example, *Akkermansia muciniphila*, which is consistently negatively correlated with obesity29, is prima facie negatively associated with GDM when not controlling for the difference in BMI between the groups. In contrast, *Prevotella copri*, which is known to play a role in glucose homeostasis30 and has been reported to be more abundant in women diagnosed with GDM25,31, was found to be under-represented in GDM women after controlling for confounders. A recent study demonstrated that *Prevotella* was a marker of positive glucose metabolism32. Kovatcheva-Datchary *et al*. even showed in a clinical trial that *Prevotella* protected against *Bacteroides*-induced glucose intolerance and that improvement in glucose metabolism was associated with increased abundance of *Prevotella*33. This improvement in glucose metabolism in the presence of *Prevotella* was also demonstrated by supplementing mice with *P. copri*. One possible mechanism, recently proposed in rats, is that *P. copri* improves glucose homeostasis through farnesoid X receptor signalling and increased bile acid metabolism34.
Hence, the lower abundance of *Prevotella* in T1 samples of women who will develop GDM may be linked to the development of insulin resistance.

![Figure 2.](https://www.medrxiv.org/content/medrxiv/early/2021/09/27/2021.09.17.21262268/F2.medium.gif)

[Figure 2.](http://medrxiv.org/content/early/2021/09/27/2021.09.17.21262268/F2) Differences in fecal microbiome composition in the first trimester of pregnancy between women developing and not developing GDM. (A) Principal coordinates analysis based on 16S profiling of the microbiome using the unweighted UniFrac dissimilarity metric, colored by GDM/control (left; p=0.06 PERMANOVA), Shannon diversity (top right; R2=0.24 with PCo1) and two phyla that mostly explain the PCo1 and PCo2 variance: Fusobacteria (R2=0.08 with PCo2) and Deferribacteres (R2=0.3 with PCo2). (B) Cladogram representing the microbial features associated with the disease state at all taxonomic ranks, while controlling for the main risk factors, BMI and age. Spearman’s rank correlation is shown for each association; a positive association (all associations found) implies features over-represented in the healthy control group. Cladogram and bars are colored by phylum.

To confirm the causal role of the gut microbiome in the pathogenesis of GDM, fecal samples from BMI-matched GDM and control women were transplanted to germ-free female mice. We then profiled the gut microbiome of the mice seven days post transfer (Fig 3A) and found the microbial communities to be significantly different between the groups (Fig 3B). Consistent with our observation in women, *P. copri* was found to be reduced in GDM-recipient mice (Fig 3C). To characterize their glucose metabolism, we performed glucose tolerance tests 21 days post transfer. GDM-recipient mice exhibited impaired glucose tolerance (Fig. 3D).
Further, we profiled the mice’s plasma cytokines; similar to our observation in women, GDM-recipient mice exhibited elevated levels of both IL-6 and IL-10 relative to the control-recipient mice (Fig 3E). We confirmed the role of gut microbiota in GDM pathogenesis in two additional GDM cohorts (Finnish and American women) by conducting 6 similar germ-free transplantation experiments on 4 different mouse groups (Fig 3E; Supplementary Figure S1; Supplementary Tables S1-S3; p=0.23, 0.008, 0.15 and 0.25 for timepoints 0, 30, 60 and 120, respectively, Fisher’s method). Considering these observations, we conclude that gut microbes play a causal role in the development of some of the phenotypes of GDM in pregnant women during T1 and that their role is likely universal, as demonstrated by conservation across cohorts.

![Figure 3.](https://www.medrxiv.org/content/medrxiv/early/2021/09/27/2021.09.17.21262268/F3.medium.gif)

[Figure 3.](http://medrxiv.org/content/early/2021/09/27/2021.09.17.21262268/F3) First trimester fecal microbiota transplantation to germ-free mice confirms the causal role of the gut microbiome in the pathogenesis of GDM at early stages of pregnancy. (A) Study design. (B) Principal coordinates analysis using the unweighted UniFrac metric. Mice receiving transplants from GDM women exhibit different profiles from mice receiving transplants from the control group (p=0.005; PERMANOVA test). (C) *P. copri*, which was found to be negatively associated with GDM in women, is negatively associated with GDM-transplanted mice as well (p=0.04, linear mixed-effects model). (D) Intraperitoneal glucose tolerance test (ipGTT) shows impaired glucose sensitivity in mice receiving transplants from GDM women in this study and in the Finnish cohort (insert) (error bars represent ±SEM; \*p<0.05 one-tailed Mann-Whitney U test). (E) Serum levels of cytokines in transplanted mice (\*p<0.05 Mann-Whitney U test).
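The per-timepoint p-values for the additional cohorts were pooled across experiments with Fisher’s method. The sketch below shows how that combination works in general: the statistic is $-2\sum_i \ln p_i$, which follows a chi-square distribution with $2k$ degrees of freedom for $k$ independent tests. The function name and the input p-values are hypothetical; the individual per-experiment p-values are not listed in the text, so none are reproduced here.

```python
import math


def fisher_combined_p(p_values):
    """Combine independent p-values with Fisher's method.

    Returns (chi-square statistic, combined p-value). The survival function
    of a chi-square distribution with an even number of degrees of freedom
    (2k) has the closed form exp(-x/2) * sum_{i<k} (x/2)^i / i!.
    """
    k = len(p_values)
    statistic = -2.0 * sum(math.log(p) for p in p_values)
    half = statistic / 2.0
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return statistic, math.exp(-half) * total


# Hypothetical example: two experiments, each borderline on its own.
statistic, combined = fisher_combined_p([0.05, 0.05])
```

Two borderline results pointing the same way combine to a smaller p-value than either alone, which is why pooling several small transplantation experiments can reveal an effect that no single experiment shows clearly.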
### Lower levels of short peptides in stool of GDM women

We next compared stool metabolome profiles of women who would and would not later develop GDM in a BMI-matched subset of the cohorts (n=15 pairs). First, we found significant correlation between the microbiome and metabolome (r=0.26, p=0.02; Mantel test) of these samples. Although we were limited by the number of matched samples, by manual exploration of the data, we noticed that many short peptides had differential concentrations (p<0.05) between control and GDM women. Following a curation of all di- and tri-peptides, the vast majority of the peptides (50 out of 52) with significant differential concentrations showed a clear tendency of depletion in GDM women relative to healthy control women (Fig 4A,B). These peptides are enriched with the hydrophobic amino acids tyrosine, phenylalanine and alanine (p=8×10−4, 0.01 and 0.01, respectively, FDR corrected Fisher’s exact test; Fig 4C). Plasma levels of these three amino acids have also been reported to be significantly associated with diabetes35. One study found a link between elevated blood levels of these three amino acids and decreased insulin secretion35. Interestingly, another study published recently found elevated levels of alanine and tyrosine in maternal blood at 12-16 gestational weeks in women later diagnosed with GDM36, and alanine is also used by the liver for gluconeogenesis37. Increased amino acid levels in the blood may result in lower levels excreted in stool38.

![Figure 4.](https://www.medrxiv.org/content/medrxiv/early/2021/09/27/2021.09.17.21262268/F4.medium.gif)

[Figure 4.](http://medrxiv.org/content/early/2021/09/27/2021.09.17.21262268/F4) Analysis of T1 fecal metabolomics exhibits lower levels of dipeptides for GDM women. (A) Volcano plot of all metabolites examined in this study. Peptides are colored in red. (B) Heatmap of the 52 differential peptides.
Each row denotes a sample (grouped by disease state) and each column denotes a peptide. Z-scores were calculated per column. Peptides (columns) were hierarchically clustered based on Euclidean distance. (C) Amino acid composition of the differentially abundant peptides. Bars (left y-axis) represent the odds ratio for each amino acid and dots (right y-axis) represent the amino acid count in the differentially abundant peptides.

### Accurate prediction of GDM

Finally, we built a machine-learning model (XGBoost) to predict GDM based on microbiome composition, cytokine profile, medical history, and dietary features, all collected during T1 (Fig 1). For this aim, we also included T1 clinical data of 66 additional women, recruited in later stages of their pregnancy. Our XGBoost model predicts GDM with high accuracy (auROC=0.83; Fig. 5). When making predictions based on only a single feature type, we found the highest accuracy when using medical records alone (though still 7% lower than our combined model), in agreement with a recent study39. Fecal microbiome features resulted in the second highest accuracy (auROC=0.73). To better capture the contribution of microbial features, we used a two-stage approach (see Methods). First, we made predictions using clinical features alone (stage 1). Then we filtered out samples with the highest probability to develop GDM and predicted diagnosis for the remaining samples based on the microbiome features alone (stage 2). The odds ratio improved from 3.2 to 4 between the stages, demonstrating the potential for more accurate prediction using the fecal microbiome profile, which is especially relevant if medical records are incomplete or unavailable. On the other hand, 86 nutritional characteristics yielded relatively low predictive accuracy (auROC=0.64).
Further, no differences were found in dietary habits between the groups (Supplementary Table S4); thus, these findings suggest that differences in food consumption during T1 contribute minimally to the pathogenesis of GDM. To conclude, our combined model predicts GDM, with high accuracy, 10-15 weeks prior to the diagnosis.

![Figure 5.](https://www.medrxiv.org/content/medrxiv/early/2021/09/27/2021.09.17.21262268/F5.medium.gif)

[Figure 5.](http://medrxiv.org/content/early/2021/09/27/2021.09.17.21262268/F5) Highly accurate prediction of pre-GDM women during the first trimester of pregnancy. Area under the receiver operating characteristic curve (auROC) for each combination of features. Error bars represent ±SD.

## Conclusion

We found broad and consistent evidence that GDM pathology begins as early as T1 in a large cohort of pregnant women. Additionally, we successfully demonstrated that the precursors of GDM originate in the gut microbiota, evident from phenotypic transfer following fecal microbial transplant from three unique cohorts to germ-free mice. Our findings suggest that GDM is induced through heightened inflammation, initiated from microbial dysbiosis. Accordingly, addition of microbiome data to a machine-learning model improved our ability to predict GDM and can even serve as a stand-alone snapshot predictor. These results may be of use in the future when exploring preventive measures for GDM.

## Methods

### Patient recruitment

A total of 394 healthy pregnant women (ages 18-40) were recruited at 11+0-13+6 gestational weeks at women’s health centers of Clalit HMO (Dan Petach Tikva District, Israel) between the years 2016-17. Informed consent was obtained from all participants, in accordance with Clalit’s institutional review board approval No.0135-15-COM.
Exclusion criteria included: Type 1 or Type 2 diabetes mellitus diagnosed before pregnancy; IVF or hormonal therapy in the previous 3 months; use of antibiotics in the previous 3 months; and multiple gestation. Another 44 women were recruited at 24-28 gestational weeks after GDM diagnosis at Rabin Medical Center between the years 2016-17. Informed consent was obtained from all participants in accordance with Rabin Medical Center institutional review board approval No.0263-15-RMC. Inclusion and exclusion criteria were the same as for the main cohort.

### Sample collection

Fecal samples were collected from participants at 11-14 gestational weeks, close to recruitment, and frozen immediately at -80°C until processing. Blood was collected in EDTA tubes at recruitment. The tubes were centrifuged at 4°C for 10 min at 3600 rpm; plasma was then stored in coded 1.7 ml tubes at -80°C until processing.

### Medical records and questionnaire

Maternal demographics, clinical and obstetrical data, including pregnancy follow-up and comorbidities, were extracted from medical records. Fasting glucose, liver enzymes, and HbA1C were measured in blood, and weight and height were taken at the time of recruitment. Dietary consumption, collected by 24-hour recall, together with physical activity, sleeping hours and employment details were also recorded. Stress levels were assessed with a validated questionnaire40.

### Microbiome sequencing and analysis

DNA was extracted from all samples using the PowerSoil DNA extraction kit (MoBio, Carlsbad, CA, USA) according to the manufacturer’s instructions and following a 2-minute bead-beating step (BioSpec, Bartlesville, OK, USA). Purified DNA was used for PCR amplification of the variable V4 region using the 515F and 806R barcoded primers following the Earth Microbiome Project protocol41.
For each PCR reaction, the following materials were added: 25 µl (∼40 ng/µl) DNA (sample), 2 µl 515F (forward, 10 µM) primer, 2 µl 806R (reverse, 10 µM) primer, and 25 µl PrimeSTAR Max PCR Readymix (Takara, Mountain View, CA, USA). PCR reactions were carried out by 30 cycles of denaturation at 98°C for 10 seconds, annealing at 55°C for 5 seconds, extension at 72°C for 20 seconds and then a final elongation at 72°C for 1 minute. Amplicons were purified using AMPure magnetic beads (Beckman Coulter, Brea, CA, USA) and quantified using the Picogreen dsDNA quantitation kit (Invitrogen, Carlsbad, CA, USA). Then, equimolar amounts of DNA from individual samples were pooled and sequenced using the Illumina MiSeq platform and MiSeq Reagent Kit V2 (500 cycles) at the Genomic Center at the Bar-Ilan University Azrieli Faculty of Medicine, Israel.

Microbial diversity and composition were calculated using QIIME2 (version 2019.4)42. First, single-end sequences were imported (qiime import) and demultiplexed (qiime demux) with Golay error correction. Next, sequences were denoised using DADA2 (qiime dada2 denoise-single), trimming the first 5 bases and truncating each sequence at position 215. Feature tables and representative sequences from the different sequencing runs were then merged. A phylogenetic tree was constructed using the fragment-insertion method (qiime fragment-insertion sepp). Taxonomic classification was done using a naïve Bayes classifier trained on the 99% Greengenes 13_8 V4 reference set (qiime feature-classifier classify-sklearn). In order to remove low-confidence features, only features with frequency higher than 50 in at least 5 samples were kept. In addition, features that contained mitochondria or chloroplast sequences or that were not assigned to a phylum were filtered out. Diversity analysis was performed with a sampling depth of 8,000 sequences (qiime diversity core-metrics-phylogenetic).
Differences in alpha-diversity (Shannon’s diversity index) were tested using a Kruskal-Wallis test (implemented in qiime diversity alpha-group-significance). Unweighted UniFrac was used as a metric of paired distance between samples (beta-diversity), and the permutation-based PERMANOVA test was performed (qiime diversity beta-group-significance) to test whether samples within a group (GDM/control) are more similar to each other than they are to samples from the other group.

To associate microbial features with GDM, features were collapsed to the different taxonomic levels from phylum to species. Spearman rank correlations were used to identify associations between the disease state and each microbial feature at each taxonomic level. Disease state labels were shuffled 1000 times to obtain a background distribution, and only correlations with p<0.01 were preserved. To control for the main risk factors of GDM, age and BMI, we detected adjusted associations by building a linear model and performing Spearman rank correlations, as described, on the linear regression residuals.

Microbial diversity and composition for the fecal samples from transplanted mice were calculated as described above (with a sampling depth of 16,000 sequences). Microbial features were associated with GDM using MaAsLin2 to fit a per-feature linear mixed-effects model. Features were first log transformed and were subjected to CSS normalization. Disease state (GDM/control) and days-post-FMT were used as fixed effects while cage and donor were included as random effects following Eq. 1.

$$\text{feature} \sim \text{disease state} + \text{days post-FMT} + (1 \mid \text{cage}) + (1 \mid \text{donor})$$

### Cytokines and hormone measurement in plasma

Human cytokine (TNF-α, IFN-γ, GM-CSF, IL-2, IL-4, IL-6, IL-8 and IL-10) levels were measured in plasma using the Bio-Plex Pro Human Cytokine 8-Plex Panel (Bio-Rad Laboratories Inc., Irvine, CA, USA) according to the manufacturer’s instructions. The fluorescent signal was measured on a MagPix reader.
Analyte concentrations were calculated using standard curves in the Bio-Plex Manager Software (Bio-Rad). Values out of range (below/above) were imputed with the minimal/maximal in-range values, respectively.

### Short-chain fatty acids profiling

SCFA extraction and analysis was performed on 40 samples, as follows. An aliquot of 0.250 g of wet feces was thawed and suspended in 1 ml of an orthophosphoric acid solution (8% v/v) and kept at room temperature for 10 min with occasional shaking. The mixture was homogenized for 2 min, and the suspension was centrifuged at 4°C for 15 min at 14,000 rpm. The supernatant was filtered by additional centrifugation at 4°C for 15 min at 14,000 rpm. Next, 225 μl of the supernatant were transferred into a polypropylene tube, and 25 μl of 2-methyl-butyric-acid (Sigma-Aldrich (Merck), St. Louis, MO, USA) were added as an internal standard (IS) to a final concentration of 0.001 M and transferred to a chromatographic vial for gas chromatography analyses. The IS was used to correct for injection variability between samples and for minor changes in the instrument response. Vials were stored at -20°C before GC analysis. A standard mix (WSFA-4, Sigma-Aldrich, St. Louis, MO, USA) was used to determine the concentrations of propionic acid. Standard curves for acetic acid and butyric acid (Sigma-Aldrich, St. Louis, MO, USA) were prepared using stock solutions of both acids, separately. Gas chromatography analysis was then performed. Chromatographic analyses were carried out using the Agilent Technologies 6890, a GC system with a mass selective detector. A fused-silica capillary column with a free fatty acid phase (DB-FFAP 122-3232, 30 m × 0.25 mm × 0.25 μm) was used. The carrier gas was helium at a flow rate of 13.6 mL/min. The initial oven temperature was 70°C, raised to 100°C at a rate of 20°C/min, then raised to 180°C at 8°C/min and held for 3 min, before then being raised to 230°C at 20°C/min.
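The oven program above can be checked with simple arithmetic: each ramp lasts (end temperature − start temperature) / rate, and the holds are added on top. The short sketch below uses hypothetical names and assumes no initial or final hold beyond the 3 min hold stated in the text. Under that assumption the segments sum to the 17 min run time reported for a single analysis.

```python
def ramp_minutes(start_c, end_c, rate_c_per_min):
    """Duration of a linear temperature ramp, in minutes."""
    return (end_c - start_c) / rate_c_per_min


# Segments of the oven program as described in the text.
segments = [
    ramp_minutes(70, 100, 20),   # 70 -> 100 degC at 20 degC/min
    ramp_minutes(100, 180, 8),   # 100 -> 180 degC at 8 degC/min
    3.0,                         # hold at 180 degC
    ramp_minutes(180, 230, 20),  # 180 -> 230 degC at 20 degC/min
]
total_minutes = sum(segments)
```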
The injection volume was 1 μL and the run time of a single analysis was 17 min. The SCFA analysis was carried out at MIGAL Galilee Research Institute, Israel.

### Untargeted metabolomics

In addition to short-chain fatty acid extraction, untargeted metabolomics was performed on fecal samples from 15 women who would go on to develop GDM and 15 who would not, with similar age and BMI. The metabolomics analysis was performed at MIGAL Galilee Research Institute, Israel and Tel Hai College, Israel. Fecal samples were extracted using methanol (0.333 mg/ml of MeOH), vortexed, and centrifuged. The supernatant was collected and filtered before injection to the LC-MS/MS instrument. A pooled matrix prepared by mixing a small volume (20 µl) of each experimental sample was used as a quality control (QC) for batch normalization and compound identification. The samples were injected (5 μL) into a UHPLC connected to a photodiode array detector (Dionex Ultimate 3000, Thermo Fisher Scientific, Sunnyvale, CA, USA), with a reverse-phase column (ZORBAX Eclipse Plus C18; Agilent, Santa Clara, CA, USA; 100 × 3.0 mm; 1.8 μm). The mobile phase consisted of (A) DDW with 0.1% formic acid and (B) acetonitrile containing 0.1% formic acid. The gradient was initiated with 2% B, which was increased to 30% B over 4 min, and then increased to 40% B over 1 min before being kept isocratic at 40% B for another 3 min. Then, the gradient increased to 50% B over 6 min, to 55% B over another 4 min and to 95% B over 5 min, and was kept isocratic for 7 min. Finally, phase B was returned to 2% over 3 min and the column was allowed to equilibrate at 2% B for 3 min before the next injection. The flow rate was 0.4 mL/min. Blank (methanol) and QC samples were injected at the start of the sequence, after each 10 samples, and at the end of the sequence.
LC–MS/MS analysis was performed with a Heated Electrospray ionization (HESI-II) source connected to a Q Exactive™ Plus Hybrid Quadrupole-Orbitrap™ Mass Spectrometer, Thermo Scientific™, Germany. ESI capillary voltage was set to 3500 V, capillary temperature to 300°C, gas temperature to 350°C and gas flow to 10 mL/min. The mass spectra (m/z 100–1500) were acquired using both positive and negative ion modes. Data-dependent MS2 analysis was generated for the QC samples and used for compound identification. Downstream analysis and data processing were performed with the Thermo Scientific™ Compound Discoverer™ program, version 3.1.0.305 (mass tolerance ≤ 5 ppm; intensity tolerance ≤ 30%; S/N threshold = 3; minimum peak intensity = 1000000; RT tolerance ≤ 0.2 min). Databases used for identification were ChemSpider43, mzCloud44 and KEGG45. Differential abundances of the metabolites between the groups were identified by log transformation of the peak areas followed by Student’s t-tests and FDR correction. Short peptides were manually curated using the metabolite name and using a list of dipeptides downloaded from the PubChem database46. Enrichment of amino acids was calculated using Fisher’s exact test with the following contingency table groups: amino acid of interest, all other amino acids, peptides enriched in GDM, peptides not enriched in GDM.

### Fecal transplantation to germ-free mice

We used the model of fecal gut microbiome transplants to germ-free mice as conducted previously47–49. All experiments involving mice were performed using protocols approved by the local animal ethics committee at Bar-Ilan University (number 33-04-2018). Briefly, eight-week-old germ-free (GF) female Swiss Webster mice were maintained in isolators under a strict 12h light:12h dark cycle with estrous cycles synchronized to minimize hormonal variation among mice. Mice were fed an autoclaved chow diet (Harlan-Teklad, Madison, WI) *ad libitum*.
Stool samples from T1 healthy pregnant women and from women who were later diagnosed with GDM were selected based on age- and BMI-matching without *a priori* knowledge of bacterial diversity. Each fecal sample (0.1 g) was suspended in 1.5 ml of reduced sterile PBS, vortexed for 5 min and left for 5 min to allow larger particles to settle to the bottom of the tube. Handling of human fecal samples was performed under anaerobic conditions. Mice were divided into six groups with equal weights and then immediately gavaged with 200 μl of fecal slurries from the 6 study groups. Mice were then placed in ventilated cages, 3-4 mice per cage, and followed for 4 weeks. Body weight and chow consumption were monitored weekly. Fecal pellets were collected at days 7, 14 and 21, snap-frozen in liquid N2 and stored at -80°C for analysis of microbial communities. On day 21, an intraperitoneal glucose tolerance test (ipGTT) was performed by an injection of 2 g/kg body weight glucose after an 8 h fast. Tail blood samples were collected at 0, 15, 30, 60, 90, and 120 minutes and blood glucose levels determined. On day 29, mice were sacrificed after collecting blood samples and ceca. Blood was utilized for assessment of cytokines and hormones, as described above.

#### FMT for additional cohorts

Finnish cohort, PGD1 study: stool samples were obtained from T1 for 6 women diagnosed with GDM matched to 6 healthy controls obtained from Finland50. Twelve 6-8-week-old female GF Swiss Webster mice were gavaged with stool sample slurries prepared under anoxic conditions as previously described. An oral glucose tolerance test was administered 12 days post inoculation. Glucose dose was 2 g/kg; readings were at 0, 30, 60 and 120 minutes, via ACCU CHEK Compact Plus (Roche, Mannheim, Germany). PGD2: repeat of PGD1 with 12 mice aged 6-8 weeks and 12 mice aged 11-13 weeks (total of 48). Oral GTT was performed on Day 19. US-American cohort: the women were selected from the STORK study51.
PGD3: this experiment used 8-week-old and 12-week-old mice. As above, each donor sample was inoculated into one 8-week-old and one 12-week-old mouse. An oral glucose tolerance test was administered 12 days post inoculation. The glucose dose was 2 g/kg, and readings were taken at 0, 30, 60 and 120 minutes, as above. PGD4: this experiment was a repeat of PGD3 using 6-week-old mice. An oral glucose tolerance test was administered 12 days post inoculation. The glucose dose was 2 g/kg, and readings were taken at 0, 30, 60 and 120 minutes, as above.

### Statistical analysis

Unless otherwise specified, statistical analysis was done using non-parametric Mann-Whitney U tests followed by FDR correction, as implemented in the SciPy stats library for Python52. Associations between features and the microbiome were assessed with Mantel's correlation, using unweighted UniFrac distance as the metric for microbial dissimilarity. For all other features, data were log-transformed and min-max normalized, and Euclidean distance was used as the distance metric. Labels were permuted 9,999 times; the p-value was calculated as the proportion of these permutations that led to a higher explained variance.

### Prediction

To predict GDM based on T1 information, we tested all combinations of the following components: 1) cytokines, 2) microbiome, 3) general clinical information and 4) food questionnaires, resulting in 16 possible combinations. The accuracy of the prediction was assessed using the Area Under Curve (AUC) of the test set, with a 20%/80% test/training split and five-fold cross-validation. Microbiome data were merged to a genus-level representation and log-transformed using the standard parameters of the MIPMLP pipeline53. For the other components, all non-numerical values were replaced by a one-hot representation. All missing values were replaced by the median value of the same category. All values were z-scored to a mean of 0 and a standard deviation of 1.
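The component combinations and the tabular preprocessing described above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the authors' code: the component labels, the toy values and the helper names (`component_subsets`, `one_hot`, `impute_median`, `zscore`) are hypothetical, and the use of the population standard deviation for z-scoring is an assumption, since the text does not say which form was used. Note that four components give 16 subsets only when the empty subset is counted.

```python
from itertools import combinations
from statistics import mean, median, pstdev

# Labels for the four components named in the text (illustrative only).
COMPONENTS = ("cytokines", "microbiome", "clinical", "food_questionnaire")


def component_subsets(components):
    """Return every subset of the components, including the empty one."""
    subsets = []
    for size in range(len(components) + 1):
        subsets.extend(combinations(components, size))
    return subsets


def one_hot(labels):
    """Encode non-numerical labels as one 0/1 column per distinct label."""
    categories = sorted(set(labels))
    return {c: [1 if label == c else 0 for label in labels] for c in categories}


def impute_median(values):
    """Replace missing entries (None) with the median of the observed ones."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def zscore(values):
    """Scale to mean 0 and standard deviation 1 (population form, assumed)."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant feature carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


if __name__ == "__main__":
    print(len(component_subsets(COMPONENTS)))  # 2**4 subsets, empty one included

    toy_feature = [1.0, None, 3.0, 5.0]  # made-up values with one missing entry
    print(zscore(impute_median(toy_feature)))
    print(one_hot(["yes", "no", "yes"]))
```

Imputing before z-scoring, as done here, keeps the filled-in values at the feature median rather than at an arbitrary point on the standardized scale; the order of these two steps in the original analysis is inferred from the order in which the text lists them.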
We used a binary XGBoost classifier with a learning rate of 0.001, 200 estimators, the gblinear booster, a logistic loss function, a lambda regularization of 0.01 and a gamma regularization of 0.1, using the XGBClassifier class of the xgboost Python package. All other parameters were left at their defaults. The binary outcome was whether the woman eventually developed GDM. We limited the external feature analysis to features that were informative on the training set in the first cross-validation (p-value < 0.1 for Pearson correlation with the outcome in the training set). The resulting features used were:

![Formula][2]

When combining different types of inputs for the classification, the inputs were concatenated.

## Supporting information

Figure S1 [[supplements/262268_file02.pdf]](pending:yes)

Table S4 [[supplements/262268_file03.xlsx]](pending:yes)

Table S1 [[supplements/262268_file04.xlsx]](pending:yes)

Table S2 [[supplements/262268_file05.xlsx]](pending:yes)

Table S3 [[supplements/262268_file06.xlsx]](pending:yes)

## Data Availability

Data will be made public once the paper is accepted.

* Received September 17, 2021.
* Revision received September 17, 2021.
* Accepted September 27, 2021.
* © 2021, Posted by Cold Spring Harbor Laboratory. The copyright holder for this pre-print is the author. All rights reserved. The material may not be redistributed, re-used or adapted without the author's permission.

## References

1. Johns, E. C., Denison, F. C., Norman, J. E. & Reynolds, R. M. Gestational Diabetes Mellitus: Mechanisms, Treatment, and Complications. Trends Endocrinol. Metab. 29, 743–754 (2018).
2. Immanuel, J. & Simmons, D. Screening and Treatment for Early-Onset Gestational Diabetes Mellitus: a Systematic Review and Meta-analysis. Curr. Diab. Rep. 17, 115 (2017).
3. Baz, B., Riveline, J.-P. & Gautier, J.-F.
ENDOCRINOLOGY OF PREGNANCY: Gestational diabetes mellitus: definition, aetiological and clinical aspects. Eur. J. Endocrinol. 174, R43–51 (2016).
4. Plows, J. F., Stanley, J. L., Baker, P. N., Reynolds, C. M. & Vickers, M. H. The Pathophysiology of Gestational Diabetes Mellitus. Int. J. Mol. Sci. 19, (2018).
5. Lende, M. & Rijhsinghani, A. Gestational Diabetes: Overview with Emphasis on Medical Management. Int. J. Environ. Res. Public Health 17, (2020).
6. Zhu, Y. & Zhang, C. Prevalence of Gestational Diabetes and Risk of Progression to Type 2 Diabetes: a Global Perspective. Curr. Diab. Rep. 16, 7 (2016).
7. Eades, C. E., Cameron, D. M. & Evans, J. M. M. Prevalence of gestational diabetes mellitus in Europe: A meta-analysis. Diabetes Res. Clin. Pract. 129, 173–181 (2017).
8. Casagrande, S. S., Linder, B. & Cowie, C. C. Prevalence of gestational diabetes and subsequent Type 2 diabetes among U.S. women. Diabetes Res. Clin. Pract. 141, 200–208 (2018).
9. Tenenbaum-Gavish, K. et al. First trimester biomarkers for prediction of gestational diabetes mellitus. Placenta 101, 80–89 (2020).
10. Ma, S. et al. Alterations in Gut Microbiota of Gestational Diabetes Patients During the First Trimester of Pregnancy. Front. Cell. Infect. Microbiol. 10, 58 (2020).
11. Qin, J. et al.
A metagenome-wide association study of gut microbiota in type 2 diabetes. Nature 490, 55–60 (2012).
12. Crusell, M. K. W. et al. Gestational diabetes is associated with change in the gut microbiota composition in third trimester of pregnancy and postpartum. Microbiome 6, 89 (2018).
13. Chen, X., Stein, T. P., Steer, R. A. & Scholl, T. O. Individual free fatty acids have unique associations with inflammatory biomarkers, insulin resistance and insulin secretion in healthy and gestational diabetic pregnant women. BMJ Open Diabetes Res Care 7, e000632 (2019).
14. Esser, N., Legrand-Poels, S., Piette, J., Scheen, A. J. & Paquot, N. Inflammation as a link between obesity, metabolic syndrome and type 2 diabetes. Diabetes Res. Clin. Pract. 105, 141–150 (2014).
15. Ross, K. M. et al. Patterns of peripheral cytokine expression during pregnancy in two cohorts and associations with inflammatory markers in cord blood. Am. J. Reprod. Immunol. 76, 406–414 (2016).
16. Denney, J. M. et al. Cytokine profiling: variation in immune modulation with preterm birth vs. uncomplicated term birth identifies pivotal signals in pathogenesis of preterm birth. J. Perinat. Med. 49, 299–309 (2021).
17. Žák, P. & Souček, M. Correlation of tumor necrosis factor alpha, interleukin 6 and interleukin 10 with blood pressure, risk of preeclampsia and low birth weight in gestational diabetes. Physiol. Res. 68, 395–408 (2019).
18. Sudharshana Murthy, K. A., Bhandiwada, A., Chandan, S. L., Gowda, S. L. & Sindhusree, G. Evaluation of Oxidative Stress and Proinflammatory Cytokines in Gestational Diabetes Mellitus and Their Correlation with Pregnancy Outcome. Indian J. Endocrinol. Metab. 22, 79–84 (2018).
19. Chambers, E. S., Preston, T., Frost, G. & Morrison, D. J. Role of Gut Microbiota-Generated Short-Chain Fatty Acids in Metabolic and Cardiovascular Health. Curr. Nutr. Rep. 7, 198–206 (2018).
20. Heimann, E., Nyman, M., Pålbrink, A.-K., Lindkvist-Petersson, K. & Degerman, E. Branched short-chain fatty acids modulate glucose and lipid metabolism in primary adipocytes. Adipocyte 5, 359–368 (2016).
21. Haufe, S. et al.
Branched-chain amino acid catabolism rather than amino acids plasma concentrations is associated with diet-induced changes in insulin resistance in overweight to obese individuals. Nutr. Metab. Cardiovasc. Dis. 27, 858–864 (2017).
22. Yao, C. K., Muir, J. G. & Gibson, P. R. Review article: insights into colonic protein fermentation, its modulation and potential health implications. Aliment. Pharmacol. Ther. 43, 181–196 (2016).
23. Chorell, E. et al. Pregnancy to postpartum transition of serum metabolites in women with gestational diabetes. Metabolism 72, 27–36 (2017).
24. Pappa, K. I. et al. Intermediate metabolism in association with the amino acid profile during the third trimester of normal pregnancy and diet-controlled gestational diabetes. Am. J. Obstet. Gynecol. 196, 65.e1–5 (2007).
25. Ponzo, V. et al. Diet-Gut Microbiota Interactions and Gestational Diabetes Mellitus (GDM). Nutrients 11, (2019).
26. Ferrocino, I. et al. Changes in the gut microbiota composition during pregnancy in patients with gestational diabetes mellitus (GDM). Sci. Rep. 8, 12216 (2018).
27. Stanislawski, M. A., Dabelea, D., Lange, L. A., Wagner, B. D. & Lozupone, C. A. Gut microbiota phenotypes of obesity. NPJ Biofilms Microbiomes 5, 18 (2019).
28. O’Toole, P. W. & Jeffery, I. B. Gut microbiota and aging. Science 350, 1214–1215 (2015).
29. Xu, Y. et al. Function of Akkermansia muciniphila in Obesity: Interactions With Lipid Metabolism, Immune Response and Gut Systems. Front. Microbiol. 11, 219 (2020).
30. Cani, P. D. Human gut microbiome: hopes, threats and promises. Gut 67, 1716–1725 (2018).
31. Hasain, Z. et al. Gut Microbiota and Gestational Diabetes Mellitus: A Review of Host-Gut Microbiota Interactions and Their Therapeutic Potential. Front. Cell. Infect. Microbiol. 10, 188 (2020).
32. Asnicar, F. et al. Microbiome connections with host metabolism and habitual diet from 1,098 deeply phenotyped individuals. Nat. Med. 27, 321–332 (2021).
33. Kovatcheva-Datchary, P. et al. Dietary Fiber-Induced Improvement in Glucose Metabolism Is Associated with Increased Abundance of Prevotella. Cell Metab. 22, 971–982 (2015).
34. Péan, N. et al. Dominant gut Prevotella copri in gastrectomised non-obese diabetic Goto-Kakizaki rats improves glucose homeostasis through enhanced FXR signalling. Diabetologia 63, 1223–1235 (2020).
35. Vangipurapu, J., Stancáková, A., Smith, U., Kuusisto, J. & Laakso, M. Nine Amino Acids Are Associated With Decreased Insulin Secretion and Elevated Glucose Levels in a 7.4-Year Follow-up Study of 5,181 Finnish Men. Diabetes 68, 1353–1358 (2019).
36. Jiang, R. et al. Amino acids levels in early pregnancy predict subsequent gestational diabetes. J. Diabetes 12, 503–511 (2020).
37. Bloomgarden, Z. Diabetes and branched-chain amino acids: What is the link? J. Diabetes 10, 350–352 (2018).
38. West, K. A. et al. Longitudinal metabolic and gut bacterial profiling of pregnant women with previous bariatric surgery. Gut 69, 1452–1459 (2020).
39. Artzi, N. S. et al. Prediction of gestational diabetes based on nationwide electronic health records. Nat. Med. 26, 71–76 (2020).
40. Cohen, S., Kamarck, T. & Mermelstein, R. A global measure of perceived stress. J. Health Soc. Behav. 24, 385–396 (1983).
41. Consortium, T. H. M. P.
& The Human Microbiome Project Consortium. Structure, function and diversity of the healthy human microbiome. Nature 486, 207–214 (2012).
42. Bolyen, E. et al. Reproducible, interactive, scalable and extensible microbiome data science using QIIME 2. Nat. Biotechnol. 37, 852–857 (2019).
43. Pence, H. E. & Williams, A. ChemSpider: An Online Chemical Information Resource. J. Chem. Educ. 87, 1123–1124 (2010).
44. mzCloud – Advanced Mass Spectral Database. [https://www.mzcloud.org/](https://www.mzcloud.org/).
45. Kanehisa, M. & Goto, S. KEGG: kyoto encyclopedia of genes and genomes. Nucleic Acids Res. 28, 27–30 (2000).
46. Kim, S. et al. PubChem in 2021: new data content and improved web interfaces. Nucleic Acids Res. 49, D1388–D1395 (2021).
47. Uzan-Yulzari, A. et al. Neonatal antibiotic exposure impairs child growth during the first six years of life by perturbing intestinal microbial colonization. Nat. Commun. 12, 443 (2021).
48. Binyamin, D. et al. The aging mouse microbiome has obesogenic characteristics. Genome Med. 12, 87 (2020).
49. Uzan-Yulzari, A. et al. The intestinal microbiome, weight, and metabolic changes in women treated by adjuvant chemotherapy for breast and gynecological malignancies. BMC Med. 18, 281 (2020).
50. Koren, O. et al. Host remodeling of the gut microbiome and metabolic changes during pregnancy. Cell 150, 470–480 (2012).
51. Ley, C. et al. Stanford’s Outcomes Research in Kids (STORK): a prospective study of healthy pregnant women and their babies in Northern California. BMJ Open 6, e010810 (2016).
52. Virtanen, P. et al. SciPy 1.0: fundamental algorithms for scientific computing in Python. Nat. Methods 17, 261–272 (2020).
53. Jasner, Y., Belogolovski, A., Ben-Itzhak, M., Koren, O. & Louzoun, Y. Microbiome Preprocessing Machine Learning Pipeline. Front. Immunol. 12, 677870 (2021).

[1]: /embed/graphic-7.gif
[2]: /embed/graphic-8.gif