Integrating endometrial proteomic and single cell transcriptomic pipelines reveals distinct menstrual cycle and endometriosis-associated molecular profiles

Endometriosis is a debilitating gynecological disorder affecting approximately 10% of the female population. Despite its prevalence, robust methods to classify and treat endometriosis remain elusive. Changes throughout the menstrual cycle in tissue size, architecture, cellular composition, and individual cell phenotypes make it extraordinarily challenging to identify markers or cell types associated with uterine pathologies since disease-state alterations in gene and protein expression are convoluted with cycle phase variations. Here, we developed an integrated workflow to generate both proteomic and single-cell RNA-sequencing (scRNA-seq) data sets using tissues and cells isolated from the uteri of control and endometriotic donors. Using a linear mixed effect model (LMM), we identified proteins associated with cycle stage and disease, revealing a set of genes that drive separation across these two biological variables. Further, we analyzed our scRNA-seq data to identify cell types expressing cycle and disease-associated genes identified in our proteomic data. A module scoring approach was used to identify cell types driving the enrichment of certain biological pathways, revealing several pathways of interest across different cell subpopulations. Finally, we identified ligand-receptor pairs including Axl/Tyro3 - Gas6, that may modulate interactions between endometrial macrophages and/or endometrial stromal/epithelial cells. Analysis of these signaling pathways in an independent cohort of endometrial biopsies revealed a significant decrease in Tyro3 expression in patients with endometriosis compared to controls, both transcriptionally and through histological staining. This measured decrease in Tryo3 in patients with disease could serve as a novel diagnostic biomarker or treatment avenue for patients with endometriosis. Taken together, this integrated approach provides a framework for integrating LMMs, proteomic and RNA-seq data to deconvolve the complexities of complex uterine diseases and identify novel genes and pathways underlying endometriosis.


8
Highlights 1 9 • Leverages proteomic data to interpret and direct single-cell RNA sequencing (scRNA-2 0 seq) analysis 2 1 • Demonstrates successful use of linear mixed effects models to attribute protein 2 2 expression variance to either menstrual cycle phase or disease state 2 3 • Pathway analysis of disease state proteins guides identification of disease-relevant cell 2 4 types present in scRNA-seq data, providing foundation for mining those data for 2 5 receptor-ligand interactions of possible disease relevance 2 6 • A new potential non-hormonal target in endometriosis, TYRO3, emerges from confirming 2 7 predictions of the receptor-ligand model with transcriptomic and immunohistochemistry 2 8 analysis of an independent panel of endometrial biopsies 2 9 3 0 3 1 Endometriosis is a debilitating gynecological disorder affecting approximately 10% of the 3 3 female population. Despite its prevalence, robust methods to classify and treat endometriosis 3 4 remain elusive. Changes throughout the menstrual cycle in tissue size, architecture, cellular 3 5 composition, and individual cell phenotypes make it extraordinarily challenging to identify 3 6 markers or cell types associated with uterine pathologies since disease-state alterations in gene 3 7 and protein expression are convoluted with cycle phase variations. Here, we developed an 3 8 integrated workflow to generate both proteomic and single-cell RNA-sequencing (scRNA-seq) 3 9 data sets using tissues and cells isolated from the uteri of control and endometriotic donors. 4 0 Using a linear mixed effect model (LMM), we identified proteins associated with cycle stage and 4 1 disease, revealing a set of genes that drive separation across these two biological variables. 4 2 Further, we analyzed our scRNA-seq data to identify cell types expressing cycle and disease-4 3 associated genes identified in our proteomic data. A module scoring approach was used to 4 4 identify cell types driving the enrichment of certain biological pathways, revealing several 4 5 pathways of interest across different cell subpopulations. Finally, we identified ligand-receptor 4 6 pairs including Axl/Tyro3 -Gas6, that may modulate interactions between endometrial 4 7 macrophages and/or endometrial stromal/epithelial cells. Analysis of these signaling pathways 4 8 in an independent cohort of endometrial biopsies revealed a significant decrease in Tyro3 CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted February 1, 2022. ;https://doi.org/10.1101https://doi.org/10. /2022 About once a month, in the absence of pregnancy, the human uterus completes a 1 0 1 menstrual cycle; a scarless wound healing processes in which the lining of the endometrium 1 0 2 grows an average of 1-2 centimeters in thickness before being shed during menstruation. 1 0 3 The cycle is typically broken into three phases involving the relatively stable stratum basalis 1 0 4 and the outermost layer, the stratum functionalis, which is the region of the endometrium that 1 0 5 undergoes the majority of the growth. The basalis and its resident cell populations support 1 0 6 the fuctionalis as cells proliferate during the estrogen driven proliferative phase. The rise in 1 0 7 progesterone following ovulation drives decidualization of the stromal fibroblasts, with an 1 0 8 increase in extracellular matrix (ECM) production, during the secretory phase (Chan et al., 1 0 9 2004). Differences in cellular composition, RNA expression, and cellular response to 1 1 0 stimulation in the menstrual effluent of women with endometriosis suggest that one, if not all, 1 1 1 cycle phases may be aberrant with pathology (Warren et al., 2018), but nonlinear interactions 1 1 2 throughout the menstrual cycle make it difficult to tease apart the variables . 1 1 3 The use of high-throughput biochemical assays has led to discoveries of potential 1 1 4 biomarkers for endometriosis. Proteomic datasets have been used to identify at least 20 1 1 5 proteins that are aberrantly expressed in endometriosis patients (Ferrero, 2019). These 1 1 6 proteins suggest dysregulation of the immune response as a major manifestation of 1 1 7 endometriosis and hold promise for diagnostic development, however, these methods have 1 1 8 not been implemented for clinical applications. Recently, transcriptomic analysis at single-cell 1 1 9 resolution has opened up avenues for new and data rich methods of analysis that can 1 2 0 classify cell populations, track cell states via lineage tracing, quantify cellular heterogeneity, 1 2 1 and even infer cell-to-cell interactions to better understand disease pathology (Blencowe et  1  2  2 al., 2019). However, with this wealth of information, it can be challenging to parse which cell 1 2 3 types and transcriptomic profiles are significantly associated with disease pathogenesis, 1 2 4 especially given the multiple variables (e.g., cycle phase of donor tissue, as we investigate 1 2 5 here) in addition to the presence, or not, of endometriosis. Cycle phase is also difficult to 1 2 6 control experimentally, as surgery is scheduled for availability of patient and surgeon, and 1 2 7 attempts to pinpoint a specific cycle day can be complicated by variable cycle lengths in the 1 2 8 patient, and cycles that do not conform to the idealized 28-day prototype. 1 2 9 To this end, the goal of this study was to deploy patient proteomic data, which 1 3 0 functionally integrates transcriptional-level information, to direct the analysis and 1 3 1 interpretation of single-cell transcriptomic data. For each protein, we constructed a set of 1 3 2 linear mixed effect models (LMMs) to determine whether the expression was influenced by 1 3 3 either the patient cycle phase or disease state. Next, we used pathway analysis on disease-1 3 4 relevant proteins to inform analysis of single-cell transcriptomic data, resulting in identification 1 3 5 of cell types most relevant to the disease state. Then, guided by the known cell-cell 1 3 6 communication networks in the endometrium, we performed directed cell-to-cell 1 3 7 communication analysis with the cell types we identified by scRNA-seq analysis, seeking 1 3 8 receptor to ligand interactions of possible relevance for disease phenotypes, and then 1 3 9 confirmed findings on an independent sample cohort using immunohistochemistry. Linear mixed modeling (LMM) was used on the list of proteins discovered using isobaric 2 3 5 tagging with LC/MS-MS. A total of 668 overlapping proteins were found and for each protein 2 3 6 (Supplemental Table 3), a null model and an alternative model was built using the R package 2 3 7 "lmer". Protein expression was modeled by assuming that an individual protein response 2 3 8 consisted of patient identity (Patient.ID) as a random variable and two vectors of fixed effect 2 3 9 variables: DiseaseState -having endometriosis or not -or cyclePhase -the patient falling 2 4 0 into either the proliferative or secretory phase. The two separate null models were built to 2 4 1 assess the impact of cycle phase or disease state. Alignment and cell identification. Raw sequencing data was demultiplexed and 2 7 8 aligned to the Hg19 genome using publicly available scripts on Terra 2 7 9 (scCloud/dropseq_workflow, version 11, https://app.terra.bio/). Raw sequencing data will be 2 8 0 made available for download concurrently with publication on the Gene Expression Omnibus 2 8 1 database. The resulting UMI-collapsed digital gene expression matrices were used as input 2 8 2 to Seurat (v3) for further analyses in R (v3.6). All data were normalized, the top 2,000 2 8 3 variable genes identified. Initial clustering was performed and cell types were assigned on 2 8 4 the basis of marker genes identified using Seurat's Wilcoxon rank-sum test and comparison 2 8 5 to those in the existing literature (Supplemental Table 6). After initial analyses, doublets were 2 8 6 removed using DoubletFinder following default settings (McGinnis et al., 2019), cells were re-2 8 7 clustered, and cell types were finalized based on final cluster markers (Supplemental Table  2  8  8 6). We also iteratively sub-clustered immune cells in order to refine cell type identification on 2 8 9 the basis of a new set of marker genes that was distinct from those driven by epithelial, 2 9 0 stromal, and immune cell global differences. All scripts to reproduce analyses are available 2 9 1 for download (https://github.com/bagoods). 2 9 2 MAST modeling. In order to identify differentially expressed genes in the major cell 2 9 3 populations that were driven by either cycle stage or disease, we used MAST modeling 2 9 4 (Finak et al., 2015), as this approach performs well for complex technical variables with fast 2 9 5 runtimes (Luecken and Theis, 2019). We performed differential expression on epithelial cells 2 9 6 and stromal cells separately with the following model: zlm(~CycleStage + DiseaseStatus) 2 9 7 (Supplemental Table 7 and 8). Next, for each cell type we filtered for significant genes on the 2 9 8 basis of >0.3 fractional expression and padj < 0.05 for either CycleStage (secretory vs 2 9 9 proliferative) or DiseaseStatus (endometriosis vs healthy). MAST also produces regression 3 0 0 coefficients that indicate the direction and relative magnitude of fits of the hurdle model for 3 0 1 each gene. To simultaneously explore the relationship on a gene-gene level, we used 3 0 2 ggpubr() to create scatter plots of these coefficients. All scripts to reproduce analyses are 3 0 3 available for download (https://github.com/bagoods). 3 0 4 NicheNet modeling. scRNA-seq expression data was sub-set by each patient then 3 0 5 saved as separate R objects. NicheNet (Browaeys et al., 2020) analyses were performed 3 0 6 using standard workflows on patients 259, 260, 270 and 271 since these patients had 3 0 7 enough macrophages detected to run this type of analysis. NicheNet analysis was run in two 3 0 8 modes for each patient separately. First, we wanted to determine which signals from 3 0 9 macrophages were impacting variable gene expression in either epithelial cells or stromal 3 1 0 . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) 1 1 cells. NicheNet's ligand-target prior model, ligand-receptor network, and weighted integrated 3 1 1 networks were imported from a publicly available database (https://zenodo.org). The gene set 3 1 2 of interest was defined by normalizing the expressed genes by the assigned receiver cell 3 1 3 type then finding the most variable features. For each patient, variable genes were identified 3 1 4 with Seurat within the largest cluster of either stromal or epithelial cells. After defining a set of 3 1 5 potential ligands using the imported network data, they were ranked by the presence of 3 1 6 genes downstream of their target receptors in the receiver epithelial or stromal cells. These 3 1 7 top-ranking ligands were then analyzed by inferring receptors and target genes to produce a 3 1 8 heatmap to show the regulatory potential between prioritized ligands and their predicted 3 1 9 target genes/receptors. All scripts to reproduce these analyses are available for download 3 2 0 (https://github.com/bagoods). 3 2 1 3 2 2 2.7. Tissue Sectioning, Staining, and Quantification 3 2 3 Histological tissue sectioning. Histological slices of the patient biopsy samples were 3 2 4 fixed with 4% paraformaldehyde upon retrieval from Newton-Wellesley Hospital. After 3 2 5 fixation, samples were dehydrated and cleared with xylenes before being paraffin embedded. 3 2 6 Slides were produced by slicing the tissue block into sections using a microtome. To prepare 3 2 7 for histological staining, the slides were deparaffinized using xylene and a reducing series of 3 2 8 graded alcohol. 3 2 9 Immunological staining. Slides were blocked for 4 hours at room temperature using a 3 3 0 solution of 3% bovine serum albumin (BSA; CAS 9048-46-8, MilliporeSigma, Burlington, MA) 3 3 1 and stained using the capture antibody from the Tyro3/Dtk DuoSet Elisa kit (Cat # DY859, 3 3 2 R&D Systems, Minneapolis, MN) at a concentration of 25 µg/mL and CD45 antibody (Cat #  3  3  3 MA5-17687, Thermofisher, Waltham, MA) using the recommended 1:500 dilution for 48 3 3 4 hours at 4˚C. Secondary antibodies anti-mouse Alexa Fluor 488 and anti-rat Alexa Fluor 546 3 3 5 (Cat #s A-21202 and A-11081, Thermofisher, Waltham, MA) were used at a concentration of 3 3 6 10 µg/mL along with 4',6-Diamidino-2-Phenylindole, Dihydrochloride (DAPI; Cat # D1306, 3 3 7 Thermofisher, Waltham, MA) to stain for nuclei. Slides were washed with 1X phosphate 3 3 8 buffered saline (PBS; Cat # 10010023, Thermofisher, Waltham, MA) and stained for 3 hours 3 3 9 at room temperature. Slides were then washed again with 1X PBS and imaged using a 3 4 0 Keyence BZ-X700 series microscope using a 10x objective. 3 4 1 Image quantification. Images were quantified using CellProfiler (McQuin et al., 2018) to 3 4 2 identify the number of CD45+ cells and the total area that stained positive for Tyro3. 3 4 3 Brightfield images were used to determine the area of each tissue sample and this 3 4 4 . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.
In order to develop a more complete proteomic and transcriptomic picture of the healthy and 3 5 1 diseased uterus, we established a workflow to profile human endometrial samples. Endometrial 3 5 2 biopsy samples were collected from patients undergoing hysterectomy. Prior to uterine tissue 3 5 3 removal, pipelle biopsies are taken, primarily from the fundus region of the uterus. These 3 5 4 biopsies were then digested as described above to prepare samples for scRNA-seq and 3 5 5 proteomic ( Figure 1A) analyses. Patients without a history or diagnosis of endometriosis were 3 5 6 used as control subjects while diseased patients presented with symptoms of endometriosis 3 5 7 with an official diagnosis made during or after surgery. In total, we profiled 11 total patients 3 5 8 using scRNA-seq (n=6) and proteomics (n=8). For our proteomic data, we first performed 3 5 9 principal component analysis across 668 proteins. Displaying this data along the first and 3 6 0 second dimensions revealed little global separation on the basis of cycle phase and of disease 3 6 1 state of the patient samples ( Figure 1B). There is minor separation of patient disease state 3 6 2 along principal component (PC) 1, driven by differences in extracellular matrix composition 3 6 3 since COL3A1, COL6A1, and RACK1 were in the top 10 of PC1 contributors.

6 4
We generated high quality scRNA-seq data on a total of six patients (3 control, 3 3 6 5 endometriosis) and 6,659 single cells, and identified major epithelial, stromal, and immune 3 6 6 populations across all six donors (Supplemental Figure 1). Dimensionality reduction of the 3 6 7 scRNA-seq data using PCA followed by uniform manifold approximation and projection (UMAP) 3 6 8 on dimensions 1 and 2 highlights populations of cells that cluster strongly by both cycle phase 3 6 9 (center, Figure 1C) and disease state (left, Figure 1C). Taken together, our data suggest that 3 7 0 both cycle and disease state drive global variations in both data types, and suggests that 3 7 1 analytical approaches that leverage both these variables are essential for deeper analyses. 3 7 2 . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. Endometrial biopsies are collected via pipelle from women undergoing uterine surgery. The 3 7 5 sample is split, with part used to isolate single cells for scRNA-seq analysis using Seq-Well, and 3 7 6 the remainder digested and analyzed for proteomics using LC-MS/MS. B. A PCA plot of the 3 7 7 proteomics samples show poor clustering based on disease state with cycle phase as a 3 7 8 confounding factor (n = 8 donors, Supplementary Table 1). C. ScRNA-seq data (n = 6 donors, 3 7 9 6,659 cells) shown as a UMAP projection colored by cell identities (left), cycle stage (middle) or 3 8 0 disease status (right). 3 8 1 3 8 2 Identification of cycle phase and disease associated proteins. Initial unsupervised 3 8 3 clustering methods for both the proteomic and scRNA-seq data highlighted the difficulties of 3 8 4 having both menstrual cycle phase and disease state as variables of interest. We therefore 3 8 5 used linear mixed effect modeling (LMM) in order to tease apart the effects of these two 3 8 6 biological variables in our proteomics results. Using a LMM, we attempt to capture the majority 3 8 7 of variance in the system using either fixed or random variables and then by creating a null 3 8 8 model to capture for comparison we isolate the variance attributed to the variable of interest. . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted February 1, 2022. ; https://doi.org/10.1101/2022.01.29.22269829 doi: medRxiv preprint right). This suggests that the use of LMM was successful in deconvolving the impacts of cycle 4 0 1 phase and disease state, revealing a set of proteins associated with each to investigate further. 4 0 2 While numerous protein differences among samples occur as a result of the natural menstrual 4 0 3 cycle, an independent set of proteins are associated with the disease state, and a subset of 4 0 4 proteins is significantly affected by both cycle phase and disease state. 4 0 5 In order to determine which signaling pathways might be influenced by either cycle stage 4 0 6 or disease status, we performed pathway analysis on the LMM protein lists using the software 4 0 7 gProfiler (Raudvere et al., 2019). We identified hundreds of pathways that were altered across 4 0 8 both protein lists (Supplemental Table 4). To better understand the broader implications of these 4 0 9 results, we classified these pathways based on each GO terms description: immune response, 4 1 0 extracellular vesicles, cellular transport, cellular binding, and cellular metabolism. We show the 4 1 1 top ten scoring pathways in each broad category ( Figure 2C). These data suggest that 4 1 2 pathways significant to cycle phase skew significantly towards cellular transport pathways while 4 1 3 disease-related pathways are notably higher in the immune response, extracellular vesicles, 4 1 4 cellular binding, and cellular metabolism categories ( Figure 2C). Taken together, our analyses 4 1 5 suggest that we are able to effectively tease apart differential proteins and associated biological 4 1 6 functions that differ as a function of cycle stage or disease status using this LMM approach. 4 1 7 . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. cut, blue points are above the -log 10 of the p-value cut off and red points are above both the -4 2 2 log 10 of the p-value and log 2 fold change cut offs. B. LMM is used to identify proteins that are 4 2 3 variable either in disease state or with cycle phase. In the LMM for each protein, the null model 4 2 4 excludes the variable of interest (disease or cycle phase) and is compared against the full 4 2 5 model to identify proteins with a significant impact on the alternative model variable. PCA plots 4 2 6 using the proteins identified in the two different models show better separation of both groups. 4 2 7 C. A heatmap of functional enrichment results on protein lists identified as significant in disease 4 2 8 or cycle phase. 4 2 9 5 le er ge re el ull ts s. se . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review)

6
Multiple cell types are involved in expression of proteomics-identified genes. We 4 3 0 next sought to leverage our scRNA-seq data set to better understand our proteomics results by 4 3 1 determining which cell types were expressing cycle or disease-defined transcripts. We found 4 3 2 that cycle-associated genes, including TPM2, TPM1, COL4A2, and TNS1, were predominantly 4 3 3 expressed in smooth muscle cells ( Figure 3A). We also found that immune related genes, such 4 3 4 as HLA-A, were expressed in macrophages and NK cells. For those proteins that were identified 4 3 5 as uniquely associated with disease, we found that many different cell types expressed the 4 3 6 corresponding genes ( Figure 3B). These included expression of IDH2 across several different 4 3 7 clusters of fibroblasts and epithelial cells, high expression of SOD2 in epithelial cells, and high 4 3 8 expression of LCP1 in macrophages and NK cells. We additionally use a MAST (Finak et al., 4 3 9 2015) modeling approach to identify genes associated with both cycle and disease in epithelial 4 4 0 cells and stromal cells since these were the largest cell populations in our dataset ( Figure 3C, 4 4 1 Supplemental Table 7 and 8). We identified many genes that were significantly differentially 4 4 2 expressed across both disease and cycle, including many that were found in our proteomi data. 4 4 3 Comparing our disease-associated proteins with the corresponding lists in our scRNA-seq data, 4 4 4 we found there were 71 common elements in epithelial cells and 69 in stromal cells. Similarly, 4 4 5 for cycle-associated elements, we found 15 common for epithelial cells and 20 common for 4 4 6 stromal cells. Overall, this suggest that many different cell types are contributing to disease and 4 4 7 cycle associated alterations in the uterus, and by combining both scRNA-seq and proteomics, 4 4 8 we can identify high-confidence lists of genes of interest.

9
In order to look at these lists more wholistically, we plotted the MAST regression 4 5 0 coefficients for cycle stage and disease for either epithelial cells or stromal cells ( Figure 3C). 4 5 1 These plots allow us to identify genes that significantly contribute to disease or cycle and the 4 5 2 magnitude of contributions. For example, we see that for epithelial cells, UBC, SOD2, VCAN, 4 5 3 and PAEP were found to be significantly positively associated with cycle stage and disease, 4 5 4 while COL1A2 and PGR were both negatively associated. We additionally can see that there 4 5 5 are fewer genes associated positively with disease and cycle stage in stromal cells, but they 4 5 6 include several heat shock protein genes and LGALS1. The long non-coding RNA H19 was 4 5 7 found to be associated positively with cycle stage in stromal cells. Taken together, these plots 4 5 8 allow us to further identify genes associated with cycle stage and disease in the uterus and 4 5 9 serve as a roadmap for deconvolving the complex interplay between genes associated 4 6 0 significantly with both. 4 6 1 4 6 2 . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. Enrichment of proteomics-identified pathways across scRNA-seq data. Next, we 4 7 1 wanted to determine which cell types in our scRNA-seq data were enriched for top-scoring 4 7 2 pathways identified in our proteomics data. To do this, we created module scores for 4 7 3 proteomics-identified pathways and performed PCA on the resulting scores ( Figure 4A). We 4 7 4 7 eq as as ng p e ng for e . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted February 1, 2022. ; https://doi.org/10.1101/2022.01.29.22269829 doi: medRxiv preprint 1 8 found that we could separate cell types along PC1 or PC2 based on each cell type's high-4 7 5 scoring pathways. We found that macrophages separated from the rest of the cells along PC2, 4 7 6 where the top pathways driving this separation were immune in nature, including myeloid 4 7 7 leukocyte mediated immunity ( Figure 4B), neutrophil activation, and leukocyte mediated 4 7 8 immunity. We additionally found that there was a spectrum of epithelial and stromal cells, with 4 7 9 the epithelial 4 cluster and PAEP+ epithelial 2 cluster separating along PC1. This was 4 8 0 predominantly driven by selanoamino acid metabolism and regulation of protein transport 4 8 1 ( Figure 4B). This suggests that there might be different functions of these epithelial and stromal 4 8 2 subsets, and in light of other published reports (Tan et al., 2021;Wang et al., 2020), suggests 4 8 3 that proteomics coupled with scRNA-seq can help to refine these phenotypes. Taken together, 4 8 4 our data suggest that this approach can provide useful context for interpreting long lists of 4 8 5 pathway enrichment results from proteomics data and also identify subsets of cells, like stromal, 4 8 6 epithelial, and macrophages that drive these results. 4 8 7 . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

Mining interactions between stromal cells or epithelial cells and macrophages
and scRNA-seq analyses suggested that macrophages, stromal cells, and epithelial cells have 4 9 5 different roles in cycle and disease-related uterine biology, we wanted to determine how cell-cell 4 9 6 interactions might influence these processes. To do this, we used NicheNet, (Browaeys et al., 4 9 7 2020) a cell-to-cell ligand-to-receptor interaction inference approach, to identify receptors and 4 9 8 ligands involved in specific interactions identified strongly with disease state (see Methods). We 4 9 9 9 ts ng op es ic ve ell l., nd e . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted February 1, 2022. ; https://doi.org/10.1101/2022.01.29.22269829 doi: medRxiv preprint 0 performed this on a patient-by-patient basis in order to more accurately identify interactions 5 0 0 between macrophages and stromal or epithelial cells in a subset of patients that had enough 5 0 1 macrophages present in the dataset (see Methods). We identified several major receptor and 5 0 2 ligand interactions that may be altered in the context of disease, health, and as a function of 5 0 3 patient. For interactions between macrophages and epithelial cells, we found several receptor-5 0 4 ligand pairs with high interaction potentials that were conserved across all patients, including 5 0 5 IL1B-IL1R, and several that were distinct across patients, such as APP interactions with various 5 0 6 receptors (Supplemental Figure 2). Interestingly, for macrophage and stromal cell interactions 5 0 7 ( Figure 5), we found that Axl-GAS6 receptor-ligand pair scored highly across all patients. We 5 0 8 previously found in our MAST modeling results that Axl was differentially expressed in both 5 0 9 cycle and disease conditions in stromal cells as well (Supplemental Table 8). Axl is part of group 5 1 0 of tyrosine kinase transmembrane receptors which also includes MERTK and Tyro3, collectively 5 1 1 known as "TAM" receptors. They are known for their roles in efferocytosis and immune 5 1 2 checkpoint blockades (Lemke, 2019; Lemke and Rothlin, 2008; Vago et al., 2021). Interestingly, 5 1 3 we also see Gas6-Tyro3 binding between macrophages (Gas6) and stromal cells (Tyro3) as 5 1 4 important cell to cell interactions, though this ligand receptor pair is less well studied compared 5 1 5 to Axl. Taken together, this data suggests that macrophage interactions with stromal cells, as 5 1 6 mediated by Axl and Tyro3 signaling may play a key role in uterine cycle and disease. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted February 1, 2022. ; https://doi.org/10.1101/2022.01.29.22269829 doi: medRxiv preprint 2 5 4 1 Gas6 and Tyro3 expression across all epithelial cells. P values were derived from MAST 5 4 4 modeling (Supplemental Table 7). B. Gas6 and Tyro3 expression across all stromal cells. P 5 4 5 2 no . T P . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted February 1, 2022. ; https://doi.org/10.1101/2022.01.29.22269829 doi: medRxiv preprint 4 2012) we sought to identify differentially expressed genes that were also found in our 5 7 8 proteomics LMM results. We additionally used pathway analysis to bridge this gap, and found 5 7 9 that we could nominate cell types that were potentially driving enrichment results. This further 5 8 0 allowed us to narrow down on cell types for further investigation, focusing on macrophages, 5 8 1 stromal cells, and epithelial cells.

8 2
Many of the disease relevant proteins, when projected to the single-cell transcriptomic data 5 8 3 were expressed in macrophages, epithelial cells, and stromal cells over varying phases of the 5 8 4 menstrual cycle ( Figure 3B). Previous studies have also identified these cells types to be 5 8 5 important in endometriosis as both the immune system and endogenous endometrium have 5 8 6 been implicated in the disease. It was recently discovered that macrophages from people with 5 8 7 endometriosis have increased expression of SIRP-α suggesting that these macrophages may Additionally, our MAST modeling approach suggests that there are many genes that contribute 6 0 1 to both cycle and disease-related attributes ( Figure 3D). Here, we have provided an approach 6 0 2 and combined dataset that has nominated many more genes that require further study and 6 0 3 validation. 6 0 4 With the focus on three identified cell types and the pathway enrichment score from the 6 0 5 proteomics data suggesting that cellular crosstalk is upregulated in the endometrium of patients 6 0 6 with endometriosis, we analyzed ligand receptor interactions of macrophages to epithelial cells 6 0 7 and stromal cells (Figure 4). For the four patients analyzed, the consistently strongest ligand 6 0 8 receptor pairs were found in the macrophage to stromal cells interaction with macrophages 6 0 9 expressing growth arrest -specific 6 (Gas6) and stromal cells expressing the receptor tyrosine 6 1 0 . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted February 1, 2022. ;https://doi.org/10.1101https://doi.org/10. /2022 kinases Axl and Tyro3. These two receptors are part of a group of transmembrane receptors 6 1 1 which, together with MERTK, make up the TAM (Tyro3, Axl, and MERTK) receptors that are 6 1 2 known for their roles as immune check point inhibitors and in efferocytosis.(Lemke and Rothlin, 6 1 3 2008) The TAM receptors are known to bind to both Gas6 and Protein S, (Nagata et al., 1996) though Gas6 has the highest affinity for Axl while Protein S has the highest affinity to Tyro3.(van 6 1 5 der Meer et al., 2014;Nakamura et al., 1998) Further inspection of the transcriptomic data did 6 1 6 not show a notable trend with Gas6 or Axl expression, but expression levels for Tyro3 were 6 1 7 decreased in patients with endometriosis ( Figure 6) and Protein S followed a similar 6 1 8 transcriptomic profile to Tyro3 (Supplemental Figure 3). An increase in TAM receptor expression 6 1 9 with inflammation as an anti-inflammatory response(Lemke, 2013) and increased TAM 6 2 0 expression would presumably help with phagocytosis in the endometrium, however, both Tyro3 expression in patients with endometriosis compared to controls could be used as a 6 2 8 biomarker -with lack of Tyro3 indicating an increased risk of endometriosis. Furthermore, this 6 2 9 key inference highlights the TAM receptors as potentials treatment avenues for patients 6 3 0 suffering from or prone to endometriosis as an increase in TAM receptor signaling in diseased 6 3 1 patients may promote a more effective immune response to menstrual effluent. 6 3 2 Follow up studies to explore TAM receptors and their potential role in endometriosis could 6 3 3 include additional patient data sets to gain more granular insights in cell expression across the 6 3 4 phases of the menstrual cycle, as our limited single cell transcriptomic data suggests that Tyro3 6 3 5 expression may be cycle phase dependent. The striking difference in observed Tyro3 6 3 6 immunohistochemistry outcomes between control and endometriosis patients -where the 6 3 7 greatest observed staining values in endometriosis patients are still below the lowest values in 6 3 8 controls -could arise from a combination of transcription and post-translational events such as 6 3 9 proteolytically-mediated shedding. Tyro3 undergoes enhanced shedding, producing prodigious 6 4 0 amounts of soluble receptor, in rheumatoid arthritis (Vullings et al., 2020). These mechanisms 6 4 1 could be further parsed in 3D co-cultures, building on advances in 3D tissue engineering models 6 4 2 that incorporate the relevant cell types in synthetic ECMs (Below et al., 2022; Gnecco et al., 6 4 3 2021). 6 4 4 . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted February 1, 2022. ; https://doi.org/10. 1101/2022 In summary, we used linear mixed effect modeling to decipher the effects of menstrual cycle 6 4 5 phase and disease state and used these identified proteins to discover pathways specific to 6 4 6 endometriosis and cycle phase. This pathway analysis then guided single-cell transcriptomic 6 4 7 analysis to parse dysregulated cell-to-cell interactions between macrophages and stromal cells. 6 4 8 The TAM receptor Tyro3 was shown to be downregulated in patients with endometriosis, at both 6 4 9 the transcriptomic and protein level, paving the way for further analysis on the use of Tyro3 as 6 5 0 both a diagnostic and treatment pathway. Therapeutics Limitied, Empress Therapeutics, FL82, and Dahlia Biosciences, unrelated to this 6 6 3 work. 6 6 4 6 6 5 6 6 6 6 6 7 . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted February 1, 2022. tissues. BioRxiv 2021.07.28.453839. 8 1 5 . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

Receptors (stromal cells)
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.