The local and systemic response to SARS-CoV-2 infection in children and adults

While a substantial proportion of adults infected with SARS-CoV-2 progress to develop severe disease, children rarely manifest respiratory complications. Therefore, understanding differences in the local and systemic response to SARS-CoV-2 infection between children and adults may provide important clues about the pathogenesis of SARS-CoV-2 infection. To address this, we first generated a healthy reference multi-omics single cell data set from children (n=30) in whom we have profiled triple matched samples: nasal and tracheal brushings and PBMCs, where we track the developmental changes for 42 airway and 31 blood cell populations from infancy, through childhood to adolescence. This has revealed the presence of naive B and T lymphocytes in neonates and infants with a unique gene expression signature bearing hallmarks of innate immunity. We then contrast the healthy reference with equivalent data from severe paediatric and adult COVID-19 patients (total n=27), from the same three types of samples: upper and lower airways and blood. We found striking differences: children with COVID-19 as opposed to adults had a higher proportion of innate lymphoid and non-clonally expanded naive T cells in peripheral blood, and a limited interferon-response signature. In the airway epithelium, we found the highest viral load in goblet and ciliated cells and describe a novel inflammatory epithelial cell population. These cells represent a transitional regenerative state between secretory and ciliated cells; they were found in healthy children and were enriched in pediatric and adult COVID-19 patients. Epithelial cells display an antiviral and neutrophil-recruiting gene signature that is weaker in severe paediatric versus adult COVID-19. Our matched blood and airway samples allowed us to study the spatial dynamics of infection. Lastly, we provide a user-friendly interface for this data as a highly granular reference for the study of immune responses in airways and blood in children.

SARS-CoV-2 employs a host cell surface protein, angiotensin-converting enzyme (ACE) 2, as a receptor for cellular entry 11 . A number of recent studies have suggested that the expression of ACE2 is both tissue and age-dependent 12,13 , and differences in ACE2 expression between children and adults were proposed to contribute to less severe disease in children. At single cell level, viral entry genes were shown to be most highly expressed in the nasal epithelium in healthy adults 14 . In children, bulk RNA sequencing analysis revealed lower ACE2 gene expression both in upper 15 and lower airways 13,16 compared to adults, although more recent studies found no correlation with age or infection 17 .
Although these specific cell types are beginning to be resolved in adults [19][20][21]20,22,24,25 , they have not been as comprehensively characterised in children. So far, only bulk RNA sequencing and cytokine studies have compared immune responses between children and adults, suggesting a more robust innate immune response, such as increased levels of interferon gamma (IFN-γ) and interleukin-17 (IL-17A) in plasma 26 , and a reduced antibody response and neutralising activity against SARS-CoV-2 in children 27 .
The immune landscape in early life has been shown to be distinct from that of adults 28,29,30 , in keeping with the immune system maturing from an immune tolerant state in utero to a more pro-inflammatory state with the increased exposure of new antigens and pathogens over the years 31 . Changes in blood cell counts and immune cell composition throughout childhood are well described 32 , and recently CyTOF (cytometry by time-of-flight) has facilitated a higher multiplexed panel-dependent description of cell types, with studies focussing on either early childhood 29 or reporting comparison to disease 33 . Unbiased transcriptomic analysis of healthy paediatric blood immune cell types over the entire duration of childhood is lacking. Similarly, the upper airway mucosa only reaches maturity after puberty, in keeping with the high incidence of upper respiratory tract infections in children 34 , yet unbiased analysis of epithelial maturation is still missing. is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The copyright holder for this this version posted March 15, 2021. ; https://doi.org/10.1101/2021.03.09.21253012 doi: medRxiv preprint SARS-CoV-2 by RT-qPCR were enrolled in the study, and symptom onset relative to RT-qPCR testing and sampling is summarised in Extended Data Figure 1a. For some patients blood was additionally sampled on the day of hospital discharge (labeled convalescent) and our cohort included patients who were recalled 3 months after recovering from  requiring respiratory support at the time of infection (non-invasive or invasive ventilation), labeled as post-COVID (Extended Data Figure 1a-b). Patient characteristics and metadata are in Extended Data Table 1a. This includes ethnicity data, but we also used our data set to infer ethnicities and found that 48% of our cohort were of European, 16% of Asian, 10% of African and 26% of mixed descent (Extended Data Figure 1c). Nasal, tracheal and bronchial brushings were freshly processed by cold digestion and analysed by Chromium  (Figure 1e). The full dataset is available at https://www.covid19cellatlas.org/ for easy browsing and interactive data analysis.

Detection of SARS-CoV-2 reads
When aligning transcriptomes, we included the SARS-CoV-2 reference genome, and detected viral reads (> 10 reads) in 4 individuals (3 nasal and 1 bronchial). In the upper airways, we detected the highest levels of total viral expression in the two most abundant cell types: goblet 2 and ciliated 2 cell populations (Figure 1f). When examining the percentage of cells with viral reads, fractions were similar across all nasal epithelial cell types (Figure 1g), which agrees with the expression of ACE2 in our data set (Extended data Figure 3a). Viral reads were also detected in lymphocytes and myeloid cells (mostly macrophages), either reflecting active infection in macrophages 35 or merely the uptake of virions or infected dead cells.
These findings are consistent with previous studies that reported highest levels of expression of ACE2 and TMPRSS2 in goblet and ciliated cells in healthy individuals 14 . Additional genes . CC-BY 4.0 International license It is made available under a perpetuity.
is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The copyright holder for this this version posted March 15, 2021. ; https://doi.org/10.1101/2021.03.09.21253012 doi: medRxiv preprint (NRP1 36,37 , BSG 38 , TFRC 39 ) have been implicated as secondary entry receptors (Extended Data Figure 3a) but in our data set of three individuals with high viral reads, only some correlate with actively infected cells (Extended Data Figure 3b). It has been reported that in adults, expression of ACE2 is induced by interferon 40 and in response to infection 25 . In contrast, we observe no significant increase of ACE2 expression across cell types in children with COVID-19 (Figure 1h), consistent with recent bulk RNAseq comparisons 17 . However, we do see a difference in ACE2 expression in adults, with higher expression in acute versus post-COVID patients (Extended Data Figure 3c). In line with previous reports 41,42 , no viral reads were detected in peripheral blood within our COVID-19 cohort.

Nasal, tracheal and bronchial epithelial cell analysis
We first focused our analysis on the epithelial cell compartment in the airways. Using unsupervised clustering and re-clustering (see methods) we detect 15 distinct epithelial populations (Figure 2a) based on known marker gene expression [43][44][45] and indicate their distribution within the UMAP with respect to COVID-19 status and age (Figure 2b). In our cell type annotation, we used a conservative approach, using a high cut-off of 1000 UMIs per cell and normalising the data set by donor identity which ensures that cell type differences are consistent between individuals. We identified a small number of melanocytes, in keeping with previously reported primary nasal mucosal melanocytes 46,47 . The major surface epithelial cell populations fall into two broad clusters, with the first covering the basal to secretory cell differentiation pathway, as visualised using Velocyto (Figure 2c), and these include basal 1, basal 2, cycling basal, secretory, goblet 1 and 2, and squamous cells with markers as specified in is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The copyright holder for this this version posted March 15, 2021. ; https://doi.org/10.1101/2021.03.09.21253012 doi: medRxiv preprint secretory cluster from basal towards squamous cells (Extended Data Figure 4a). Within the ciliated cluster, the differentiation trajectory points from ciliated 1 (PIFO, OMG) to ciliated 2 (CFAP54, DZIP1L) cells. Between the secretory and ciliated clusters, we observe deuterosomal cells as intermediates which are marked by CDC20 and FOXN4 43,44 (Figure 2c). In addition, we also detect a novel, second cell population that forms a second bridge between the secretory and ciliated clusters. We refer to these as inflammatory epithelial transit cells (IETCs) as they are high in inflammatory markers such as S100A8 and S100A9. They co-express ciliated cell markers such as PIFO and express secretory genes such as MUC2, but FOXJ1 expression is low. Distinguishing markers are two long non-coding RNAs, FP671120.4 and FP236383.2.
This population has not been described in previous analyses of adult airways 44,43 . We detect these cells mostly in COVID-19 patients, but also in healthy children (Figure 2b), suggesting a function in development and tissue regeneration following injury (see below). We also detect rare cell types such as ionocytes, brush cells and neuroendocrine cells, each expressing their canonical markers 43,44 .

Changes of epithelial cell type proportions in healthy and COVID-19 individuals
We next examined the epithelial cell type proportions in our cohort (Figure 2e,f). To test significance, we used a multivariate random effect model (see methods) that allowed us to take into account clinical metadata, such as age, sex, ethnicity, tissue, COVID-19 status and six technical factors (Figure 2f Conversely, there is a decrease in club cells in the nose. Statistical significance is indicated by local true sign rate (LTSR), which was greater than 0.999 for the latter observation.
Contrasting COVID-19 versus healthy patients, the most highly enriched cell types are inflammatory epithelial transit cells (IETCs), goblet 2 and ciliated 1 cells. In contrast, basal 1 cells are notably reduced. Ciliated and goblet cells contain the highest number of viral reads (Figure 1f), most likely leading to cell death 51,50 . We hypothesize that increased ciliated 1 and goblet 2 cell numbers reflect an ongoing compensatory replacement of these dying cells by their precursors to maintain homeostasis as seen in lower airways upon infection 52,53 .
Differentiation trajectory analysis suggests that the IETCs serve as an alternative differentiation path for repopulation of ciliated cells (Figure 2c), leading to a depletion of basal . CC-BY 4.0 International license It is made available under a perpetuity.
is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The copyright holder for this this version posted March 15, 2021. ; https://doi.org/10.1101/2021.03.09.21253012 doi: medRxiv preprint 1 cells, the main stem cell population within the airway epithelium. This is further supported by the finding that in post-COVID patients the basal 1 population recovers (Figure 2f).

Weak interferon response in children with severe COVID-19
In adults viral infection is strongly associated with an interferon response 54 and for SARS-CoV-2 a higher response to interferon has been reported for mild and moderate disease than for severe disease 50,55,56 . We therefore carried out differential gene expression analysis (using a multivariate random effect model; see methods) between healthy controls and COVID-19 patients, followed by a GO term enrichment analysis. For the strongly enriched inflammatory epithelial transit cells (IETCs) the DE genes in healthy versus diseased patients were enriched for interferon responsive genes (Figure 2h) and were associated with an interferon type I, antiviral and cytokine response. When testing interferon-alpha and gamma response signatures across epithelial cell types, expression was highest in secretory and goblet cells (Extended Data Figure 5a). Healthy children showed an interferon response that increased with age (Figure 2g). A comparison across conditions (Figure 2g) showed that the interferon response in adult COVID-19 patients was higher than in children with COVID-19. As recent bulk RNAseq data suggest that interferon responses correlate with viral load, but not age 17 , we interpret this to indicate that like in adults, severe COVID-19 disease in children is associated with reduced interferon responses.

A neutrophil recruiting gene signature associated with COVID-19
In addition to the above analysis, we also compared gene expression in inflammatory epithelial transit cells (IETCs) without correcting for age (Figure 2h) and saw upregulation of innate immune response genes with high expression of serum amyloid protein (SAA1), an acute phase protein highly expressed in inflammation and tissue injury 57 , and S100A8 and S100A9 which together encode calprotectin. Calprotectin is also highly expressed in myeloid cells and has been identified as a key mediator of mild versus severe COVID-19 58 . Using this gene set in GO term analysis highlights neutrophil enrichment as the top enrichment term (Figure 2h), a striking finding given that neutrophil infiltration is linked to COVID-19 disease severity 59 . As calprotectin expression has primarily been associated with myeloid cells, we wished to validate expression at the protein level in epithelial cells. is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint

Comparison to transitional populations in different locations
Previous studies of adult nasopharyngeal swabs had found a population of interferon responsive cells (IRC), as an intermediate between secretory and ciliated cells 25 . We therefore compared the "transitioning" populations in the Chua et al data set with those in our upper airway (nose and trachea) and bronchial cells (Extended Data Figure 6). All locations contained equivalent deuterosomal cells. The IRC makers ISG15, IFIT1 and CXCL10 were widely expressed in all three data sets, but only associated with transitioning cells in the Chua et al data set. The inflammatory epithelial transit cell (IETC) markers SAA1, S100A8 and S100A9 were found in transitioning cells in both upper airway data sets, but less obviously in transitioning populations in the bronchial data set (Extended Data Figure 6). Thus, transitioning cells are likely to be plastic in their phenotype 48 with exact gene expression signatures depending on the site of sampling, the interval between disease onset and sampling and other clinical covariates.

Immune cell states in healthy paediatric blood defined by multi-omic data
Next we examined the cellular landscape of the healthy developing immune system in both blood and the upper airways in children. We generated CITE-seq data for 51 blood samples using a 192 antibody panel recognising 188 proteins. By integrating gene and protein expression data using an integrated Weighted Nearest Neighbor analysis 60 and Leiden clustering 61 , we were able to annotate our 317,854 single cells into 31 cell types (Figure 3a, quality control metrics in Extended Data Figure 7). While the gene expression data made a greater contribution to distinguishing cell types, the protein expression data strongly contributed to defining T cell subtypes and monocytes, as visualised by pie charts in Figure   3a. Notable genes for which protein expression was more informative than RNA expression levels included CD4 in T cell subsets, NCR1 and CD56 (NCAM1) in NK cells, and CCR6 and IGHD in naive and memory B cells (Figure 3c, Extended Data Figure 8a). CITE-seq also allowed us to distinguish between cell type-specific protein isoforms, including CD45RA and CD45RO, which are key markers of T cell maturity. While RNA marker genes were consistent with previous reports 24 , the relative contribution of protein data varied significantly between cell types (Figure 3a, Extended Data Figure 8a). For example, for T cell subsets transcriptional data carried only slightly higher weight than the protein data, whilst plasmablasts, cycling T and NK, and haematopoietic stem and progenitor cells (HSPCs) were almost entirely assigned on the basis of RNA expression. To validate our manual cell type annotation and assess its accuracy, we compared it to an automated annotation by Azimuth 62 . is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint This revealed that our annotation was highly similar, but not identical to automated annotation (Extended Data Figure 8b).
Our data set allowed us to track changes in gene expression within cell types with age. In particular we noticed that both naive CD4 + and CD8 + T cells displayed a developmental pattern of gene expression (Figure 3b,d). A number of genes expressed in the neonatal age group were associated with cell proliferation or cancer (e.g. CDCA7, BCL11A), possibly reflecting a higher proliferative capacity in these early cells. The presence of GZMA (granzyme A) in both CD4 + and CD8 + naive T cells and NKG7 (Natural killer cell granule protein 7) in only the CD8 + cells may suggest that these early naive T cells have specialised immune functions, with potentially higher cytotoxic activity. Other genes regulated over developmental time included those with potential immune regulatory functions such as HPGD and TOX, with possible roles in Treg and T cell development, respectively 63,64 .
We also observed age-related gene expression gradients in naive B cells and to a lesser extent in monocytes (Figure 3b) and carried out a similar trajectory analysis (Extended Data Figure   8c). There were clear changes observed for B cells, but less obvious changes for monocytes.
Genes with changing expression over developmental time in naive B cells have functions connected to differentiation, regulation, glycosylation and cell adhesion (including ILR4, KLF9, ADAM28, HMGA1, LARGE1 and ITM2C). In addition, markers that point to a more innate-like type of immune response, such as CD9 and CD1c, which are associated with the marginal zone (MZ) B cell subtype 65 , are expressed at a higher level in neonates. Such changes in the characteristics of circulating B cells at a very young age might also be linked to the formation of structures like the marginal zone, which is not fully developed until 2 years of age 66 . Together our data suggests that naive T and B cells that have already entered circulation continue to develop and mature during the first year after birth.

Immune cell proportions in blood and nose throughout childhood and in adults
We further investigated changes in immune cell type proportion over childhood and in disease in blood and nose (Figure 3e-h), using the same statistical model as described above. In this analysis all samples were considered simultaneously while accounting for technical variation, and effects for age and COVID-19 status were plotted for the paediatric and the adult cohort is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The copyright holder for this this version posted March 15, 2021. ; https://doi.org/10.1101/2021.03.09.21253012 doi: medRxiv preprint healthy neonates, we observed high proportions of HSPCs, however this was driven by a single individual (Figure 3g). Fractions of 'Cycling T and NK', ILCs and NK CD56hi were also high in neonates but reduced in infants. Levels of RBCs and platelets are shown but may be confounded by technical variation. Relative levels of naive CD8 + and CD4 + T cells and Tregs were high in neonates and infants and then decreased with age, with very low levels in the elderly (adults >65 years), consistent with ongoing immune stimulation over the life span.
Proportions of conventional dendritic cells (cDCs) type 1 and 2, Tregs, and naive B cells increased from neonates to infants, but then decreased with age. In contrast cell types associated with an adaptive immune response such as CD8 + cytotoxic T lymphocytes (CTL),

Changes in cell type proportions in blood in COVID-19
We next compared the relative cell type proportions in relation to COVID-19 (Figure 3b).
Strikingly and in contrast to adult COVID-19, in paediatric patients with severe COVID-19 the proportion of naive B or T cells, Tregs and ILCs is increased (Figure 3e,g), on a background of little change in the average lymphocyte counts (Extended Data Table1), consistent with a previous report that COVID-associated lymphopaenia is less pronounced in children 67,68,69 .
High numbers of naive cells may be due to increased release of immature B and T lymphocytes from the bone marrow and thymus respectively, or due to migration of more mature cells to is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The copyright holder for this this version posted March 15, 2021. ; https://doi.org/10.1101/2021.03.09.21253012 doi: medRxiv preprint monocytes, and was found to be lower in the paediatric than the adult COVID-19 cohort (Extended Data Figure 5b,c). Taken together, we see large differences in SARS-CoV-2 immune responses between children and adults. Some changes, such as in Tregs, are disease specific, whereas others (naive T lymphocytes, monocytes) are associated with both disease and age or age only (e.g. increases in cytotoxic T cells). The response to SARS-CoV-2 in adults versus children thus strongly reflects the underlying changes in the immune landscape that we observe over the life span.

The immune landscape in the upper airways with age and COVID-19
When investigating the immune cell proportions in the nose (Figure 3f) we again observed significant changes over the life span. Because the total number of immune cells retrieved in airway samples was far lower than for blood and no protein expression data was available, we When examining cell type proportions in the nose with age, the pattern of cell type distribution differed greatly between the airways and the blood. In the upper airways, healthy neonates had high proportions of activated macrophages, CD14 + monocytes, cDC2 and neutrophils that rapidly decreased with age. Young children had higher levels of CD8 + T effector, g/d and MAIT cells, that remained relatively high throughout childhood. Within the airway samples, a comparison of adolescents to adults and elderly revealed relatively few striking changes. is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint When comparing COVID-19 to healthy cells, we first examined differences separately, within the paediatric cohort and within the adult cohort. Mostly, changes were similar and we therefore aggregated the two data sets. The combined analysis showed that in the airways COVID-19 samples were enriched for activated macrophages, neutrophils and CD8 + T effector cells (Figure 3h). These changes are consistent with those described in other studies 20,25 , but strikingly different to the changes we observe in the same individuals in blood (see below).

Correlation of immune cell types across nose and blood
In our data set, we have the unique opportunity to compare cell type proportions between the upper airways and blood from the same individuals and assess how the relationship between these compartments changes over childhood and with disease. We first integrated the immune cell data set from the different sites ( Figure 4a) and then examined how the cell type proportions change with disease or age (within the healthy paediatric data set) across the nose and blood (Figure 4b). As not all cell types were represented in both data sets labels on the x and y axis are not symmetrical. In healthy children, samples with highest T effector cells, Overall, our findings highlight the need to study the sites of infection such as the upper airways as well as blood in order to fully understand the immune response to infections.

Lymphocyte Clonality
From our 5' 10X single cell libraries we amplified T cell receptor (TCR) and B cell receptor (BCR) sequences to analyse clonality. We analysed the VDJ data using scirpy (see methods) and found productive single pair TCRa/b chain rearrangements for all T cell subtypes, but also . CC-BY 4.0 International license It is made available under a perpetuity.
is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint In line with previous reports, we found that in adults the most highly expanded clones are CD8 + CTLs and CD8 + effector memory T (Tem) cells 75 as well as CD4 + CTLs (Figure 4d).
Consistent with expectations we see shared TCRs between CD8 + CTLs and CD8 + Tem and CD8 + Tcm; and separately between CD4 + CTL and CD4 + Tcm. In comparison to adults, we note that in the paediatric data individual clones were much smaller, and that in our cohort of patients the majority of expanded clones were from healthy children. Expanded clones were mostly CD8 + CTL and CD8 + T effector cells with fewer CD8 + Tem or CD8 + Tcm cell clones We also studied expansion of B cell clones (see methods) (Extended Data Figure 10 e-h) and observed a very similar pattern. The number of expanded B cell clones increased with age and for the severe COVID-19 paediatric patients we saw fewer expanded clones than in the total cohort that included healthy children (Figure 4e).
In conclusion, we observed much greater clonal expansion of T and B cells in adult compared to paediatric patients, with a greater frequency of expanded memory and effector cells in adults.
These observations further characterise the differences in age-specific immune response to COVID-19, most likely determined by the differences in the underlying immune landscape. is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint

Discussion
The analysis of our multi-omic single cell dataset of matched airway epithelial and blood immune samples in healthy and COVID-19 affected children and adults provides several significant advances in our understanding of normal immune and airway development, as well as COVID-19 pathogenesis. We present a detailed study of the immune and epithelial landscapes over healthy childhood development at single cell resolution. We used this to show clear differences between the immune and airway epithelial response to SARS-CoV-2 between children and adults, while also comparing the immune response in blood and airways in the same individuals. Additionally, we identify a novel epithelial cell state which is seen in COVID-19 patients and healthy children. A key motivation for our analysis was the current COVID-19 pandemic, which frequently starts with infection of the upper airways, where we found the highest total viral load in the surface epithelial goblet and ciliated cells. Viral infections are cleared by cell death and removal of infected cells 77,78 , which led to a highly dynamic re-structuring of the airway epithelium with a marked increase in developmental intermediates, most notably the novel inflammatory epithelial transit cell (IETC) population, which was not found in previous analysis of healthy adult airways 25,44,43 , and a decrease in basal progenitors that is re-balanced post-infection. Upon infection, all adult epithelial cell types showed a strong interferon response, which was markedly reduced in our paediatric patients. Further analysis of mild paediatric COVID-19 cases will have to determine whether this characterises all responses in children or is limited to severe disease. The lack of interferon signals in the airways may explain why ACE2 expression is not induced in paediatric COVID-19. In our data set we also see a strong neutrophil recruiting signature, driven by expression of calprotectin in different epithelial cell types, thus highlighting the key role of epithelial cells in initiating an innate immune response. is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The major distinguishing feature of the immune response to SARS-CoV-2 in children in blood are striking increases in naive B and T cells as well as ILCs, with a loss of CD16 monocytes.
The lymphocyte response is strikingly different to COVID-19 immune responses in adults, described here and by others 23,24,56 , characterised by general lymphopaenia, but with increased CD8 + effector cells, as well as proliferating CD4 + and CD8 + cells in more severe disease.
Whilst activated B lineage cells are reduced in paediatric COVID-19 patients, adult patients show increased plasma cells and plasmablasts, reported to be mainly the IgG subtype, in severe disease 24 . These differences appear to have their origin primarily in the different structure of the immune repertoire in children and adults that is further discussed below. Another difference we found between children and adults is the upregulation of Tregs and downregulation of NK cells in children compared to adults.
A unique aspect of our study is the ability to relate changes in the blood to those in the airways.
In healthy children the proportions of immune cell types largely correlate between nose and blood. However, during active infection this relationship breaks down, highlighting the need to study both the active site of primary infection as well as blood to understand the dynamics of infection. For example, high levels of CTL CD8 + and MAIT cells in the nose are correlated with a reduction in the blood, likely reflecting the migration of these cells to the site of infection.
We also examined clonal expansion of B and T cells in our data set. We find that adults have far greater numbers of large, expanded T and B cell clones than children who show greater immune diversity. The lack of immune diversity due to ongoing expansion of activated clones over the lifetime, has been shown to be associated with more severe COVID-19 in adults 24 and this may add to the evidence base explaining why children usually present with much less severe disease. Also, patients with severe disease have been reported to have a greater number of expanded, low avidity common cold coronavirus specific clones that may contribute to ongoing inflammation 79 .
One limitation of our work is that only a small number of COVID-19 paediatric patients were included due to a very limited number of hospitalised children in the first wave of the pandemic and all of these had severe disease. In view of our interesting findings for neutrophil recruitment in the airways, we note that we isolated PBMCs from blood, thus excluding neutrophils from our analysis. Future studies should address this. is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint Overall, the significant differences in response to SARS-CoV-2 between children and adults reflect the changes of the immune landscape over developmental time, which in children are dominated by naive and innate responses, whilst those in adults are dominated by adaptive responses, illustrated by expanded lymphocytes clones. These studies may help to pinpoint the triggers of severe disease in adults with a view towards therapeutic intervention.
Our study demonstrates multiple substantially novel insights from paired multi-omics profiling of both airway epithelium and peripheral blood to fill the gap of our understanding of paediatric epithelial and immune development in health and the specific response to COVID-19. is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint

Data and materials availability
The data set from our study can be explored interactively through a web portal: https://covid19cellatlas.org. The data object, as a h5ad file, can also be downloaded from the . CC-BY 4.0 International license It is made available under a perpetuity.
is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint

Code Availability
All data analysis scripts are available on https://github.com/Teichlab/COVID-19paed.   is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint    is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint    is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint  is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint  is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint  is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint  Table 1: Summary of patient metadata. Table 2: TotalSeq-C antibody panel.

Extended Data
List of 192 TotalSeq-C antibodies (Biolegend, 99814) used for CITE-seq staining of PBMCs; including DNA_ID, Clone, description of target protein and barcode.

Extended Data Table 3: Viral genomes.
A complete list of the additional 21 viral genomes used (as FASTA sequences and corresponding GTF entries) during the mapping of single cell data, including their NCBI ID, viral name and source links. is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint  is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint All study participants or their surrogates provided informed consent. Individuals of at least 18 years of age with suspicion of pneumonia based on clinical criteria (including but not limited to fever, radiographic infiltrate and respiratory secretions) were screened for enrolment into the SCRIPT study. Inability to safely perform bronchoalveolar lavage or non-bronchoscopic bronchoalveolar lavage were considered exclusion criteria. In our center, patients with is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint

PBMC isolation from peripheral blood
Peripheral blood was collected in EDTA immediately after the nasal brushing procedure. The blood was diluted with 5 mL of PBS containing 2 mM EDTA (Invitrogen, 1555785-038). 10 is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The copyright holder for this this version posted March 15, 2021. ; https://doi.org/10.1101/2021.03.09.21253012 doi: medRxiv preprint to 20 mL of diluted blood was carefully layered onto 15 mL of Ficoll-Paque Plus (GE healthcare, 17144002). If the sample volume was less than 5 mL, blood was diluted with an equal volume of PBS-EDTA and layered onto 3 mL Ficoll. The sample was centrifuged at 800 g for 20 min at room temperature. The plasma layer was carefully removed and the peripheral blood mononuclear cell (PBMC) layer was collected using a sterile Pasteur pipette. The PBMC layer was washed with 3 volumes of PBS containing EDTA by centrifugation at 500 g for 10 min. The pellet was suspended in PBS-EDTA and centrifuged again at 300 g for 5 min. The PBMC pellet was collected followed by both cell number and viability being assessed using Trypan Blue. Cell freezing medium (90% FBS, 10% DMSO) was added dropwise to PBMCs slowly on ice and then the mixture was cryopreserved at -80 °C until further full sample processing.

CITE-Seq staining for single-cell proteogenomics
Frozen PBMC samples were thawed quickly at 37 °C in a water bath. 20-30 mL of warm RPMI1640 medium containing 10% FBS was added slowly to the cells before centrifuging at 300 g for 5 min. This was followed by a wash in 5 mL RPMI1640-FBS. The PBMC pellet was collected, and cell number and viability were determined using Trypan Blue. PBMCs from four different donors were then pooled together at equal numbers -1. is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint

Single Cell RNA-seq Computational Pipelines, Processing and Analysis
The single cell data was mapped to a GRCh38 ENSEMBL 93 derived reference, with an additional 21 viral genomes (featuring SARS-CoV2) included as additional FASTA sequences and corresponding GTF entries. A complete list of the included viruses, along with their respective NCBI IDs and source links, can be found in Extended Data Table 3.
Antibody-derived tag counts (ADT) and gene expression counts in CITE-seq data were jointly

Confocal microscopy method
Nasal epithelial biopsies were placed in Antigenfix (Microm Microtech) for 1-2 h at 4°C, then 30% sucrose in PBS for 12-24 h at 4°C, before cryopreservation in OCT (Cell Path). 30µm sections were permeabilised and blocked in PBS containing 0.3% Triton (Sigma), 1% normal . CC-BY 4.0 International license It is made available under a perpetuity.
is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The copyright holder for this this version posted March 15, 2021. ; https://doi.org/10.1101/2021.03.09.21253012 doi: medRxiv preprint goat serum, 1% normal donkey serum, and 1% BSA (R&D) for 1-2 h at room temperature (RT). Samples were stained with a 1 in 50 dilution of anti-human S100A9 conjugated to FITC (clone: MRP 1H9, Biolegend) and a 1 in 50 dilution of anti-human EpCam conjugated to APC (clone: 9C4, Biolegend) in blocking buffer overnight and washed for 3 x 10 minutes in PBS before mounting with Fluoromount-G containing DAPI (Invitrogen). Images were acquired using a Leica SP8 confocal microscope. Raw imaging data were processed using Imaris (Bitplane).

Quality control and normalization
To account for large quality variance across different samples, quality control was done for each sample separately. QC thresholds were automatically established by fitting a 10component Gaussian mixture model to log-transformed UMI count per cell and to percentage of mitochondrial gene expression and finding the lower or higher bounds where probability density falls under 0.05. For three samples (NP32-NB, NP39-NB and AP10-NB) for which the automatic procedure failed to find reasonable thresholds, the distribution of quality metrics were examined and thresholds were determined manually. We also excluded genes expressed in fewer than 3 cells. Expression values were then normalised to a sum of 1e4 per cell and log transformed with an added pseudo-counting of 1.

Developmental trajectory inference
RNA velocity analysis was performed to infer developmental trajectory for the major epithelial cell types (excluding melanocytes, ionocytes, brush cells and neuroendocrine cells). Spliced and unspliced UMI counts were generated via the STARsolo functionality of STAR 2.7.3a.
scvelo was used to fit a dynamical model as previously described 87 , based on top 2,000 highly variable genes with at least 20 UMI for both spliced and unspliced transcripts across all cells.

CITE-seq data processing
Demultiplexing and doublet removal of PBMC samples is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The copyright holder for this this version posted March 15, 2021. ; https://doi.org/10.1101/2021.03.09.21253012 doi: medRxiv preprint parameter. The donor ID of each souporcell genotype cluster was annotated by comparing each souporcell genotype to the set of known genotypes. Droplets that contained more than one genotype according to souporcell were flagged as 'ground-truth' doublets for heterotypic doublet identification. Ground-truth doublets were used by DoubletFinder 2.0.3 89 to empirically determine an optimal 'pK' value for doublet detection. DoubletFinder analysis was performed on each sample separately using 10 principal components, a 'pN' value of 0.25, and the 'nExp' parameter estimated from the fraction of ground-truth doublets and the number of pooled donors.

CITE-seq background and ambient RNA subtraction
Background and non-specific staining by the antibodies used in CITE-seq was estimated using SoupX version 1.4.8 86 , which models background signal on near-empty droplets. The 'soupQuantile' and 'tfidfMin' parameters were set to 0.25 and 0.2, respectively, and lowered by decrements of 0.05 until the contamination fraction was calculated using the 'autoEstCont' function. Gene expression data was also corrected with SoupX to remove cell free mRNA contamination using default SoupX parameters.

CITE-seq quality control and normalization
CITE-seq data was filtered by removing droplets with fewer than 200 genes expressed or with more than 10% of the counts originating from mitochondrial genes. Gene expression data was normalized with a log + 1 transformation (log1p), and 2000 hyper variable genes were selected with the vst algorithm in Seurat version 3.9.9.9024 90 . Antibody derived tag counts were normalized with the centered log-ratio (CLR) transformation.

Integrated embedding and clustering of CITE-seq data
Principal component analysis was run separately on gene expression and antibody derived tag count data, followed by batch correction using harmony 91 on the sequencing library identifier. is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The copyright holder for this this version posted March 15, 2021. ; https://doi.org/10.1101/2021.03.09.21253012 doi: medRxiv preprint all monocytes and DCs were reclustered after hypervariable gene selection within each subset.
Cells in WNN-based clusters with less than 100 members were reassigned based on the closest multimodal neighbour.

Comparison PBMCs and Azimuth
The manual blood cell-type annotation was validated using the Azimuth tool

Differential expression analysis in naïve T cells, B cells and monocytes
Two subsets of all naïve CD4 and CD8 T cells from healthy paediatric donors were reclustered after hypervariable gene selection within each subset to generate the UMAP in Figure 3d.

Differential expression analysis in airway data
In addition to the differential expression analysis correcting for various metadata that was performed on the whole airway and PBMC data sets as described below, results shown for subsets of the data were obtained with a simpler method. After subsetting cell types and/or age groups, a Wilcoxon rank-sum test (implemented in Scanpy 85 ) was performed to compare gene groups. The sets of differentially expressed genes were further analysed with the g:Profiler is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The copyright holder for this this version posted March 15, 2021. ; https://doi.org/10.1101/2021.03.09.21253012 doi: medRxiv preprint CD8 naïve or effector cells because further CD8 effector subtypes were less pronounced in airway samples. The relative immune cell type proportions in matched nasal and blood samples were correlated using the Spearman rank-order method, where we only tested the correlation of cell types that were present in at least half of the nasal samples in each comparison.

Inference of ethnicity from single cell RNA-seq data
The latest biallelic SNP genotype data (GRCh38) was obtained from 1000 Genomes Project is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The copyright holder for this this version posted March 15, 2021. ; https://doi.org/10.1101/2021.03.09.21253012 doi: medRxiv preprint function in the lme4 package. The statistical significance of the fold change estimate was measured by the local true sign rate (LTSR) which is an area under the curve of the posterior distribution of a log fold change over the positive domain when the point estimate is greater than 0 (over the negative domain for the point estimate less than 0).

Differential expression analysis using metadata
We performed differential gene expression analysis for both airway and PBMC data. We use implemented in R (version 2.0.1.5) to identify which pathways are enriched for differentially expressed genes for each contrast. We used genes whose ltsr is greater than 0.5 to perform the analysis (both upregulated and downregulated genes separately).

Single-cell VDJ-sequencing data analysis
TCR and BCR sequencing data was processed using the Cellranger software and downstream analysis was performed using the scirpy package (version 0.6.1) 95 . Briefly, we integrated TCR and BCR data with gene expression from T cell and B cell subsets, respectively. After categorizing cells based on the detection of productive antigen receptor chains, we selected cells with a single pair of productive chains for further analysis. Clonotypes were defined at the nucleotide level and considering both receptor chains. is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint   is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint  EPCAM S100A8/9 DAPI EPCAM S100A8/9 EPCAM S100A8/9 EPCAM S100A8/9 DAPI a Basal 1 (n=5808)      Figure 4