  • Perspective

Perspectives on ENCODE

An Author Correction to this article was published on 26 April 2022

This article has been updated

Abstract

The Encyclopedia of DNA Elements (ENCODE) Project launched in 2003 with the long-term goal of developing a comprehensive map of functional elements in the human genome. These included genes, biochemical regions associated with gene regulation (for example, transcription factor binding sites, open chromatin, and histone marks) and transcript isoforms. The marks serve as sites for candidate cis-regulatory elements (cCREs) that may play functional roles in regulating gene expression (ref. 1). The project has been extended to model organisms, particularly the mouse. In the third phase of ENCODE, nearly a million cCRE annotations have been generated for human and more than 300,000 for mouse, and these have provided a valuable resource for the scientific community.

Fig. 1: ENCODE assays by year.
Fig. 2: Progress in annotating the human genome.
Fig. 3: Publications using ENCODE data.
Fig. 4: An overview of the mouse ENCODE Project in the current phase.

References

  1. Kellis, M. et al. Defining functional DNA elements in the human genome. Proc. Natl Acad. Sci. USA 111, 6131–6138 (2014).

  2. ENCODE Project Consortium. The ENCODE (ENCyclopedia Of DNA Elements) Project. Science 306, 636–640 (2004).

  3. Lindblad-Toh, K. et al. A high-resolution map of human evolutionary constraint using 29 mammals. Nature 478, 476–482 (2011).

  4. Waterston, R. H. et al. Initial sequencing and comparative analysis of the mouse genome. Nature 420, 520–562 (2002).

  5. ENCODE Project Consortium. A user’s guide to the encyclopedia of DNA elements (ENCODE). PLoS Biol. 9, e1001046 (2011).

  6. Birney, E. et al. Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project. Nature 447, 799–816 (2007). The results of the pilot phase of ENCODE included extensive functional assays across a selected one per cent of the human genome with experiments conducted on a variety of cell lines and largely with array-based technology.

  7. ENCODE Project Consortium. An integrated encyclopedia of DNA elements in the human genome. Nature 489, 57–74 (2012). The results of the second phase of ENCODE were based mostly on a large number of genome-wide assays that leveraged high-throughput sequencing technologies and were done across two ‘tier one’ cell lines with large-scale assays across several hundred cell and tissue types.

  8. The ENCODE Project Consortium et al. Expanded encyclopaedias of DNA elements in the human and mouse genomes. Nature https://doi.org/10.1038/s41586-020-2493-4 (2020).

  9. Partridge, E. C. et al. Occupancy maps of 208 chromatin-associated proteins in one human cell type. Nature https://doi.org/10.1038/s41586-020-2023-4 (2020).

  10. Meuleman, W. et al. Index and biological spectrum of human DNase I hypersensitive sites. Nature https://doi.org/10.1038/s41586-020-XXXX-X (2020).

  11. Vierstra, J. et al. Global reference mapping of human transcription factor footprints. Nature https://doi.org/10.1038/s41586-020-2528-x (2020). 

  12. Breschi, A. et al. A limited set of transcriptional programs define major cell types. Preprint at https://doi.org/10.1101/857169 (2020).

  13. Grubert, F. et al. Landscape of cohesin-mediated chromatin loops in the human genome. Nature https://doi.org/10.1038/s41586-020-2151-x (2020).

  14. Van Nostrand, E. L. et al. A large-scale binding and functional map of human RNA binding proteins. Nature https://doi.org/10.1038/s41586-020-2077-3 (2020).

  15. Wang, Z., Gerstein, M. & Snyder, M. RNA-seq: a revolutionary tool for transcriptomics. Nat. Rev. Genet. 10, 57–63 (2009).

  16. Mortazavi, A., Williams, B. A., McCue, K., Schaeffer, L. & Wold, B. Mapping and quantifying mammalian transcriptomes by RNA-seq. Nat. Methods 5, 621–628 (2008).

  17. Johnson, D. S., Mortazavi, A., Myers, R. M. & Wold, B. Genome-wide mapping of in vivo protein–DNA interactions. Science 316, 1497–1502 (2007).

  18. Robertson, G. et al. Genome-wide profiles of STAT1 DNA association using chromatin immunoprecipitation and massively parallel sequencing. Nat. Methods 4, 651–657 (2007).

  19. Iyer, V. R. et al. Genomic binding sites of the yeast cell-cycle transcription factors SBF and MBF. Nature 409, 533–538 (2001).

  20. Ren, B. et al. Genome-wide location and function of DNA binding proteins. Science 290, 2306–2309 (2000).

  21. Landt, S. G. et al. ChIP–seq guidelines and practices of the ENCODE and modENCODE consortia. Genome Res. 22, 1813–1831 (2012). A consortium-wide effort to standardize performance, quality control and outputs of ChIP–seq experiments, including validation of antibodies, to facilitate experimental reproducibility and data utility.

  22. Sundararaman, B. et al. Resources for the comprehensive discovery of functional RNA elements. Mol. Cell 61, 903–913 (2016).

  23. Maurano, M. T. et al. Systematic localization of common disease-associated variation in regulatory DNA. Science 337, 1190–1195 (2012).

  24. Schaub, M. A., Boyle, A. P., Kundaje, A., Batzoglou, S. & Snyder, M. Linking disease associations with regulatory information in the human genome. Genome Res. 22, 1748–1759 (2012).

  25. Yue, F. et al. A comparative encyclopedia of DNA elements in the mouse genome. Nature 515, 355–364 (2014). Results of a large-scale effort of the mouse ENCODE consortium, presenting regulatory and transcript maps of the mouse.

  26. Gerstein, M. B. et al. Integrative analysis of the Caenorhabditis elegans genome by the modENCODE project. Science 330, 1775–1787 (2010).

  27. The modENCODE Consortium et al. Identification of functional elements and regulatory circuits by Drosophila modENCODE. Science 330, 1787–1797 (2010).

  28. Kudron, M. M. et al. The ModERN Resource: genome-wide binding profiles for hundreds of Drosophila and Caenorhabditis elegans transcription factors. Genetics 208, 937–949 (2018).

  29. Gorkin, D. U. et al. An atlas of dynamic chromatin landscapes in mouse fetal development. Nature https://doi.org/10.1038/s41586-020-2093-3 (2020).

  30. He, P. et al. The changing mouse embryo transcriptome at whole tissue and single-cell resolution. Nature https://doi.org/10.1038/s41586-020-XXXX-X (2020).

  31. He, Y. et al. Spatiotemporal DNA methylome dynamics of the developing mouse fetus. Nature https://doi.org/10.1038/s41586-020-2119-x (2020).

  32. Cheng, Y. et al. Principles of regulatory information conservation between mouse and human. Nature 515, 371–375 (2014).

  33. Stefflova, K. et al. Cooperativity and rapid evolution of cobound transcription factors in closely related mammals. Cell 154, 530–540 (2013).

  34. Keilwagen, J., Posch, S. & Grau, J. Accurate prediction of cell type-specific transcription factor binding. Genome Biol. 20, 9 (2019).

  35. Tang, F., Lao, K. & Surani, M. A. Development and applications of single-cell transcriptome analysis. Nat. Methods 8 (Suppl), S6–S11 (2011).

  36. Buenrostro, J. D. et al. Single-cell chromatin accessibility reveals principles of regulatory variation. Nature 523, 486–490 (2015).

  37. HuBMAP Consortium. The human body at cellular resolution: the NIH Human Biomolecular Atlas Program. Nature 574, 187–192 (2019).

  38. Regev, A. et al. The human cell atlas. eLife 6, e27041 (2017).

  39. Rhoads, A. & Au, K. F. PacBio sequencing and its applications. Genomics Proteomics Bioinformatics 13, 278–289 (2015).

  40. Arnold, C. D. et al. Genome-wide quantitative enhancer activity maps identified by STARR-seq. Science 339, 1074–1077 (2013).

  41. Klein, J. C., Chen, W., Gasperini, M. & Shendure, J. Identifying novel enhancer elements with CRISPR-based screens. ACS Chem. Biol. 13, 326–332 (2018).

  42. Harrow, J. et al. GENCODE: the reference human genome annotation for The ENCODE Project. Genome Res. 22, 1760–1774 (2012).

  43. Paudyal, A. et al. The novel mouse mutant, chuzhoi, has disruption of Ptk7 protein and exhibits defects in neural tube, heart and lung development and abnormal planar cell polarity in the ear. BMC Dev. Biol. 10, 87 (2010).

Acknowledgements

We thank S. Moore, E. Cahill, M. Kellis and J. Li for their assistance, and B. Wold for helpful comments. This work was supported by grants from the NIH: U01HG007019, U01HG007033, U01HG007036, U01HG007037, U41HG006992, U41HG006993, U41HG006994, U41HG006995, U41HG006996, U41HG006997, U41HG006998, U41HG006999, U41HG007000, U41HG007001, U41HG007002, U41HG007003, U41HG007234, U54HG006991, U54HG006997, U54HG006998, U54HG007004, U54HG007005, U54HG007010 and UM1HG009442.

Author information

Contributions

The role of the NHGRI Project Management Group in the preparation of this paper was limited to coordination and scientific management of the ENCODE consortium. All other authors contributed to the concepts, writing and/or revisions of this manuscript.

Corresponding author

Correspondence to Michael P. Snyder.

Ethics declarations

Competing interests

B.E.B. declares outside interests in Fulcrum Therapeutics, 1CellBio, HiFiBio, Arsenal Biosciences, Cell Signaling Technologies, BioMillenia, and Nohla Therapeutics. P.F. is a member of the Scientific Advisory Boards of Fabric Genomics, Inc. and Eagle Genomics, Ltd. M.P.S. is cofounder and scientific advisory board member of Personalis, SensOmics, Mirvie, Qbio, January, Filtricine, and Genome Heart. He serves on the scientific advisory board of these companies and Genapsys and Jupiter. Z.W. is a cofounder of Rgenta Therapeutics and she serves on its scientific advisory board. R.M.M. is an advisor to DNAnexus and Decheng Capital, and has outside interests in IMIDomics, Accuragen and ReadCoor, Inc. The authors declare no other competing financial interests.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data figures and tables

Extended Data Fig. 1 ENCODE timeline.

Pilot phase: September 2003–September 2007; ENCODE 2: September 2007–September 2012; ENCODE 3: September 2012–January 2017; ENCODE 4: February 2017–present; modENCODE: April 2007–April 2012; mouse ENCODE: 2009–2012.

Supplementary information

This file contains the full author list for The ENCODE Project Consortium, and Supplementary Note 1 (Useful URLs).

About this article

Cite this article

The ENCODE Project Consortium., Snyder, M.P., Gingeras, T.R. et al. Perspectives on ENCODE. Nature 583, 693–698 (2020). https://doi.org/10.1038/s41586-020-2449-8

  • DOI: https://doi.org/10.1038/s41586-020-2449-8
