Elsevier

New Biotechnology

Volume 29, Issue 5, 15 June 2012, Pages 543-549
Research paper
Advances in human proteomics at high scale with the SOMAscan proteomics platform

https://doi.org/10.1016/j.nbt.2011.11.016

In 1997, while still working at NeXstar Pharmaceuticals, several of us made a proteomic bet. We thought then, and continue to think, that proteomics offers a chance to identify disease-specific biomarkers and improve healthcare. However, interrogating proteins turned out to be a much harder problem than interrogating nucleic acids. Consequently, the ‘omics’ revolution has been fueled largely by genomics. High-scale proteomics promises to transform medicine with personalized diagnostics, prevention, and treatment. We have now reached into the human proteome to quantify more than 1000 proteins in any human matrix – serum, plasma, cerebrospinal fluid (CSF), bronchoalveolar lavage (BAL), and also tissue extracts – with our new SOMAmer-based proteomics platform. The surprising and pleasant news is that we have made unbiased protein biomarker discovery a routine and fast exercise. The downstream implications of the platform are substantial.

Introduction

The new fields of ‘omics’ have changed research in and understanding of biology. For many years scientists have been trained to do hypothesis-driven experiments, yet omics have made possible a new kind of research – hypothesis-generating experiments [1, 2, 3, 4, 5]. The core of this transition is that omics provide a deep experimental outcome that allows evidence, gathered in an unbiased form (see below), to guide further experiments. The notion is simple yet profound: when experiments are difficult and data sparse, one needs a hypothesis to track toward useful answers. Conversely, when experiments are easy and data voluminous, one needs after-thought more than a hypothesis (good experimental design remains essential).

The omics revolution provides an opportunity to discover biologically useful markers for health status. However, one must wrestle with the omics dilemma – should we investigate genomics (be it SNPs, CNVs, or epigenetics), transcriptomics, microRNAs, proteomics, or metabolomics, and which is likely to yield the most crucial information for understanding human biology and health?

We decided in 1997 that proteomics was most likely to yield the most valuable biological and health information [6]. We recognize, however, that everything we think about protein biomarkers is thought by people working in the other omics arenas. It will take another decade of experimental work to know whether proteomics will yield the most valuable information about health. The non-proteomic omics were easier to approach experimentally – genomics, transcriptomics, and microRNAs are essentially the outgrowth of hybridization and now sequencing technologies. Metabolomics is today driven by machines we have understood for years – NMR and mass spectrometry (MS). Proteomics has been dominated by two technologies that are, so far, inadequate for deep unbiased proteomic measurements – enzyme-linked immunosorbent assays (ELISAs), based on antibodies, and MS. Large unbiased arrays of ELISAs do not scale to large numbers of analytes [5, 7], while MS is difficult to do quantitatively and is best used for the most abundant proteins in a sample [8, 9, 10, 11, 12, 13, 14, 15]. Thus, the so-called omics revolution refers to insights mostly gained from large-scale studies of nucleic acids, not proteins.

Advances in high-scale proteomics would transform personalized medicine by enabling the unbiased discovery of protein biomarkers essential for developing powerful new diagnostics and treatments. Proteins provide an immediate measure of health status that is not reflected in the genome, which remains largely the same throughout life. We often show a photo of the young Jim Watson, gazing lovingly with Francis Crick at the structure of DNA, and we contrast that picture with a photo (from the web, of course) of Watson decades later when his complete genome sequence was published [16]. Watson's health status has changed, of course, but his genome sequence is essentially unchanged. Watson's genome sequence can provide information about his relative risk for diseases and perhaps what happened to him biologically during his long productive career, but a profile of his blood proteome might provide an immediate measure of his health status and detect early disease when treatment is more successful.

With the astounding development of today's DNA sequencing technology [17] we can easily imagine sequencing everyone's genome, if (or when, as is commonly assumed) this proves useful. The goal of SomaLogic is to allow profiling people's blood proteome economically and regularly. We believe that such longitudinal and personal proteomic information will deliver powerful and actionable information to people, relating to their health and disease. This idea is shared, of course, with many groups – Mathias Uhlén and Leroy Hood come immediately to mind [18, 19, 20]. However, high-scale proteomics has proved more difficult than high-scale genomics due to the difficulty of protein measurement paradigms. The human proteome contains an estimated 20,000 proteins – plus splicing and post-translational variants – that span a concentration range of at least ∼12 logs and probably more [21]. Proteomic measurements demand extreme sensitivity, specificity, dynamic range, and accurate quantification [7, 22]. Unlike genomic technologies, proteomics has no equivalent to hybridization [23, 24]. Hybridization has the allure of typing – with no knowledge other than the rules for base-pairing, any nucleic acid sequence immediately yields its complement, a sequence that will bind to the initial sequence with high accuracy and affinity. Hybridization was the key element in the genomic revolution. We note, parenthetically, that the ‘allure of typing’ was behind the attractiveness of several drug development paradigms: antisense, catalytic antisense (ribozymes), and, more recently, siRNAs.

Efforts at high-scale proteomics

Attempts at high-scale proteomics began with 2D gels, including the pioneering work by Patrick O’Farrell conducted in the Gold Lab [25]. Current approaches to high-scale proteomics mostly employ MS and antibody-based technologies [7, 13]. MS can deliver specific analytical capabilities and the technology has advanced remarkably over the past decade to overcome many technical and methodological challenges to the point where experts believe it is ‘ready for the big time’ of high throughput

A novel high-scale proteomics solution

We have developed a new proteomics technology platform [39] (also see recent News and Views [40, 41]) and demonstrated its capability of human proteomics at high scale [42]. Our SOMAscan proteomics platform comprises a new class of SOMAmer molecular recognition elements (MREs) and a novel affinity assay, which together constitute a platform with exceptional analytical performance, rapid extensibility to measure new analytical targets, and great flexibility for reconfiguration. Our current assay

Protein target menu

The current version of our assay measures 1030 human proteins (http://www.somalogic.com). We estimate that ∼5000 of the ∼20,000 proteins in the human proteome are present in blood and represent potential blood-based biomarkers, and we are steadily expanding our target menu. The current collection of 1030 protein targets represents a wide range of sizes, physicochemical properties, such as isoelectric points (pI) of proteins that range from 4 to 11 and follow the abundance distribution of

Diagnostics

We are applying the SOMAscan proteomics platform to develop diagnostics for conditions of unmet clinical need in several areas including oncology, cardiovascular disease, renal disease, neurological disease, inflammatory disease, infectious disease, and also wellness. Frequently in our studies, the distributions of biomarker concentrations in two populations overlap to some degree, which creates the need to combine multiple biomarkers to achieve the most accurate diagnosis.
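The value of combining overlapping biomarkers can be illustrated with a minimal sketch under idealized assumptions that are ours, not taken from the studies described here: each marker is Gaussian with equal variance in both populations, the markers are statistically independent, and each separates the group means by a hypothetical 1 standard deviation. Under those assumptions the area under the ROC curve (AUC) for a separation d is Φ(d/√2), and the best linear combination of independent markers has a separation equal to the root sum of squares of the individual separations.

```python
import math


def normal_cdf(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def auc_from_separation(d: float) -> float:
    """AUC when case and control values are Gaussian with equal variance
    and means that differ by d standard deviations."""
    return normal_cdf(d / math.sqrt(2.0))


def combined_separation(separations: list[float]) -> float:
    """Separation of the best linear combination of independent markers
    (illustrative assumption: no correlation between markers)."""
    return math.sqrt(sum(d * d for d in separations))


# Hypothetical panel: three markers, each separating the groups by 1 SD.
panel = [1.0, 1.0, 1.0]
single_auc = auc_from_separation(panel[0])
panel_auc = auc_from_separation(combined_separation(panel))

print(f"single marker AUC: {single_auc:.3f}")
print(f"three-marker panel AUC: {panel_auc:.3f}")
```

In this idealized setting the panel AUC is higher than that of any single marker. Real markers are usually correlated, so each additional marker would contribute less than the sketch suggests.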

Below we highlight

Chronic kidney disease

We recently published a clinical biomarker study of chronic kidney disease (CKD), the slow loss of kidney function over time [39]. CKD is a growing global public health epidemic that is ‘common, harmful, and treatable’ with an estimated prevalence of nearly 10% worldwide [28]. Early intervention in CKD can substantially improve prognosis, which is otherwise poor [28, 29, 30, 31]. Early diagnosis of CKD requires predictive, non-invasive biomarkers, which could also be useful for monitoring disease

Lung cancer

The first published large-scale clinical application of our SOMAscan platform was a study to discover and verify novel biomarkers for lung cancer [42]. This is among the most comprehensive proteomic biomarker studies published to date. Lung cancer is the leading cause of cancer deaths because ∼84% of cases are diagnosed at an advanced stage [1, 2, 3]. Worldwide in 2008, ∼1.5 million people were diagnosed and ∼1.3 million died [4] – a survival rate unchanged since 1960. However, patients who are

Conclusion

Together these published and unpublished results demonstrate the power of the SOMAscan proteomic technology for discovering robust biomarkers for developing new products for personalized medicine. Our general approach of unbiased biomarker discovery can be applied to many more conditions including infectious, inherited, neurological, and metabolic diseases. The SOMAscan platform is enabling a broad pipeline of diagnostics across many diseases.

Acknowledgements

We have many people to thank for contributing immeasurably to this effort with countless insights about aptamers, SOMAmers, proteomics, biochemistry, bioinformatics, engineering, medicine, clinical studies, and the overarching nature of biology. We especially thank past and present colleagues at SomaLogic, NeXstar Pharmaceuticals, and the University of Colorado. We also thank our collaborators for the biomarker discovery work reviewed here: William L. Bigbee, Wilbur Franklin, York E. Miller,

References (54)

  • S.L. Seurynck-Servoss

    Evaluation of surface chemistries for antibody microarrays

    Analytical Biochemistry

    (2007)
  • R. Ostroff

    The stability of the circulating human proteome to variations in sample collection and handling procedures measured with an aptamer-based proteomics array

    Journal of Proteomics

    (2010)
  • G.J. Nabel

    The coordinates of truth

    Science

    (2009)
  • A.D. Lander

    The edges of understanding

    BMC Biology

    (2010)
  • D.J. Glass

    A critique of the hypothesis, and a defense of the question, as a framework for experimentation

    Clinical Chemistry

    (2010)
  • D.B. Kell et al.

    Here is the evidence, now what is the hypothesis? The complementary roles of inductive and hypothesis-driven science in the post-genomic era

    BioEssays

    (2004)
  • E.N. Brody et al.

    High-content affinity-based proteomics: unlocking protein biomarker discovery

    Expert Review of Molecular Diagnostics

    (2010)
  • T.A. Addona

    Multi-site assessment of the precision and reproducibility of multiple reaction monitoring-based measurements of proteins in plasma

    Nature Biotechnology

    (2009)
  • R. Aebersold

    A stress test for mass spectrometry-based proteomics

    Nature Methods

    (2009)
  • A.W. Bell

    HUPO test sample study reveals common problems in mass spectrometry-based proteomics

    Nature Methods

    (2009)
  • L.A. Liotta et al.

    Mass spectrometry-based protein biomarker discovery and measurement: sensitivity is the greatest hurdle

    Clinical Proteomics

    (2010)
  • P. Mitchell

    Proteomics retrenches

    Nature Biotechnology

    (2010)
  • S. Pan

    Mass spectrometry based targeted protein quantification: methods and applications

    Journal of Proteome Research

    (2009)
  • R.F. Service

    Proteomics ponders prime time

    Science

    (2008)
  • D.A. Wheeler

    The complete genome of an individual by massively parallel DNA sequencing

    Nature

    (2008)
  • J. Shendure et al.

    Next-generation DNA sequencing

    Nature Biotechnology

    (2008)
  • M. Uhlen

    Towards a knowledge-based Human Protein Atlas

    Nature Biotechnology

    (2010)