Research paper
Advances in human proteomics at high scale with the SOMAscan proteomics platform
Introduction
The new ‘omics’ fields have changed both research in biology and our understanding of it. For many years scientists have been trained to do hypothesis-driven experiments, yet omics have made possible a new kind of research – hypothesis-generating experiments [1, 2, 3, 4, 5]. The core of this transition is that omics provide a deep experimental outcome that allows evidence, gathered in an unbiased form (see below), to guide further experiments. The notion is simple yet profound: when experiments are difficult and data sparse, one needs a hypothesis to track toward useful answers. Conversely, when experiments are easy and data voluminous, one needs afterthought more than a hypothesis (good experimental design remains essential).
The omics revolution provides an opportunity to discover biologically useful markers for health status. However, one must wrestle with the omics dilemma – should we investigate genomics (be it SNPs, CNVs, or epigenetics), transcriptomics, microRNAs, proteomics, metabolomics, and which is likely to yield the most crucial information for understanding human biology and health?
We decided in 1997 that proteomics was most likely to yield the most valuable biological and health information [6]. We recognize, however, that everything we think about protein biomarkers is thought by people working in the other omics arenas. It will take another decade of experimental work to know whether proteomics will yield the most valuable information about health. The other non-proteomic omics were easier to approach experimentally – genomics, transcriptomics, and microRNAs are essentially the outgrowth of hybridization and now sequencing technologies. Metabolomics is today driven by machines we have understood for years – NMR and mass spectrometry (MS). Proteomics has been dominated by two technologies that are, so far, inadequate for deep unbiased proteomic measurements – enzyme-linked immuno-sorbent assays (ELISAs), based on antibodies, and MS. Large unbiased arrays of ELISAs do not scale to large numbers of analytes [5, 7], while MS is difficult to do quantitatively and is best used for the most abundant proteins in a sample [8, 9, 10, 11, 12, 13, 14, 15]. Thus, the so-called omics revolution refers to insights mostly gained from large-scale studies of nucleic acids, not proteins.
Advances in high-scale proteomics would transform personalized medicine by enabling the unbiased discovery of protein biomarkers essential for developing powerful new diagnostics and treatments. Proteins provide an immediate measure of health status that is not reflected in the genome, which remains largely the same throughout life. We often show a photo of the young Jim Watson, gazing lovingly with Francis Crick at the structure of DNA, and we contrast that picture with a photo (from the web, of course) of Watson decades later when his complete genome sequence was published [16]. Watson's health status has changed, of course, but his genome sequence is essentially unchanged. Watson's genome sequence can provide information about his relative risk for diseases and perhaps what happened to him biologically during his long productive career, but a profile of his blood proteome might provide an immediate measure of his health status and detect early disease when treatment is more successful.
With the astounding development of today's DNA sequencing technology [17] we can easily imagine sequencing everyone's genome, if (or when, as is commonly assumed) this proves useful. The goal of SomaLogic is to allow people's blood proteomes to be profiled economically and regularly. We believe that such longitudinal and personal proteomic information will deliver powerful and actionable insight to people about their health and disease. This idea is shared, of course, with many groups – Mathias Uhlén and Leroy Hood come immediately to mind [18, 19, 20]. However, high-scale proteomics has proved more difficult than high-scale genomics because of the limitations of protein measurement paradigms. The human proteome contains an estimated 20,000 proteins – plus splicing and post-translational variants – that span a concentration range of at least ∼12 orders of magnitude and probably more [21]. Proteomic measurements demand extreme sensitivity, specificity, dynamic range, and accurate quantification [7, 22]. Unlike genomic technologies, proteomics has no equivalent to hybridization [23, 24]. Hybridization has the allure of typing – with no knowledge other than the rules for base-pairing, any nucleic acid sequence immediately yields its complement, a sequence that will bind to the initial sequence with high accuracy and affinity. Hybridization was the key element in the genomic revolution. We note, parenthetically, that the ‘allure of typing’ was behind the attractiveness of several drug development paradigms: antisense, catalytic antisense (ribozymes), and, more recently, siRNAs.
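The ‘allure of typing’ can be made concrete with a minimal sketch: given only the Watson-Crick base-pairing rules, a few lines of code return the strand that would hybridize to any DNA sequence. The function name and the example sequence are hypothetical and chosen only for illustration; the sketch ignores RNA, ambiguous bases, and binding thermodynamics.

```python
# Watson-Crick pairing rules for DNA: A pairs with T, G pairs with C.
PAIRING = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the strand that would hybridize to `seq`, both written 5'->3'."""
    # Reverse the sequence (the strands are antiparallel), then swap each base
    # for its pairing partner.
    return "".join(PAIRING[base] for base in reversed(seq.upper()))


target = "ATGGCCATTG"  # hypothetical target sequence, for illustration only
probe = reverse_complement(target)
print(probe)
```

No comparable lookup table exists for proteins, which is the point of the contrast drawn above: a binding reagent for a protein has to be found experimentally rather than typed.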
Efforts at high-scale proteomics
Attempts at high-scale proteomics began with 2D gels, including the pioneering work by Patrick O’Farrell conducted in the Gold Lab [25]. Current approaches to high-scale proteomics mostly employ MS and antibody-based technologies [7, 13]. MS can deliver specific analytical capabilities and the technology has advanced remarkably over the past decade to overcome many technical and methodological challenges to the point where experts believe it is ‘ready for the big time’ of high throughput
A novel high-scale proteomics solution
We have developed a new proteomics technology platform [39] (also see recent News and Views [40, 41]) and demonstrated its capability of human proteomics at high scale [42]. Our SOMAscan proteomics platform comprises a new class of SOMAmer molecular recognition elements (MREs) and a novel affinity assay, which together constitute a platform with exceptional analytical performance, rapid extensibility to measure new analytical targets, and great flexibility for reconfiguration. Our current assay
Protein target menu
The current version of our assay measures 1030 human proteins (http://www.somalogic.com). We estimate that ∼5000 of the ∼20,000 proteins in the human proteome are present in blood and represent potential blood-based biomarkers, and we are steadily expanding our target menu. The current collection of 1030 protein targets represents a wide range of sizes, physicochemical properties, such as isoelectric points (pI) of proteins that range from 4 to 11 and follow the abundance distribution of
Diagnostics
We are applying the SOMAscan proteomics platform to develop diagnostics for conditions of unmet clinical need in several areas including oncology, cardiovascular disease, renal disease, neurological disease, inflammatory disease, infectious disease, and also wellness. Frequently in our studies, the distributions of a biomarker's concentration in two populations overlap to some degree, which creates the need to combine multiple biomarkers to achieve the most accurate diagnosis.
Below we highlight
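The need to combine overlapping biomarkers can be illustrated with a small simulation. This is a sketch under stated assumptions, not the classifier used in the studies: two hypothetical, independent markers are each drawn from a unit-variance Gaussian whose mean is shifted by one standard deviation in the disease group, and the "panel" is simply their sum. Discrimination is scored as the area under the ROC curve (AUC), computed as the fraction of case/control pairs ranked correctly.

```python
import random


def auc(controls, cases):
    """Fraction of (case, control) pairs in which the case scores higher.

    Ties count as half a win. This equals the area under the ROC curve.
    """
    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def simulate(n=400, shift=1.0, seed=0):
    """Draw two independent hypothetical markers for n controls and n cases."""
    rng = random.Random(seed)
    # Assumption: both markers are Gaussian with unit variance, and disease
    # raises each mean by `shift`, so the two distributions overlap heavily.
    controls = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(n)]
    cases = [(rng.gauss(shift, 1.0), rng.gauss(shift, 1.0)) for _ in range(n)]
    return controls, cases


controls, cases = simulate()
auc_a = auc([a for a, _ in controls], [a for a, _ in cases])
auc_b = auc([b for _, b in controls], [b for _, b in cases])
auc_panel = auc([a + b for a, b in controls], [a + b for a, b in cases])

print(f"marker A alone: AUC = {auc_a:.3f}")
print(f"marker B alone: AUC = {auc_b:.3f}")
print(f"A + B combined: AUC = {auc_panel:.3f}")
```

Because the markers carry independent information, their sum separates the two groups better than either marker alone. Real panels would weight markers and account for correlation between them.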
Chronic kidney disease
We recently published a clinical biomarker study of chronic kidney disease (CKD), the slow loss of kidney function over time [39]. CKD is a growing global public health epidemic that is ‘common, harmful, and treatable’ with an estimated prevalence of nearly 10% worldwide [28]. Early intervention in CKD can substantially improve prognosis, which is otherwise poor [28, 29, 30, 31]. Early diagnosis of CKD requires predictive, non-invasive biomarkers, which could also be useful for monitoring disease
Lung cancer
The first published large-scale clinical application of our SOMAscan platform was a study to discover and verify novel biomarkers for lung cancer [42]. This is among the most comprehensive proteomic biomarker studies published to date. Lung cancer is the leading cause of cancer deaths because ∼84% of cases are diagnosed at an advanced stage [1, 2, 3]. Worldwide in 2008, ∼1.5 million people were diagnosed and ∼1.3 million died [4] – a survival rate unchanged since 1960. However, patients who are
Conclusion
Together these published and unpublished results demonstrate the power of the SOMAscan proteomic technology for discovering robust biomarkers for developing new products for personalized medicine. Our general approach of unbiased biomarker discovery can be applied to many more conditions including infectious, inherited, neurological, and metabolic diseases. The SOMAscan platform is enabling a broad pipeline of diagnostics across many diseases.
Acknowledgements
We have many people to thank for contributing immeasurably to this effort with countless insights about aptamers, SOMAmers, proteomics, biochemistry, bioinformatics, engineering, medicine, clinical studies, and the overarching nature of biology. We especially thank past and present colleagues at SomaLogic, NeXstar Pharmaceuticals, and the University of Colorado. We also thank our collaborators for the biomarker discovery work reviewed here: William L. Bigbee, Wilbur Franklin, York E. Miller,
References (54)
The use of aptamers in large arrays for molecular diagnostics. Molecular Diagnosis: a journal devoted to the understanding of human disease through the clinical application of molecular biology (1999)
et al. Proteomics and diagnostics: Let's Get Specific, again. Current Opinion in Chemical Biology (2008)
et al. Biomarker discovery and clinical proteomics. Trends in Analytical Chemistry (2010)
The human plasma proteome: history, character, and diagnostic prospects. Molecular & Cellular Proteomics (2002)
et al. Let's get specific: the relationship between specificity and affinity. Chemistry & Biology (1995)
High resolution two-dimensional electrophoresis of proteins. The Journal of Biological Chemistry (1975)
SISCAPA peptide enrichment on magnetic beads using an in-line bead trap device. Molecular & Cellular Proteomics (2009)
et al. An automated and multiplexed method for high throughput peptide immunoaffinity enrichment and multiple reaction monitoring mass spectrometry-based quantification of protein biomarkers. Molecular & Cellular Proteomics (2010)
Kinetic amplification of enzyme discrimination. Biochimie (1975)
Developments in the production of biological and synthetic binders for immunoassay and sensor-based detection of small molecules. Trends in Analytical Chemistry (2011)