How Proteomics Got Started
by Fred Neidhardt
The following is a personal account of the beginning years of what is now called proteomics. Odd that I, almost a Luddite, should be writing about the origin of a field initiated by a dramatic technical advance; I tend to avoid complex new scientific instruments and techniques. As a graduate student under Boris Magasanik at Harvard Medical School during the early 1950s, I was glad that my project (induced enzyme synthesis in bacteria) could readily be approached with simple technology. Bacterial growth could be monitored turbidimetrically with a Klett colorimeter; the same instrument could provide colorimetric assays of enzyme activities. Only the phage geneticists of that era, using sterile toothpicks to pick viral recombinants or mutants from plaques on Petri dishes, had it technologically easier.
Around me at that time in Harvard University’s Department of Bacteriology and Immunology (now Microbiology and Molecular Genetics) were gifted individuals who on occasion were forced to purify proteins using laborious and personally onerous techniques. Not a life for me, I decided, even though H. Edwin Umbarger assured me that purifying an enzyme “developed character.”
Besides laziness, there was a second, more fundamental reason I never purified a protein. Cell growth was the biological event that had hooked me as a graduate student, and work that began by smashing cells into little bits seemed inappropriate.

Nevertheless, within the next six years I would find myself absorbed in two major aspects of cell growth physiology that involved proteins, and these subjects would prove more intractable than the purification of proteins. Catabolite repression (or, more generally, how bacterial cells choose to utilize multiple carbon sources) and growth rate modulation (how bacterial cell size and composition are interrelated with growth rate) were both processes directly tied to cell growth rate.
The comprehensive work of Schaechter, Maaløe, and Kjeldgaard on Salmonella cell growth rate and composition appeared shortly before my studies with Klebsiella aerogenes. These studies established fundamental laws of bacterial growth in the early 1960s. Nevertheless, the laws were supported only by observation and by the easy rationale of their selective value to the cell; they were bereft of biochemical explanation. Both catabolite repression and growth rate modulation proved to be fascinating but vexing; only now, fifty years later, are these processes approaching mechanistic solution.
A major reason for their intractability lay in limitations in our ability to approach the living bacterial cell. For most of the 20th century the study of the physiology of bacteria (and, indeed, all other organisms) was largely reductionistic. The living cell was taken apart and studied biochemically, or was dissected by the increasingly powerful marriage of biochemistry and genetics. The triumphs of this approach were notable.
Still, catabolite repression and growth rate modulation joined a list of questions that could not be answered by the reductionistic approaches of biochemistry, even when augmented by the power of genetic analysis. Questions of the following sort had to be postponed (or were never asked) because the tools to approach them were not available:
- Why don’t bacteria of a given species grow at the same rate on all carbon and energy sources?
- Is there a growth rate-limiting step during steady state growth of a bacterial culture?
- How many changes take place in a bacterium transitioning from growth to non-growth?
- What causes the size and macromolecular composition of a bacterial cell to be much more dependent on its rate of growth than on the chemical nature of its food?
- How do bacterial cells prioritize their choice of food when given options?
By the mid-1970s my mind, filled with unanswered questions about growth physiology, was searching for a new way to approach the bacterial cell. That way was revealed, not by anyone in my laboratory, but by a graduate student named Patrick O’Farrell at the University of Colorado at Boulder. Steen Pedersen, a postdoctoral fellow in my laboratory at the University of Michigan and one of the keenest disciples of Ole Maaløe in Copenhagen (and one of his most honest critics), returned from a visit to Colorado in 1974 and reported to our laboratory that a graduate student there had produced a two-dimensional polyacrylamide gel system that could resolve the proteins of a bacterial cell in an array that looked as cool as “the sky on a starry night.”
Steen’s information electrified us, for we realized that a fundamentally new approach to bacterial growth physiology had become possible. Instead of asking the cell for information about a protein of interest to us, we could finally interrogate the cell about the proteins that were important to IT in any given situation. The cell could now reveal to us what lay behind the biological Green Door (in reference to an infamous American pornographic film of that era). For the first time the road to a global analysis of cell physiology could be imagined. And, in retrospect, it is clear that the era of proteomics began in 1975, the date of publication of Patrick O’Farrell’s thesis research in the Journal of Biological Chemistry. His paper was quickly recognized by a variety of molecular biologists as a true technological breakthrough. Citations in the next 30 years numbered over 16,000 (in spite of the fact that the manuscript was initially rejected with two disparaging reviews, which eventually had to be overruled by members of the journal’s editorial board).
For the first time we could now learn what the cell had to teach us about its complement of proteins and about adjustments to different environmental conditions. This new ability to listen to the cell led soon to new insights into growth rate physiology. But before this could happen it was necessary to add several features to the O’Farrell technique.
First, we recognized that we had to standardize O’Farrell’s two-dimensional gel system in order to compare the protein arrays from different samples. This required extreme attention to details of procedure and quality of reagents. The genius of O’Farrell’s system was that it employed two independent properties of proteins to separate them: their molecular weight and their isoelectric point. Isoelectric focusing in a gel tube containing ampholines to establish a pH gradient produced the first dimension—proteins lined up by their charge. Placing the resulting tubular gel on an electrophoretic gel slab containing sodium dodecyl sulfate allowed the polypeptides previously resolved by charge to be segregated by their size. The resulting two-dimensional polyacrylamide gel (2-D gel) was then stained and dried for subsequent inspection. A beautiful picture—but to be useful, 2-D gels had to be reproducible, and this was not an easy task for a number of reasons. In the end it took years of perfecting sample preparation and gel casting (not to mention improvements in ampholines) to reach the stage where computer-driven pattern matching could align a whole series of “starry patterns” from the multiple samples of an experiment.
Second, once the pattern-matching problem was in hand (no small feat), the issue became one of accurate measurement of the quantity of protein in the individual spots across the gel set. Clever uses, first of isotopes, then of differentially colored samples, were devised to obtain reasonable quantification. As a result, it became possible for the cell to display much of the array of changes made in its proteome (the totality of its several thousand proteins) as the cell adapted to its environment.
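For readers who like to see the arithmetic, the kind of cross-gel quantification described above can be caricatured in a few lines. The spot names and intensity values below are invented for illustration; the actual methods (isotopes, differentially colored samples, scanner software) were far more sophisticated:

```python
# A toy sketch of cross-gel spot quantification: intensities from each
# gel are normalized to that gel's total signal, so the same spot can be
# compared across samples loaded or exposed unevenly.  All names and
# numbers here are hypothetical.

def normalize(gel):
    """Express each spot as a fraction of the gel's total intensity."""
    total = sum(gel.values())
    return {spot: value / total for spot, value in gel.items()}

control  = {"spot_a": 500.0,  "spot_b": 250.0, "spot_c": 250.0}
stressed = {"spot_a": 2000.0, "spot_b": 500.0, "spot_c": 500.0}

ctrl_n = normalize(control)
str_n  = normalize(stressed)

# Fold-change of each spot in the stressed sample relative to control.
fold = {spot: str_n[spot] / ctrl_n[spot] for spot in ctrl_n}
```

Note that normalizing first matters: spot_a's raw intensity rises fourfold, but after correcting for the brighter stressed gel its relative share rises only modestly.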
Fortunately, these tasks of standardizing and quantifying O’Farrell gels were approached by many individuals skilled in scientific technology. James Garrels at Cold Spring Harbor Laboratory, Norman G. and N. Leigh Anderson at Argonne National Laboratory, and Julio Celis at the University of Aarhus, Denmark, were some of the people who early on used their considerable skills to expand the usefulness of 2-D gel technology.
But still a third attribute had to be added to 2-D gels for maximum usefulness: the identities of the “starry” spots on the gels had to be determined. For the bacterium Escherichia coli and its close cousins, my laboratory in Ann Arbor mounted a full-scale effort to correlate spots on the 2-D gels with known proteins. Hundreds of protein spots were identified through the use of purified proteins (donated, naturally, by others) and mutants in known genes. Everyone in my laboratory contributed to this effort; unfair as this is, I’ll single out only two because of their germinal work in identifying spots and because of their tireless energy in teaching the 2-D gel process to all the others: Ruth A. VanBogelen and Teresa Phillips.
Needless to say, the identification of spots might be regarded as tedious drudgery—and it was—save for the thrill that we were simultaneously making discovery after discovery using the 2-D gels: heat-shock and cold-shock proteins, proteins under stringent control, proteins that vary monotonically with growth temperature, proteins that vary with growth rate—and we were not simply learning which proteins exhibit a certain behavior, but what fraction of the cell’s proteome was involved in different physiological responses to stress or starvation. These discoveries led Ruth VanBogelen and her colleagues to the concept of protein signatures. A protein signature is the set of proteins that, by their amplification or suppression, signal a particular physiological stress state of the cells. One learned how to recognize when a cell was in a state of energy starvation, or oxidative stress, or membrane damage, or… the list goes on. One can imagine the gigantic usefulness of this approach when a pharmaceutical company is exploring how a potential therapeutic agent acts.
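The protein-signature idea lends itself to a toy classifier sketch. The signature contents, protein labels, and threshold below are hypothetical illustrations of the concept, not actual E. coli data or the VanBogelen group's method:

```python
# Minimal sketch of "protein signatures": each stress state is defined
# by a set of proteins expected to be amplified ("+") or suppressed
# ("-").  Signatures and protein names are invented for illustration.
SIGNATURES = {
    "heat shock":        {"GroEL": "+", "DnaK": "+", "SpotX": "-"},
    "energy starvation": {"SpotY": "+", "SpotZ": "-", "DnaK": "-"},
}

def score_signature(fold_changes, signature, threshold=2.0):
    """Fraction of a signature's proteins whose measured fold-change
    (stressed / control) agrees with the expected direction."""
    hits = 0
    for protein, direction in signature.items():
        fc = fold_changes.get(protein, 1.0)  # unmeasured -> unchanged
        if direction == "+" and fc >= threshold:
            hits += 1
        elif direction == "-" and fc <= 1.0 / threshold:
            hits += 1
    return hits / len(signature)

def classify(fold_changes):
    """Return the stress state whose signature best matches the data."""
    return max(SIGNATURES, key=lambda s: score_signature(fold_changes, SIGNATURES[s]))

# Fold-changes from a hypothetical stressed-vs-control gel pair.
observed = {"GroEL": 8.0, "DnaK": 4.5, "SpotX": 0.3, "SpotY": 1.1}
print(classify(observed))  # -> heat shock
```

A pharmaceutical screen of the kind mentioned above would run the same comparison in reverse: expose cells to a candidate drug and ask which known stress signature, if any, the resulting protein pattern matches.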
But we should bring this story to a close quickly, because from the mid-1990s onward the explosion of cell protein technology transformed the field from what Pat O’Farrell had created into one with a formidable arsenal of techniques for protein resolution and measurement. The term proteome was introduced in 1996 to refer to the totality of proteins in a cell, and it quickly gave rise to the noun proteomics to designate studies of the proteome. The 2-D gel technique introduced by Pat O’Farrell has inspired others to develop improved techniques for monitoring the global pattern of a cell’s total protein complement. The availability of DNA sequences with reasonably accurate annotations for the genomes of hundreds of species has made it possible to develop separation techniques that enable tandem mass spectrometry to provide the “second dimension” to primary fractionation procedures and, as a result, enable protein identifications an order of magnitude beyond what was achieved in the first two decades of the 2-D era.
To be sure, the current armamentarium of proteomics is being used in highly targeted ways to explore previously identified sets of “proteins of interest” (as our law enforcement agencies might call them), but I want to emphasize that Pat O’Farrell’s development of the first method of spreading out the proteins of a cell was at the start, and particularly for me, the initiation of an exciting new grammar of scientific questioning.
Frederick C. Neidhardt is F.G. Novy Distinguished University Professor, Emeritus, Department of Microbiology and Immunology, University of Michigan Medical School at Ann Arbor.