by Brooke Anderson
Don't you wonder how a single species can range from protective gut commensal to lethal pathogen to laboratory best friend, and yet still be Escherichia coli? With such broad variety, how similar can all these strains be? True, their conserved genes have a 98% sequence similarity, but other genes vary by up to 40% between the lab strain and pathogenic strains.1 Even members of the same serotype vary greatly. In fact, the Shigella species, which since practically day one of microbiology have been deemed worthy of their own genus, map phylogenetically within the wide boundaries of E. coli.2
With worlds of microbiology still unexplored, E. coli may seem like a bland and rather exhausted topic. Yet I have been musing whether our domesticated lab strains of E. coli are actually good models for gram negative bacteria in nature. Of course E. coli today is frequently used to simply stock genetic material, but there are also many labs that use it to investigate basic prokaryotic physiology and behavior. Is this justified?
E. coli is an old friend in the lab, and was so even before tractable toolkits were established to make genetic manipulation so much easier. What was it about E. coli a century ago that prompted the scientific community to adopt it so completely as the prokaryotic model?
E. coli’s history begins in 1886 when the bacterium was discovered by Theodor Escherich. Inspired by the germ theory of disease and the then-current work of Louis Pasteur and Robert Koch, the German-Austrian pediatrician collected feces of his young patients. From infants with diarrhea, he isolated a fast-growing, rod-shaped microbe that he named 'Bacterium coli commune'. In 1919, after his death, the bacterium was renamed Escherichia coli in his honor. According to Joshua Lederberg, a 1958 Nobel Prize winner for discovering bacterial conjugation, E. coli was used by researchers from the very start for the reasons most model species are favored: it grew fast on various media, it was easily isolated and identified on media such as MacConkey's agar, and many strains were harmless.
The parent of our modern lab strain E. coli MG1655 is the K-12 strain, isolated in 1922 from the stool of a diphtheria patient and stored in Stanford stocks (as of the late 1990s the original strain was still maintained in stab cultures).3 Berkmen and Riggs previously presented the story of this strain on these pages here in STC. In the 1930s, Charles Clifton at Stanford used the stock culture to study the growth characteristics of E. coli, presumably because of how simple it is to grow in liquid culture, but also because "of the more extensive information available on its metabolism and the ease with which contaminants may be detected."4 This echoes Lederberg's assessment that early on researchers valued E. coli's rapid growth, innocuous nature, and ease of identification.
From Clifton, the K-12 strain moved into the hands of Edward L. Tatum for early biochemical studies. Following Avery, MacLeod, and McCarty's evidence that DNA was the hereditary material in strains of pneumococci, Joshua Lederberg, then working in E. L. Tatum's lab, began to work on establishing genetic tools for his then-favored fungus, Neurospora. He apparently ran into enough problems in creating mutants in this species that he began to look for a better candidate.5 When he was given Tatum's K-12 strain, he readily created auxotrophic mutants that, when mixed together, eliminated each other's auxotrophies and provided the first evidence of a prokaryote's ability to sexually transfer genes.6
It turns out that most wild strains of E. coli do not conjugate. So how did Tatum and Lederberg manage to stumble upon one of the few strains that did? According to Barbara Bachmann, a widely regarded E. coli historian, "that this virile strain of E. coli, one of the relatively few found to possess significant fertility in the laboratory, should have been the one which C. E. Clifton chose to give to E. L. Tatum as the latter set out to produce mutant strains of bacteria was apparently just a particularly happy accident. K-12 was thought to be an entirely typical coli culture."7 From here on out, what Lederberg dubbed the biological "Matthew effect" (in reference to a line in the Gospel of Matthew which essentially posits that the rich get richer and the poor get poorer) took hold: E. coli K-12, with the richest collection of knowledge and tools, continued to accumulate more strain-centric findings as it was passaged in labs across the globe.8 Bacteriophage research became centered on E. coli and its lambda phage. Arthur Kornberg used it to make his seminal contribution to the understanding of DNA replication in 1956.
In the 50 years since the accretion of genetic tools for E. coli, the K-12 strain has generated an extended pedigree. In 1972 Barbara Bachmann began to collect and arrange a large collection of strains from an impressive foray through the records of several labs. The MG1655 commonly used today derives from Lederberg’s K-12 collection, where it was cured of the lambda phage by UV light and of the F plasmid by acridine orange so that it could no longer perform conjugation.7 The MG1655 strain (named by Mark Guyer) is considered to be the closest to the original K-12 strain, apart from these two lost genetic elements, mutations in the ilvG and rfb genes, uncertainty about the "true" wild-type allele of the rpoS gene, and a frameshift mutation in the rph gene encoding RNase PH.9
So it seems reasonable to conclude that E. coli became the most studied prokaryote, deliberately or not, for the exact reasons that most models arise: it grows fast and cheaply; it can be isolated and identified by simple methods; it won't harm the researchers; and a good amount of literature on it already existed. So, what can E. coli still tell us about prokaryotes in nature?
Obviously, no single organism is representative of others even of its own species, much less beyond that. This is true even for a laboratory strain that has been passaged under laboratory conditions for decades, because it accumulates mutations that adapt it to growth in the lab on the media it is fed. Richard Lenski showed that after 33,000 generations over fifteen years, E. coli REL606 evolved the ability to aerobically utilize citrate as a carbon source, which it had never been able to do before.10 A similar experiment tested which genes, when inactivated, could provide a fitness advantage to cells passaged in liquid culture for 60 – 90 generations. The genes whose loss was found to provide a fitness advantage were flagellar genes, which aid in motility and biofilm formation but are energetically expensive.11
This is an example of where studying something as biologically fundamental as flagellar structure could be undermined by the domestication of the strain. Comparing genotypes of strains, both lab-domesticated and isolated from the wild, has been and will continue to be aided by the ease of sequencing prokaryotic genomes. However, I would also underline the importance of understanding and considering the media used to propagate strains. For E. coli strains isolated from the wild, especially those involved in health studies such as food safety, studies might benefit from a synthetic medium customized to the substrates and conditions typical of the environment where they were found.
For an added layer of complexity, modern microbiologists are now hyper-aware of the microbial biomes that exist everywhere, modulating the activity of their constituents. The natural environments in which wild strains are found always include other microbial life. Garth Ehrlich and colleagues note that "the biofilm provides an ideal setting for bacterial horizontal gene transfer."12 They thus propose "The Distributed Genome Hypothesis," which looks through the whole population to accumulate a species-genome, therefore accounting for instances of horizontal gene transfer. This allows researchers to zoom out from the lone, unrepresentative strain of E. coli to examine the capabilities of all members of a species. With this new perspective of studying microbes such as E. coli in the context of their environmental neighbors, we can start to understand how and why E. coli genomes have diverged to such extremes, both in labs and in nature.
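For readers who think in code, the idea of accumulating a species-genome across a population can be sketched with plain set operations. This is only a toy illustration: the strain names and gene labels below are hypothetical placeholders, not real E. coli data, and the function names are my own rather than terminology from Ehrlich and colleagues.

```python
# Toy gene inventories. Strain and gene names are hypothetical placeholders,
# not real E. coli annotations.
strain_genes = {
    "strain_A": {"gene1", "gene2", "gene3"},
    "strain_B": {"gene1", "gene2", "gene4"},
    "strain_C": {"gene1", "gene5"},
}


def species_genome(genomes):
    """Every gene found in at least one strain (the pooled 'species-genome')."""
    return set().union(*genomes.values())


def core_genes(genomes):
    """Genes shared by every strain in the collection."""
    return set.intersection(*genomes.values())


def distributed_genes(genomes):
    """Genes present in some strains but not all of them."""
    return species_genome(genomes) - core_genes(genomes)


if __name__ == "__main__":
    print("species-genome:", sorted(species_genome(strain_genes)))
    print("core:", sorted(core_genes(strain_genes)))
    print("distributed:", sorted(distributed_genes(strain_genes)))
```

In this toy case no single strain carries more than three of the five genes in the pool, which is the point of zooming out from one strain to the population.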
- Welch RA, Burland V, Plunkett G 3rd, Redford P, Roesch P, Rasko D, Buckles EL, Liou SR, Boutin A, Hackett J, Stroud D, Mayhew GF, Rose DJ, Zhou S, Schwartz DC, Perna NT, Mobley HL, Donnenberg MS, Blattner FR. 2002. Extensive mosaic structure revealed by the complete genome sequence of uropathogenic Escherichia coli. Proc Natl Acad Sci U S A, 99 (26), 17020 – 17024 PMID 12471157
- Touchon M, Hoede C, Tenaillon O, Barbe V, Baeriswyl S, Bidet P, Bingen E, Bonacorsi S, Bouchier C, Bouvet O, Calteau A, Chiapello H, Clermont O, Cruveiller S, Danchin A, Diard M, Dossat C, Karoui ME, Frapy E, Garry L, Ghigo JM, Gilles AM, Johnson J, Le Bouguénec C, Lescat M, Mangenot S, Martinez-Jéhanne V, Matic I, Nassif X, Oztas S, Petit MA, Pichon C, Rouy Z, Ruf CS, Schneider D, Tourret J, Vacherie B, Vallenet D, Médigue C, Rocha EP, Denamur E. 2009. Organised genome dynamics in the Escherichia coli species results in highly diverse adaptive paths. PLoS Genet, 5 (1), e1000344 PMID 19165319
- Bachmann BJ. 1996. Derivations and Genotypes of Some Mutant Derivatives of Escherichia coli K-12. In: Escherichia coli and Salmonella, 2nd Ed. p 2460 – 2488. ASM Press (out of print) ISBN-13: 978-1555810849 (textbooks.com)
- Cleary JP, Beard PJ, Clifton CE. 1935. Studies of Certain Factors Influencing the Size of Bacterial Populations. J Bacteriol, 29 (2), 205 – 213 PMID 16559778
- Johnston M, et al. 2016. Joshua Lederberg on Bacterial Recombination. Genetics, 203, 613 – 614 http://www.ncbi.nlm.nih.gov/pubmed/27270693
- Tatum EL, Lederberg J. 1947. Gene Recombination in the Bacterium Escherichia coli. J Bacteriol, 53, 673 – 684 http://www.ncbi.nlm.nih.gov/pubmed/16561324
- Bachmann BJ. 1972. Pedigrees of some mutant strains of Escherichia coli K-12. Bacteriol Rev, 36, 525 – 557 http://www.ncbi.nlm.nih.gov/pubmed/4568763
- Lederberg J. 2004. E. coli K-12. Microbiology Today 31, 116.
- Hayashi K, Morooka N, Yamamoto Y, Fujita K, Isono K, Choi S, Ohtsubo E, Baba T, Wanner BL, Mori H, Horiuchi T. 2006. Highly accurate genome sequences of Escherichia coli K-12 strains MG1655 and W3110. Mol Syst Biol, 2, 2006.0007 PMID 16738553
- Blount ZD, Barrick JE, Davidson CJ, Lenski RE. 2012. Genomic analysis of a key innovation in an experimental Escherichia coli population. Nature, 489 (7417), 513 – 518 PMID 22992527
- Edwards RJ, Sockett RE, Brookfield JF. 2002. A simple method for genome-wide screening for advantageous insertions of mobile DNAs in Escherichia coli. Curr Biol, 12 (10), 863 – 867 PMID 12015126
- Ehrlich GD, Ahmed A, Earl J, Hiller NL, Costerton JW, Stoodley P, Post JC, DeMeo P, Hu FZ. 2010. The distributed genome hypothesis as a rubric for understanding evolution in situ during chronic bacterial biofilm infectious processes. FEMS Immunol Med Microbiol, 59 (3), 269 – 279 PMID 20618850
Brooke Anderson is a graduate student in the lab of Rachel Dutton at UC San Diego. She enjoys basking in her luck to work with the tastiest of microbiological subjects, cheese.