Not all bacterial endosymbionts come from the "Buchnera branch" of the Gammaproteobacteria...
...and not only insects harbor bacterial endosymbionts. If you've followed our posts on symbioses between animals and bacteria over the last two years or so, you could easily have got the impression that they occur mainly between insects and Gammaproteobacteria from the order Enterobacterales of this large and diverse class.1) This impression would be skewed, but only because our choice of topics was! Margaret McFall-Ngai and colleagues posited in their PNAS Perspective from 2013 that "...all biologists will be challenged ... to include investigations of the relationships between and among bacteria and their animal partners as we seek a better understanding of the natural world." 2) So, already a decade ago we knew well that animal–bacterial symbioses involve various animals (metazoa), not just insects. And the insects themselves harbor symbiotic bacteria from a number of different phyla (see Fig. 1).
Aware of our skewed choice of "symbiotic topics", and intending to broaden the perspective, Elio featured in his recent post the symbiosis of the mouthless and gutless catenulid flatworm Paracatenula polyhymnia with its thiotrophic (sulfide-oxidizing) endosymbiont 'Candidatus Riegeria santandreae', an Alphaproteobacterium from the taxonomic family Rhodospirillaceae. I will chime in here with two aspects: endosymbiont genome sizes and the evolutionary history of symbioses.
"Extreme" genome shrinkage is not always the case among bacterial endosymbionts
Jäckle et al. state in the abstract of their paper that the Riegeria santandreae endosymbiont (Alphaproteobacteria, order Rhodospirillales) of the Paracatenula santandrea flatworm "has a drastically smaller genome (1.34 Mb) than the symbiont's free-living relatives (4.29 – 4.97 Mb)". To put the rather dramatic-sounding "drastically" into a quantitative perspective, it is worth recalling that genome sizes of ~1.5 Mb are not that uncommon among free-living bacteria from different phyla. Here are four examples. The free-living marine Alphaproteobacterium Pelagibacter ubique HTCC1062 (order Pelagibacterales) has a genome size of 1.3 Mb (1,354 ORFs). The free-living marine Deltaproteobacterium Hippea maritima MH2 has a genome size of 1.69 Mb (1,723 ORFs). The genome sizes of Prochlorococcus marinus strains (Cyanobacteria) range from 1.6 to 2.7 Mb (core genome: ~1,250 ORFs). The Betaproteobacterium Polynucleobacter necessarius is a symbiont of Euplotes ciliates that cannot be cultivated separately from its host. Its genome is 1.56 Mb in size (~1,279 ORFs), which is not much different from that of its free-living sibling P. necessarius strain QLW-P1DMWA-1 (2.2 Mb, 2,075 ORFs). The former has not only fewer genes but also a higher percentage of pseudogenes, that is, orthologs of genes from the latter rendered non-functional by randomly interspersed stop codon(s) and/or small deletions likely caused by a "replication slippage" mechanism. So, bacteria of different phyla can apparently sustain a free-living lifestyle with genome sizes of 1.5 Mb – 2.0 Mb, corresponding to a coding capacity for ~1,500 – 2,000 proteins. These genomes are rightly considered "streamlined": it was shown for Pelagibacter, Prochlorococcus, and the free-living P. necessarius strain QLW-P1DMWA-1 that these species are particularly well adapted to their rather static oligotrophic habitats but lack the metabolic versatility of relatives with larger genomes.
The notion among microbiologists of a "normal sized" bacterial genome was – and probably still is – more or less calibrated on model organisms like E. coli K-12 MG1655 (4.64 Mb, 4,242 ORFs) or B. subtilis 168 (4.22 Mb, 4,237 ORFs) that are, when set on a whole range of different diets, capable of a free-living lifestyle as planktonic cells in Erlenmeyer flasks. Yet, life outside the lab is even more varied, and so are genome sizes in bacteria. They range from >10 Mb as for Sorangium cellulosum (14.8 Mb, 11,599 ORFs) down to <0.2 Mb as for Carsonella ruddii PV (0.16 Mb, 179 ORFs).3) With the E. coli genome as "yardstick", any genome larger than 7 Mb could be considered overblown, and any smaller than 3 Mb as reduced. However, it is probably more meaningful to ask which size of its "tool box", that is, which number of genes – and which genes, of course, but that's more complicated – qualifies a bacterium for a free-living lifestyle. And indeed, there seems to exist a somewhat fuzzy threshold of ~1.5 Mb (~1,500 genes, RNA and ORFs) for genome sizes below which bacteria were so far only found either as "free-living obligate symbionts" or as endosymbionts. Examples of "free-living obligate symbionts" are the extracellular Stammera capleta symbionts (0.27 Mb, 251 ORFs) of thistle tortoise beetles, and the "Lilliputians" from the Candidate Phyla Radiation (CPR) with genome sizes of ~1 Mb. Reconstructed genomes for a small number of species from this large radiation characteristically lack one or several biosynthetic pathways, different ones in different genomes, which renders them auxotrophic. It is presently not known whether the "Lilliputians" thrive as reciprocally complementing and thus self-sufficient consortia, or depend on "larger" prototrophic hosts, or both.
Similarly, all known endosymbionts and obligate intracellular pathogens with genome sizes of approximately 1.0 Mb or smaller are metabolically deficient and depend on their hosts for support; think of Mycoplasma genitalium (0.58 Mb, 515 ORFs) from the phylum Tenericutes. It is appropriate to call their genomes "reduced". This applies, of course, also to the Buchnera (Enterobacterales) endosymbionts of aphids. Nancy Moran's lab found in a very recent study that the genomes of 39 strains isolated from different aphid lineages vary in size from 0.41 Mb to 0.65 Mb (354 to 587 ORFs), and that after 180 million years of symbiosis between progenitors of the aphids and Buchnera, genome reduction has not come to an end. "Extremely reduced" are the Nardonella genomes (~0.2 Mb, ~200 ORFs) that resemble a recipe for an 'in vivo coupled-transcription + translation kit' for tyrosine.
Taken together, the claim that Riegeria santandreae has "a drastically reduced genome" appears to me somewhat overstated. What can hardly be overstated, though, is the apparent stability – maybe better: resistance to genome shrinkage – of its genome on the evolutionary time scale. Interestingly, the same seems to hold true for 'Candidatus Endolissoclinum faulkneri', another Alphaproteobacterium from the family Rhodospirillaceae, and endosymbiont of Lissoclinum patella (Tunicata). The two strains of 'Ca. E. faulkneri' that were studied by Kwan and Schmidt have genome sizes of 1.48 Mb and 1.51 Mb, respectively, with very low coding density (772/778 ORFs) but few recognizable pseudogenes. As said above, the size and content of the "tool box" are more meaningful than the genome size in mega-basepairs.
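To make the "tool box" comparison concrete, here is a minimal Python sketch that computes ORFs per Mb for some of the genomes quoted above. Note the assumptions: ORFs per Mb is only a crude proxy for coding density (a proper figure would be the fraction of bases lying within coding sequences, which is not given here), the strain labels "A" and "B" for 'Ca. E. faulkneri' are hypothetical, and the pairing of 772 ORFs with 1.48 Mb and 778 ORFs with 1.51 Mb is assumed from the order in which the numbers are listed.

```python
# Genome size (Mb) and ORF count, as quoted in this post.
# Strain labels "A"/"B" and their size/ORF pairing are assumptions.
GENOMES = {
    "Pelagibacter ubique HTCC1062": (1.3, 1354),
    "Hippea maritima MH2": (1.69, 1723),
    "Polynucleobacter necessarius (symbiont)": (1.56, 1279),
    "Polynucleobacter necessarius QLW-P1DMWA-1": (2.2, 2075),
    "Escherichia coli K-12 MG1655": (4.64, 4242),
    "Bacillus subtilis 168": (4.22, 4237),
    "Mycoplasma genitalium": (0.58, 515),
    "Carsonella ruddii PV": (0.16, 179),
    "Ca. Endolissoclinum faulkneri A": (1.48, 772),
    "Ca. Endolissoclinum faulkneri B": (1.51, 778),
}


def orfs_per_mb(size_mb, orfs):
    """Crude gene-density proxy: predicted ORFs per megabase of genome."""
    return orfs / size_mb


def density_table(genomes):
    """Return (name, size_mb, orfs, orfs_per_mb) rows, sparsest genome first."""
    rows = [
        (name, size_mb, orfs, orfs_per_mb(size_mb, orfs))
        for name, (size_mb, orfs) in genomes.items()
    ]
    return sorted(rows, key=lambda row: row[3])


if __name__ == "__main__":
    for name, size_mb, orfs, density in density_table(GENOMES):
        print(f"{name:45s} {size_mb:5.2f} Mb {orfs:5d} ORFs {density:7.0f} ORFs/Mb")
```

With these numbers, the two 'Ca. E. faulkneri' strains come out at roughly 520 ORFs per Mb, whereas the other genomes in the table fall between roughly 800 and 1,100 ORFs per Mb, which is what "very low coding density" amounts to by this measure.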
Some symbioses are astonishingly old
Concerning the stability of the Riegeria genome on the evolutionary time scale: when Elio first opened a Wormful of Bugs, he already mentioned the estimated age of the Paracatenula–Riegeria symbiosis of 500 to 620 million years (no discussion here of the rather wide error margin). The onset of this symbiosis would thus date to the middle of the "Cambrian explosion" (541 Mya), that is, within or close to the comparatively short period of 5 – 10 million years (My) in which most of the important animal phyla first appear in the fossil record. This makes it one of the oldest known animal-bacteria symbioses, and one which, since we can study it today, has managed to survive the five major extinction events since the Cambrian. It has sometimes been argued that the endosymbiotic lifestyle is an evolutionary "dead end" for bacteria with reduced genomes as they would not be able to return to the "normal" free-living lifestyle. Seriously, I find it difficult to see 500+ million years of successful existence as an endosymbiont (with a reasonably-sized genome) as an evolutionary impasse for a bacterial species. But what is more fascinating than the mere age of this symbiosis is the finding that animal-bacteria symbioses apparently arose again and again over the last 500 million years: there are no "hot spots" on the timescale that would suggest particular larger-scale environmental conditions favoring their establishment. And, consequently, we find today, next to tenured symbioses, decaying symbioses as well as symbioses that seem to be "underway", or that were established more recently. While writing this, Charles Darwin's "tangled bank" came to my mind: "...so simple a beginning endless forms most beautiful and most wonderful have been, and are being, evolved."
1) There are presently 725 reference-quality genomes for the Gammaproteobacteria in the RefSeq collection (04/19), next to, for example, >5×10^4 different E. coli strain/lineage genomes of varying coverage reflecting this species' pan-genome.
2) Elio referred to the McFall-Ngai et al. paper when he asked six years ago: Whose Planet Is It Anyway?, well worth re-reading!
3) According to Elio, Carsonella ruddii is: The Bacterium That Doesn't Know How To Tie Its Own Shoelaces.