by Christoph
The concept that bacteriophages are HGT agents is old hat, a 90-year-old hat to be precise*). I mention just two well‑studied cases: the nasty symptoms of cholera and of bacillary dysentery in humans are triggered by toxins produced by the causative bacteria, Vibrio cholerae and Shigella dysenteriae, respectively. Yet in both cases the genes for toxin synthesis are found in resident prophages, that is, temperate phage genomes integrated into the chromosomes of the host bacteria: the CTXφ prophage in the case of V. cholerae, and the Stx prophage in the case of several Shigella species and E. coli O157:H7 (EHEC). (CTXφ and Stx are not related, nor are the toxins they encode.) Take a bacterial human pathogen and chances are good that you will find virulence factors whose genes were acquired by formerly commensal bacteria via HGT, often transferred by bacteriophages.
Along the banks of Helicase River
Figure 1. (mono-genic) DnaA tree for 54 Enterobacterales with one representative species for each genus from every family; V. harveyi DnaA (Vibrionales) was used as outgroup. Grey dots indicate genomes containing the dnaTC gene pair; species names in square brackets: unfinished genomes with known dnaA genes but uncertainty with respect to the presence of a dnaTC gene pair. DnaA sequences were obtained from the NCBI protein database. Alignment of DnaA domains 3 + 4 (gap open=10; gap extension=1.0) and tree building (Neighbor Joining, Jukes-Cantor, 1,000 replicates) were done with the CLC SequenceViewer 7.7.1 software. By the author (2017)
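For readers curious about what the 'Jukes-Cantor' setting in the caption does: it converts the raw fraction of differing positions between two aligned sequences into an estimated number of substitutions per site, correcting for repeated changes at the same position. Here is a minimal Python sketch of that correction, assuming the generalized form for a 20-letter amino acid alphabet. The function names and the toy sequences are hypothetical, made up for illustration only, and this is not necessarily how CLC SequenceViewer computes it internally.

```python
import math


def p_distance(seq_a, seq_b, gap="-"):
    """Fraction of differing sites, counting only columns without gaps."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to equal length")
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != gap and b != gap]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)


def jukes_cantor(p, states=20):
    """Jukes-Cantor distance for an alphabet of `states` equally likely letters.

    Assumption: the generalized formula d = -b * ln(1 - p / b),
    with b = (states - 1) / states (b = 3/4 for DNA, 19/20 for proteins).
    """
    b = (states - 1) / states
    if p >= b:
        return math.inf  # saturated: the correction is undefined
    return -b * math.log(1 - p / b)


if __name__ == "__main__":
    # Toy aligned fragments (NOT real DnaA sequences)
    a = "MKTAY-LQ"
    b = "MKSAYGLE"
    p = p_distance(a, b)
    print(f"p-distance: {p:.4f}, Jukes-Cantor distance: {jukes_cantor(p):.4f}")
```

Computing this distance for every pair of sequences gives the distance matrix that a Neighbor Joining algorithm takes as its input.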
Less well known is the fact that some bacteria recruited proprietary phage genes for their own 'operating systems'. I illustrate this with an example involving the processes of DNA replication initiation and replication restart. For replication in Escherichia coli to commence, the DNA‑bound initiator protein DnaA directs a tight complex of the replicative helicase, DnaB, with the helicase loader, DnaC, to a small unwound region within the replication origin, oriC. After ejecting DnaC from the complex, the DnaB helicase calls the primase, DnaG, to action on the single-stranded DNA, and then rapidly propels forward, unwinding dsDNA at a velocity of ~550 bp/s. After assembling at the primer laid down by primase, a full-blown replisome follows suit (virtually pushing the helicase in front to 'speed it up!'). When a replisome stalls due to a roadblock (damaged DNA, for example, or heavy transcription in the opposite direction) and the replication fork eventually collapses, replication restarts by reloading of the DnaB·DnaC helicase complex. This requires a different type of primosome than initiation does, one that, in the case of E. coli, involves DnaT together with a bunch of other proteins. Again, once DnaC is ejected from the DnaB·DnaC complex, replication resumes stepwise just as during initiation.
Two of the replication proteins mentioned, DnaT and DnaC, are encoded by the dnaTC gene pair in E. coli. As Weigel & Seitz observed earlier (and still earlier Daubier & Ochman), their homology to various initiator/helicase loader gene pairs of lambdoid phages suggests they were acquired by HGT. The homology is particularly striking for the E. coli Rac and Salmonella Gifsy-1 prophages (see here for a 'visual summary' of the nested homologies).
Support for the hypothesis of E. coli's acquisition of the dnaTC gene pair via HGT comes from the observation that homologous gene pairs are found exclusively in the Enterobacterales order of the Gammaproteobacteria (no one knows which helicase loader the pseudomonads or vibrios employ, for example, or if they employ one at all). And, within the Enterobacterales, the presence of a dnaTC gene pair is, with few exceptions, confined to the families Enterobacteriaceae, Hafniaceae, and Erwiniaceae (Figure 3) (note that in Fig. 3, I use a DnaA tree as proxy for a species tree but the 'taxonomic sorting' is nevertheless fairly good). E. coli and Salmonella enterica, two Enterobacteriaceae with a dnaTC gene pair, had their last common ancestor ~10⁸ years ago. But just when the founder of the three affected families that picked up the dnaTC gene pair diverged is anybody's guess. It nevertheless seems that this HGT is, on an evolutionary time scale, a 'very recent' event and would thus belong to the 'red layer' of Figure 2D in part 1 (maybe with a bluish hue). Briefly looking back to the valley of astonishing trees (part 1 of the Tour d'Horizon), the persistence of the dnaTC gene pair in three families within the Enterobacterales is in line with the notion of Dagan et al. that genes, once acquired by a genome via HGT, tend to be kept.
The jolly exchange of genes encoding replication factors among phages and their bacterial hosts doesn't stop here. Genes for the helicase loader proteins DnaI, DnaB and DnaD of Bacillus subtilis and various other members of the Firmicutes also show signs of having a phage origin, with ancestors of Listeria phage φA118 and Streptococcus phage φSM1 as possible sources (the Firmicutes people use a different terminology, so there is often no correspondence to the names of E. coli genes/proteins). In the reverse direction of gene transfer, numerous phages of Gram-positive and Gram-negative hosts picked up the E. coli DnaB‑type replicative helicase of their bacterial hosts, while either keeping their cognate DnaC-type helicase loader (φP27, φSPP1) or dismissing a loader altogether (φP22, φSf6, φ11, φ3626). Not actually a case of HGT, but Lemonnier et al. demonstrated that the φP1 Ban protein, a DnaB-type helicase, can functionally substitute for a mutant DnaB in E. coli. This observation would make digging into the phylogenetic archaeology of DnaB proteins fun: to look for the replacement of compromised dnaB genes in bacteria by the homologous phage genes. Just imagine how the barely penetrable jungle in Figure 2 of part 1 would look once the contributions of phage genes were included (in my opinion, they deserve to be included)!
*) Frobisher M, Brown J. 1927. Transmissible toxicogenicity of Streptococci. Bull Johns Hopkins Hosp 41: 167–173.