Spot 42 RNA and the small protein SpfP were also a topic in the recent TWiM #262 podcast.
I recently mentioned in passing that "...one is poking a wasps' nest when studying cAMP·CRP regulated genes." Are you ready to do some poking with me? Well, in a back corner of said wasps' nest lies a twisted nails puzzle, the solution to which I will reveal here. You would be right to suspect that a twisted nails puzzle can only come from E. coli's toy box of small RNAs (into which Matthias Gimpel let us peek here).
To begin with, let me remind you that these 'small RNAs' are by no means small, lightweight molecules. The sRNA in question here, Spot 42 RNA, has a length of 109 bases, corresponding to a molecular weight of 34.9 kDa. When properly folded, it is bulkier than the 210 amino acid-long monomer of CRP (also known as Crp, or CAP), which has a molecular weight of 23.7 kDa. CRP and its active form, the dimeric cAMP·CRP complex, will also play a role in the following, as will the crosstalk between Spot 42 and CRP.
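For readers who like to check such numbers, here is a back-of-the-envelope sketch of the size comparison. The average masses are textbook rules of thumb and my own assumption, not values from the sources discussed here, and the function names are made up for illustration. Exact masses depend on the actual sequence, so the estimates only land near the quoted 34.9 kDa and 23.7 kDa.

```python
# Rule-of-thumb average masses (assumptions, not taken from the post):
AVG_RNA_NT_DA = 320.5   # average mass of one nucleotide residue in single-stranded RNA
RNA_END_DA = 159.0      # common correction term for a 5'-triphosphate end
AVG_AA_DA = 110.0       # average mass of one amino acid residue in a protein


def rna_mass_kda(n_nucleotides: int) -> float:
    """Estimate the mass of a single-stranded RNA from its length alone."""
    return (n_nucleotides * AVG_RNA_NT_DA + RNA_END_DA) / 1000.0


def protein_mass_kda(n_residues: int) -> float:
    """Estimate the mass of a protein from its number of residues alone."""
    return n_residues * AVG_AA_DA / 1000.0


spot42_kda = rna_mass_kda(109)      # Spot 42 RNA, 109 bases
crp_kda = protein_mass_kda(210)     # CRP monomer, 210 amino acids

print(f"Spot 42 RNA: ~{spot42_kda:.1f} kDa")
print(f"CRP monomer: ~{crp_kda:.1f} kDa")
print(f"RNA/protein mass ratio: ~{spot42_kda / crp_kda:.1f}")
```

Even with these crude averages, the 'small' RNA comes out roughly one and a half times as heavy as the CRP monomer it is compared with.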
The first nail

In antiquity, RNAs and modified nucleotides were named after their positions on gels or 2D‑fingerprints. Think of (p)ppGpp that triggers the stringent response and which was initially termed 'magic spot'. Likewise, Spot 42 RNA was detected way back in 1973 by two-dimensional (2D) gel electrophoresis as a moderately abundant non‑ribosomal RNA of E. coli when cells were labeled briefly with [32P]‑phosphate. It was found later that Spot 42 RNA (Figure 1) accumulates in E. coli cells grown in glucose‑medium (~200 copies/cell, half-life ~12 min at 37°C), but not in cells grown on succinate as carbon source or when cAMP was added to the growth medium. Spot 42 RNA is transcribed from the spf (spot forty-two) gene, whose expression is negatively regulated by cAMP·CRP (the cAMP·CRP binding site is highlighted in light gray in Figure 2). In retrospect, these were the first hints that Spot 42 RNA is in some way involved in carbon catabolite repression (CCR), the wasps' nest. And it is the first nail of today's puzzle.
When E. coli cells run low on glucose during growth, they activate via cAMP·CRP the expression of genes for so-called non-PTS sugars, that is, sugars that are not imported via the phosphotransferase system, galactose among them. In the gal operon, the downstream promoter P1 and the upstream promoter P2 are both negatively modulated by the GalR repressor, whose activity is in turn regulated by the presence or absence of galactose. cAMP·CRP stimulates transcription from P1 and inhibits transcription from P2. In principle, this is the same regulation as in the lac operon, but more complicated because the two promoters each have their own operator. In any case, the resulting polycistronic galETKM mRNA is not uniformly translated, that is, the relative synthesis of the enzymes GalE, GalT, and GalK varies under different metabolic conditions.
Møller et al. (2002) "nailed" one culprit for this so-called 'discoordinate expression' of the galETKM mRNA as they found that Spot 42 RNA covers the galK translational initiation region via (complementary) base pairing, thus preventing its translation (see Figure 1). Galactokinase, GalK, is the enzyme that funnels galactose into glycolysis (Leloir pathway), and blocking its synthesis makes sense as long as cells have a sufficient supply of their preferred sugar, glucose. Thus, cAMP·CRP represses transcription of spf (Spot 42) but activates expression of the gal operon while, in turn, Spot 42 prevents translation of galK mRNA. This counteracting – and in a way counter‑intuitive – regulation ensures that E. coli uses a depleting glucose pool for as long as possible before then rapidly switching to galactose utilization. Clever.
To make things just a tad more complicated still, Spot 42 RNA needs the RNA chaperone Hfq to efficiently silence the translation of its target mRNAs (see Figure 1 for the Hfq binding sites). This is a double-edged sword – and a real challenge for systems biologists who model networks – because Hfq both binds and stabilizes various sRNAs and simultaneously exposes them to degradation by RNase E.
The twisted nails puzzle

It was noticed early on that an open reading frame (ORF) coding for a 15 amino acid-long protein is completely contained within the Spot 42 RNA sequence. That's why I came up with calling it a twisted nails puzzle (Figures 1+2). In a recent paper, Aoyama et al. (2022) from Gisela Storz's lab at the NIH, Bethesda MD, got neatly around the semantic cliff of whether these 15 amino acids are an oligopeptide or a protein by succinctly calling it SpfP (spot forty-two protein). And they showed that it is indeed translated, which was controversial before. Proof came from applying the FLAG-tag approach, that is, by affinity‑tagging the ORF in spf and subsequently detecting a polypeptide of the expected size in immunoblots of lysates from cells chromosomally expressing the fusion gene.
Now, how could they separate the protein-coding activity of spf from its base-pairing activities? For functional tests, the authors constructed four spf variants, the first being the wild-type spf gene ('SpfR⁺·SpfP⁺' in the following, to avoid their somewhat character-rich nomenclature). In the second variant, a sense codon in the small ORF was converted into a stop codon (UUA>UGA) to prevent its full translation ('SpfR⁺·SpfP⁻'). For variant three, the ORF was re‑coded so as to allow its translation with identical amino acids while impairing the formation of the stem-loop structure of SpfR and leaving other regions for base-pairing to target mRNAs intact ('SpfR⁻·SpfP⁺', see Figures 1+2). In the fourth variant, derived from the 'SpfR⁻·SpfP⁺' construct, a stop codon in the ORF prevents its full translation ('SpfR⁻·SpfP⁻'). All four spf variants were over‑expressed from plasmids in a Δspf strain background with various reporter genes.
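To illustrate what the UUA>UGA change does to translation, here is a minimal sketch. Note that the ORF sequence below is a made-up toy example, not the real spf sequence, and the codon table is deliberately partial, covering only the codons of that toy ORF (the assignments themselves follow the standard genetic code).

```python
# Partial standard genetic code: only the codons used in the toy ORF below.
CODON_TABLE = {
    "AUG": "M", "AAA": "K", "UUA": "L", "GGU": "G", "CAU": "H",
    "UGA": "*", "UAA": "*",   # "*" marks a stop codon
}


def split_codons(mrna: str) -> list[str]:
    """Cut an in-frame mRNA string into consecutive triplets."""
    return [mrna[i:i + 3] for i in range(0, len(mrna) - 2, 3)]


def translate(mrna: str) -> str:
    """Translate codon by codon and halt at the first stop codon."""
    peptide = []
    for codon in split_codons(mrna):
        amino_acid = CODON_TABLE[codon]
        if amino_acid == "*":
            break
        peptide.append(amino_acid)
    return "".join(peptide)


# HYPOTHETICAL toy ORF (not the spf sequence): AUG AAA UUA GGU CAU UAA
toy_orf = "AUGAAAUUAGGUCAUUAA"

# Convert the sense codon UUA (third codon) into the stop codon UGA.
codons = split_codons(toy_orf)
assert codons[2] == "UUA"
codons[2] = "UGA"
toy_orf_stop = "".join(codons)

print(translate(toy_orf))       # full-length toy peptide
print(translate(toy_orf_stop))  # truncated after the second residue
```

A single base change thus cuts the peptide short while leaving the rest of the RNA sequence, and with it most of its base-pairing potential, untouched. That is the idea behind the second and fourth variants.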
From their results, I pick here one particularly revealing example. They over‑expressed the spf variants in a Δspf strain background and challenged cells pre-grown in rich medium with glucose to grow in minimal medium with galactose as sole carbon source. Note that for growth these cells need to activate the transcription of the gal operon by cAMP·CRP, while its repression by GalR would already be relieved by the galactose present in the medium. Repression of spf by cAMP·CRP would not play a role here since the variants were expressed from a plasmid promoter. Additionally, the cells would have to overcome the block of galK mRNA translation by Spot 42 RNA (see above). The authors found that in galactose-grown cells, over‑expression of wild-type Spot 42 ('SpfR⁺·SpfP⁺') produces a severe growth defect (in fact, just above "no growth" given their 16-hour incubation) that persists following introduction of a stop codon in 'SpfR⁺·SpfP⁻', showing that the sRNA itself confers a phenotype (Figure 3). In contrast, the stop codon mutation in 'SpfR⁻·SpfP⁻' eliminated the growth phenotype associated with 'SpfR⁻·SpfP⁺' overexpression. Together, these data show that spf functions as both an sRNA and an mRNA. Could SpfP interfere with cAMP·CRP-mediated activation of the gal operon?
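The logic of this four-variant experiment can be written down as a tiny truth table. This is my own simplified summary of the phenotypes described above, not an analysis from the paper; the labels and the function name are made up for illustration, and "growth defect" is reduced to a yes/no call.

```python
def growth_defect_on_galactose(srna_active: bool, protein_made: bool) -> bool:
    """Simplified reading of the reported phenotypes: over-expression
    impairs growth on galactose if EITHER the base-pairing sRNA OR the
    full-length small protein is present."""
    return srna_active or protein_made


# (sRNA base-pairing intact?, full-length SpfP translated?) per variant
variants = {
    "wild type":                  (True, True),
    "stop codon in ORF":          (True, False),
    "re-coded ORF":               (False, True),
    "re-coded ORF + stop codon":  (False, False),
}

for name, (srna_active, protein_made) in variants.items():
    defect = growth_defect_on_galactose(srna_active, protein_made)
    print(f"{name:28s} growth defect: {defect}")
```

Only the variant lacking both activities grows normally, which is exactly what you expect if the sRNA and the small protein each suffice to cause the defect.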
In a series of experiments with differentially 'tagged' and in vivo expressed SpfP and CRP variants, Aoyama et al. (2022) found that SpfP indeed associates with CRP (too complicated to recap in detail here). And this is clearly not just a 'protein-protein interaction', as the association is tight enough that the small SpfP protein remained bound to CRP during SDS-PAGE, that is, during gel electrophoresis of denatured proteins. Looking into this association in more detail by directed mutagenesis, they found that H10 of SpfP is crucial for the interaction with CRP. In CRP, conversely, the L51M, L62F, and S84R mutations are individually sufficient to suppress interaction with SpfP. In the known CRP 3D-structure, residues L51, L62, and S84 are clustered near a secondary cAMP-binding site and the 'activating region 3' (AR3). Thus, SpfP binding to CRP can impede the activation of promoters that depend on the cAMP·CRP contact with the α-subunit of RNA polymerase. In the immunoblots of whole-cell lysates, the authors also detected a similarly tight association of SpfP with another protein, larger than CRP, which they have not yet identified. So, the interaction/association with CRP is the first known function of this tiny protein, SpfP (formally a 'quindecim peptide', ha!).
With Spot 42 demonstrated to be a dual-function sRNA and the functions of SpfR and SpfP nailed, is the twisted nails puzzle untwisted? Almost, but I'll leave the puzzle in the wasps' nest at this point. Two more twists will be the topics of an upcoming post.