Microbiology (2004), 150, 2313–2325
DOI 10.1099/mic.0.27097-0
Relationship between codon biased genes, microarray expression values and physiological characteristics of Streptococcus pneumoniae
Antonio J. Martín-Galiano,1,3 Jerry M. Wells2 and Adela G. de la Campa1
1 Unidad de Genética Bacteriana (CSIC), Centro Nacional de Microbiología, Instituto de Salud Carlos III, 28220, Majadahonda, Madrid, Spain
2 Bacterial Infection and Immunity Group, Institute of Food Research, Norwich Research Park, Norwich NR4 7UA, UK
Correspondence: Antonio J. Martín-Galiano, [email protected]
Received 13 February 2004
Revised 26 April 2004
Accepted 28 April 2004
A codon-profile strategy was used to predict gene expression levels in Streptococcus
pneumoniae. Predicted highly expressed (PHE) genes included those encoding glycolytic and
fermentative enzymes, sugar-conversion systems and carbohydrate-transporters. Additionally,
some genes required for infection that are involved in oxidative metabolism and hydrogen peroxide
production were PHE. Low expression values were predicted for genes encoding specific
regulatory proteins like two-component systems and competence genes. Correspondence
analysis localized 484 ORFs which shared a distinctive codon profile in the right horn. These
genes had a mean G+C content (33.4 %) that was lower than the bulk of the genome coding
sequences (39.7 %), suggesting that many of them were acquired by horizontal transfer. Half
of these genes (242) were pseudogenes, ORFs shorter than 80 codons or without assigned
function. The remaining genes included several virulence factors, such as capsular genes, iga,
lytB, nanB, pspA, choline-binding proteins, and functions related to DNA acquisition, such as
restriction-modification systems and comDE. In order to compare predicted translation rate with
the relative amounts of mRNA for each gene, the codon adaptation index (CAI) values were
compared with microarray fluorescence intensity values following hybridization of labelled RNA
from laboratory-grown cultures. High mRNA amounts were observed in 32.5 % of PHE genes
and in 64 % of the 25 genes with the highest CAI values. However, high relative amounts of RNA
were also detected in 10.4 % of non-PHE genes, such as those encoding fatty acid metabolism
enzymes and proteases, suggesting that their expression might also be regulated at the level of
transcription or mRNA stability under the conditions tested. The effects of codon bias and
mRNA amount on different gene groups in S. pneumoniae are discussed.
3 Present address: Lehrstuhl für Genomorientierte Bioinformatik, Wissenschaftszentrum Weihenstephan, Am Forum 1, 85354 Freising, Germany.
Abbreviations: CAI, codon adaptation index; COA, correspondence analysis; FU, fluorescence units; Nc, effective number of codons; PHE, predicted highly expressed; RP, ribosomal protein; RSCU, relative synonymous codon usage.
CAI values for all genes of strain TIGR4 are available as supplementary data with the online version of this paper at http://mic.sgmjournals.org.
0002-7097 © 2004 SGM
INTRODUCTION
Streptococcus pneumoniae, commonly known as the pneumococcus, is one of the most important human pathogens
worldwide, causing a number of diseases including
pneumonia, meningitis, otitis media and sinusitis. The
increasing number of clinical isolates found to be antibiotic-resistant (and multidrug-resistant) highlights the importance
of research on the molecular biology of this organism. The
availability of genome sequence data for S. pneumoniae
strain JNR7/87 (TIGR4) of serotype 4 (Tettelin et al., 2001),
strain R6 (an unencapsulated laboratory derivative of a
serotype 2 strain) (Hoskins et al., 2001), and strain G54
(a serotype 19F strain) (Dopazo et al., 2001), provides a wealth
of untapped information with which to analyse codon
usage and its relationship to gene expression and mutational
bias. Besides other mechanisms, codon bias can influence
gene expression by optimization of the translation rate
(Chavancy & Garel, 1981). It is based on the selection of
the third codon position to adapt coding sequences to the
most abundant tRNAs in the cell (Ikemura, 1981) or to
those with more efficient codon–anticodon interaction
kinetics (Grosjean et al., 1978). Although this gene adaptation is species-specific, close similarities can be found in
organisms of the same genus (Sharp, 1991). Highly restrictive codon patterns exist in genes encoding abundant
polypeptides, probably due to a low tolerance to synonymous substitutions that slow down the translation
elongation process (Sharp & Li, 1987b).
The approximate expression level of a gene can be predicted by comparing its codon bias with the profile of
universally highly expressed genes, such as the ribosomal
protein (RP) genes, which are commonly used as a reference set. Algorithms developed for this purpose (Sharp &
Li, 1987a; Karlin & Mrazek, 2000) are adequate for
deciphering the general pattern of gene expression in the
cell, and to detect special enhanced functions in some
micro-organisms, such as DNA and protein repair in
Deinococcus radiodurans and flagellar motility in Treponema pallidum (Karlin & Mrazek, 2000). There is a good
correlation of predicted highly expressed (PHE) genes with
high two-dimensional gel abundances in Bacillus subtilis
and Escherichia coli (Karlin et al., 2001). However, these
algorithms do not allow the detection of genes encoding
proteins that are abundant due to their high stability
rather than to a high translation rate (Karlin et al., 2001)
and, given the large translation capacity of ribosomes,
codon usage restrictions of highly expressed genes should
operate only at critical stages of rapid growth (Kurland,
1991). In accordance with these ideas, the slow-growing
Mycobacterium tuberculosis (24–36 h doubling time) exhibits almost no alternative codon bias among genes that are
PHE in other, fast-growing eubacteria (Andersson & Sharp,
1996; Karlin & Mrazek, 2000).
Codon bias could be an important factor in S. pneumoniae
since its cell-division time under laboratory growth conditions is typically less than 45 min. However, to the best of
our knowledge, systematic studies of the effect of codon
usage on gene expression levels and gene function have
not been reported for the lactic acid group of bacteria. In
addition, a single report has examined the correlations between
codon usage bias and microarray data, and it concerns E. coli (dos Reis
et al., 2003). Given the medical significance of S. pneumoniae, Streptococcus pyogenes, and the viridans group
streptococci, and the industrial importance of the food
lactic acid bacteria, such as Lactococcus lactis and Lactobacillus acidophilus, a study of the relationship between
codon usage, gene expression and gene function is
required. The objective of this study was to analyse the
relationships between the predicted level of gene expression
based on codon usage, actual microarray expression values
and gene function at the genomic level in S. pneumoniae.
METHODS
Synonymous codon usage and statistical analysis. The genomic sequences of S. pneumoniae strain JNR7/87 (TIGR4; Tettelin
et al., 2001) and strain R6 (Hoskins et al., 2001) were obtained from
The Institute for Genomic Research (TIGR, http://www.tigr.org).
Three parameters were calculated, essentially according to the
method of Sharp and Li (1987a): RSCU (relative synonymous codon
usage), w (relative adaptiveness of a codon) and CAI (codon adaptation index).
The RSCU value of a codon is its observed frequency divided by the
frequency expected when all synonymous codons for that amino acid
are used equally. Therefore, RSCU values close to 1.0 indicate a lack
of bias for that codon. w is a normalized version of RSCU, calculated
as the quotient of the RSCU value of a specific codon and the highest
RSCU value for codons encoding the same amino acid. The CAI value
of a gene is the geometric mean of the w values of all its codons.
A w value of 0.001 was assigned to codons never used in the reference
set to avoid CAI values of 0 for genes having those codons. CAI values
for all genes of strain TIGR4 are available as supplementary data
with the online version of this paper (http://mic.sgmjournals.org).
Programs for calculating CAI and the effective number of codons
(Nc) values were written in Visual Basic. Correspondence analysis
(COA) of RSCU values was performed using the GCUA program
(available at http://bioinf.may.ie/gcua/download.html; McInerney,
1998). Briefly, this method plots genes according to their codon usage
in a 59-dimensional space (not including the five non-variant
codons), and then identifies the major trends in codon usage as
those axes through this multidimensional hyperspace which account
for the largest fractions of the variation among genes.
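The definitions of RSCU, w and CAI given above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' Visual Basic programs: the function names are hypothetical, the standard genetic code is assumed, amino acids absent from the reference set are given RSCU = 1, and every sense codon of a gene (including AUG and UGG) enters the geometric mean.

```python
import math
from collections import Counter

# Standard genetic code with bases ordered U, C, A, G at each codon position.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def split_codons(sequence):
    """Split a coding sequence into codons, reading DNA (T) as RNA (U)."""
    rna = sequence.upper().replace("T", "U")
    return [rna[i:i + 3] for i in range(0, len(rna) - len(rna) % 3, 3)]


def rscu(reference_genes):
    """RSCU per sense codon: observed count / count expected under equal use."""
    counts = Counter(c for gene in reference_genes for c in split_codons(gene))
    families = {}
    for codon, aa in CODON_TO_AA.items():
        if aa != "*":  # stop codons are not scored
            families.setdefault(aa, []).append(codon)
    values = {}
    for family in families.values():
        total = sum(counts[c] for c in family)
        for codon in family:
            # Assumption: amino acids absent from the reference set get RSCU = 1.
            values[codon] = counts[codon] * len(family) / total if total else 1.0
    return values


def relative_adaptiveness(rscu_values, unused=0.001):
    """w = RSCU / highest RSCU among synonymous codons; unused codons get 0.001."""
    w = {}
    for codon, value in rscu_values.items():
        aa = CODON_TO_AA[codon]
        best = max(v for c, v in rscu_values.items() if CODON_TO_AA[c] == aa)
        w[codon] = value / best if value > 0 else unused
    return w


def cai(gene, w):
    """Geometric mean of w over the sense codons of a gene."""
    logs = [math.log(w[c]) for c in split_codons(gene) if c in w]
    return math.exp(sum(logs) / len(logs)) if logs else float("nan")
```

With a toy reference set in which AAA is used twice and AAG once, w(AAA) is 1 and w(AAG) is 0.5, so a gene made of one AAA and one AAG codon has a CAI equal to the square root of 0.5.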
Culture conditions, RNA extraction and microarray experiments. S. pneumoniae R6 was grown in Todd–Hewitt medium
(Difco) with 0.5 % yeast extract, adjusted to pH 7.8 (THYE
medium). Cells corresponding to 50 ml cultures were collected at
mid-exponential phase (OD620=0.25), washed with cold 0.9 %
NaCl and stored at −80 °C. Pellets were thawed and cells lysed for
15 min at 37 °C in 10 mM Tris, 1 mM EDTA (pH 8.0), 0.1 %
sodium deoxycholate. RNA was extracted with the RNeasy midi kit
(QIAGEN), including a DNase treatment according to the manufacturer’s
instructions, precipitated with ethanol, washed, and suspended
in 40 μl H2O. Concentration and purity of the RNA samples
were measured using the 2100 Bioanalyser (Agilent). Details of the
construction of the microarrays used in this study have been
described previously (Dagkessamanskaia et al., 2004). The microarrays
included probes for all strain TIGR4 annotated genes (2236)
and probes for 117 R6-specific genes (i.e. less than 90 % similarity,
as deduced by BLAST analysis). To obtain labelled cDNA, a 25 μl
mixture was made with 15 μg RNA, 5 μg random primers (obtained
with the Bioprime DNA labelling kit, Invitrogen), 12 mM DTT,
500 μM each dNTP (except for CTP, which was 240 μM), 2 nM
Cy3- or Cy5-labelled CTP, and 200 units Stratascript (Stratagene)
reverse transcriptase, in the buffer supplied by the manufacturer.
The mixture was incubated overnight at 37 °C and the reaction
stopped by addition of 1.5 μl 20 mM EDTA plus 15 μl 0.1 N NaOH.
After 15 min incubation at 70 °C, 15 μl 0.1 N HCl was added.
Labelled cDNA was treated with the QIAquick PCR purification kit
(QIAGEN), the volume was reduced to 10 μl by lyophilization, and
then 6.1 μg Cot1 human DNA was added, as well as 3× SSC, 0.2 %
SDS, 0.02 M HEPES and 4× Denhardt’s solution, to a final volume
of 90 μl. Samples were treated for 2 min at 100 °C and 10 min at
room temperature, centrifuged twice, and 40 μl of the supernatant
was applied to a microarray slide. After overnight incubation at
63 °C, microarrays were washed and scanned with an Axon 4000A
apparatus, using GenePix Pro 3.0 software. Fluorescence values,
taken as the median of the intensity of all the pixels after subtracting
the surrounding background, corresponded to the mean of three
independent samples, each having four replicates for each gene.
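The way a single fluorescence value per gene is derived from the scanned spots can be illustrated as follows. This is a minimal sketch, not the GenePix Pro procedure itself: the function names and the input layout are hypothetical, and it is assumed that replicate spots are averaged within each sample before the samples are averaged.

```python
from statistics import mean, median


def spot_signal(pixel_intensities, local_background):
    """Median pixel intensity of one spot minus its surrounding background."""
    return median(pixel_intensities) - local_background


def gene_fluorescence(samples):
    """Mean over independent samples of the mean over replicate spots.

    `samples` is a list of samples; each sample is a list of
    (pixel_intensities, local_background) tuples, one per replicate spot.
    """
    per_sample = [
        mean([spot_signal(pixels, background) for pixels, background in spots])
        for spots in samples
    ]
    return mean(per_sample)
```

For the design described above, `samples` would hold three entries with four replicate spots each.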
RESULTS AND DISCUSSION
Multivariate statistics: correspondence analysis
Fig. 1. (A) Plot of the first two axes generated by COA of RSCU values for 2236 ORFs of S. pneumoniae TIGR4. Gene
function symbols in (B): *, RP genes; ,, genes with fewer than 80 codons; $, degenerated and truncated genes. Symbols in
(C): #, PHE genes; %, pathogenicity genes; +, modification-restriction enzyme genes; ., two-component systems; &, DNA
transformation. Symbols in (D): &, lagging-strand genes; dots, leading-strand genes.
To examine the codon usage heterogeneity among S.
pneumoniae genes, COA of the RSCU values of all
ORFs in strain TIGR4 (2236) was performed. Scatter
plots revealed a core region and two ascending horns, as
reported previously for other eubacteria, such as E. coli
(Médigue et al., 1991). The left horn was less dispersed than
the right one (Fig. 1A). A total of 484 ORFs localized in
the right horn (axis 1 values >0 and axis 2 values >0.02).
Half of these genes (242 of 484) were pseudogenes (usually
transposases), were shorter than 80 codons, or encoded
unassigned hypothetical proteins (Fig. 1B). A total of 242
functional genes were present in the right horn (Fig. 1C),
including several genes encoding phosphotransferase
systems, restriction-modification systems, choline-binding
proteins, competence proteins and most genes of the blp
operon (related to toxin production). These genes associated
with the right horn are potentially foreign genes
acquired by horizontal transfer, which have not yet evolved
a codon profile matched to the translation machinery of
S. pneumoniae. They had a mean G+C content (33.4 %)
lower than that of the coding sequences of the whole
genome (39.7 %). Most of the PHE (see below) and RP
genes were localized in the left horn (Fig. 1B, C), indicating
that they share a similar codon bias that is rather different
from the rest of the ORFs. In the first two COA axes,
at least, no significant differences in codon usage were
observed, irrespective of whether the coding
sequence was complementary to the leading (79 % of the
genes) or lagging strand (21 % of the genes) (Fig. 1D).
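The right-horn criterion and the G+C comparison described above can be sketched as follows. The COA coordinates are assumed to come from a program such as GCUA; the function names and the tuple layout are hypothetical, and the mean G+C content is taken here as the plain average of per-gene values, which is an assumption.

```python
def gc_content(sequence):
    """Percentage of G and C bases in a nucleotide sequence."""
    sequence = sequence.upper()
    return 100.0 * (sequence.count("G") + sequence.count("C")) / len(sequence)


def in_right_horn(axis1, axis2):
    """Selection rule quoted in the text: axis 1 > 0 and axis 2 > 0.02."""
    return axis1 > 0 and axis2 > 0.02


def mean_gc_by_group(genes):
    """Mean G+C content of right-horn genes and of all genes.

    `genes` is a list of (sequence, axis1, axis2) tuples.
    """
    all_gc = [gc_content(seq) for seq, _, _ in genes]
    horn_gc = [gc_content(seq) for seq, a1, a2 in genes if in_right_horn(a1, a2)]
    horn_mean = sum(horn_gc) / len(horn_gc) if horn_gc else float("nan")
    return horn_mean, sum(all_gc) / len(all_gc)
```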
Synonymous codon usage: CAI-value calculations
Given the similarity in the correspondence analysis plots
for E. coli (Médigue et al., 1991) and S. pneumoniae
(Fig. 1A), we assumed that highly expressed genes in
S. pneumoniae would have a codon-usage bias which was
positively correlated with the abundance of the isoacceptor
tRNA levels, as occurs in E. coli (Ikemura, 1981). For the
construction of a reference table of w values, 52 of the 56
RP genes were chosen, the four excluded RP genes (prmA,
rpsN, sp0555 and sp0973) having a different codon profile
(CAI<0.5). In accordance with the high A+T content
(60 %) of the genome (Tettelin et al., 2001), A- and U-ending codons were favoured in both gene sets. There was
also a selection for A- and C-ending codons in amino
acids encoded by two codons (Phe, Tyr, His, Gln, Asn,
Lys, Asp, Glu) or three codons (Ile). Considering the only
type of tRNA detected for these nine amino acids (or the
most represented in the case of Lys, Table 1), the results
suggested a selection for a codon–anticodon interaction
without wobble. While 21 out of 61 codons in the RP set
of highly expressed genes had a codon usage bias and w
values below 0.1 (10-fold less than the preferred isocodon),
only one codon in the data for the whole genome set had
a w value less than 0.1 (Table 1).
The CAI algorithm was applied to 1802 non-RP full-length
gene sequences from the TIGR4 strain, all with between
80 and 1500 codons (not including the stop codon). The
distribution of CAI values (0.156–0.866) was unimodal,
with the majority of genes (78.4 %) having CAI values
between 0.200 and 0.400 (Fig. 2A), and the mean and
median CAI values were 0.338 and 0.312, respectively. The
CAI value was found to be independent of gene sequence
length (r²=0.0003, Fig. 2B), suggesting that codon bias
is not a major mechanism directed towards the efficient
translation of long genes. Genes were classified into three
groups with high (CAI>0.500), medium (0.500>CAI>0.250)
and low (CAI<0.250) levels of predicted
expression. The PHE genes represented 7.3 % (131 genes)
of the total, a figure compatible with those found (4–10 %)
in other eubacteria (Karlin & Mrazek, 2000). Predicted
medium- and lowly-expressed genes represented 78.7 %
(1419 genes) and 14.0 % (252 genes) of the total, respectively. The 131 PHE genes were grouped into functional
classes and subclasses (Table 2). As in other fast-growing
bacteria (Karlin et al., 2001), genes of glycolytic enzymes
and translation elongation factors (Table 2, Fig. 2B) were
among the 25 genes with the highest CAI values. Of the
10 most abundantly expressed proteins in Streptococcus
mutans, which is phylogenetically close to S. pneumoniae
(Wilkins et al., 2002), eight homologues are found in the
group of the 25 genes with the highest CAI values in
S. pneumoniae (Table 2), and the remaining two are
encoded by RP genes.
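The three-way classification by CAI and the reported shares can be checked with a short sketch. The function names are hypothetical, and assigning values that fall exactly on a threshold to the medium class is an assumption, since the text gives only strict inequalities.

```python
def expression_class(cai_value):
    """Predicted expression level from CAI, using the cut-offs in the text."""
    if cai_value > 0.500:
        return "high"
    if cai_value < 0.250:
        return "low"
    return "medium"  # assumption: boundary values are treated as medium


def class_shares(counts):
    """Percentage of genes in each class, rounded to one decimal place."""
    total = sum(counts.values())
    return {name: round(100.0 * n / total, 1) for name, n in counts.items()}


# Gene counts reported in the text for the 1802 non-RP genes analysed.
reported = {"high": 131, "medium": 1419, "low": 252}
shares = class_shares(reported)
```

The counts add up to 1802, and the computed shares agree with the 7.3 %, 78.7 % and 14.0 % given in the text.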
PHE genes are expected to use a small number of different
codons. This number, the effective number of codons (Nc; Wright, 1990),
ranges from 20 (when one codon is exclusively
used for each amino acid) to 61 (when the use of alternative
synonymous codons is equally likely). Analysis of
the 1802 TIGR4 genes revealed Nc values ranging from 26
to 61. On average, genes with CAI>0.6 had Nc values
13 units lower than genes with CAI<0.210, and 9.5 units
lower than genes of the whole genome set of the same
length (data not shown).
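The two limits quoted above follow directly from the formula of Wright (1990), Nc = 2 + 9/F2 + 1/F3 + 5/F4 + 3/F6, where Fk is the average codon homozygosity of the amino acids encoded by k synonymous codons. The sketch below is illustrative only: the function names are hypothetical and it does not reproduce the authors' Visual Basic program.

```python
def homozygosity(codon_counts):
    """Codon homozygosity F of one amino acid from its synonymous codon counts."""
    n = sum(codon_counts)
    squared = sum((count / n) ** 2 for count in codon_counts)
    return (n * squared - 1) / (n - 1)


def effective_number_of_codons(f2, f3, f4, f6):
    """Nc from the mean homozygosity of each degeneracy class.

    The standard code has 2 single-codon amino acids (Met, Trp), 9 two-fold,
    1 three-fold (Ile), 5 four-fold and 3 six-fold degenerate amino acids.
    """
    return 2 + 9 / f2 + 1 / f3 + 5 / f4 + 3 / f6


# Extreme bias: a single codon per amino acid, so every F equals 1.
nc_min = effective_number_of_codons(1, 1, 1, 1)
# No bias: synonymous codons used equally, so F equals 1/k for k-fold classes.
nc_max = effective_number_of_codons(1 / 2, 1 / 3, 1 / 4, 1 / 6)
```

Here `nc_min` is 20 and `nc_max` is 61 up to floating-point rounding. For short genes the sample estimate of F can push Nc above 61, in which case the value is conventionally capped at 61.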
Comparison of CAI and microarray fluorescence values
As 85 % of the genes of the R6 and TIGR4 strains have a
similarity above 90 %, and a good correlation (r²=0.99) of
CAI values among their homologous genes was observed
(data not shown), Cy3- (two replicates) and Cy5- (one
replicate) labelled cDNA obtained from R6 grown to mid-exponential
phase (OD620=0.25) was hybridized to the
microarrays, as described in Methods, and the mean
fluorescence measurements for each gene were used to
estimate the relative mRNA transcript levels. Fluorescence
was detected for 1513 homologues of R6 and TIGR4. Given
the median (1675 FU, fluorescence units) of the fluorescence
distribution, and the proportion (12.56 %, 190 of
1513) of genes with values higher than 6000 FU (Fig. 3A),
that value was chosen as the cut-off to assign highly
expressed genes. Among the 114 PHE genes (CAI>0.5),
32.5 % showed high (>6000 FU), 33.3 % medium
(2000–6000 FU), and 34.2 % low (<2000 FU) relative
levels of expression (Fig. 3B). Among the 25 genes with
the highest CAI values (CAI>0.680), the majority (16 of
25, 64 %) gave high fluorescence values on the microarray,
revealing a correlation between the levels of transcription
and translation among a substantial proportion of highly
expressed genes. A similar relationship has been recently
observed in E. coli (dos Reis et al., 2003). An increase in
the proportion of genes with fluorescence values above
6000 FU was observed in groups of genes with CAI values
of 0.4 to 0.6 (21–25 %) compared to the genes with CAI
values lower than 0.4 (4–10 %). The lowest median (949 FU)
and lowest percentage of genes over 6000 units (4 %)
corresponded to the group of genes with CAI values lower
than 0.2. Therefore, despite the fact that it is widely accepted
that low-abundance polypeptides do not necessarily have
low CAI values, in our experiments there was also a
relationship between CAI and FU in genes with low CAI
values. On the other hand, 10.4 % of non-PHE genes had
high fluorescence values (>6000 FU), possibly reflecting
the fact that these genes are upregulated under laboratory
culture conditions. For instance, 55 % of the fatty-acid-metabolism
genes (with medium or low CAI values) had
values higher than 6000 units.
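The per-group summary described above (median fluorescence and percentage of genes over 6000 FU within CAI groups) can be sketched as follows. The bin edges are hypothetical, since the exact CAI groups are defined only in the figure, and the function name is an illustration rather than the authors' code.

```python
from bisect import bisect_right
from statistics import median

HIGH_FU = 6000  # cut-off for highly expressed genes, as chosen in the text


def summarize_by_cai(genes, edges=(0.2, 0.3, 0.4, 0.5, 0.6)):
    """Median FU and percentage of genes above HIGH_FU for each CAI bin.

    `genes` is a list of (cai, fluorescence) pairs. Bin i collects genes with
    edges[i - 1] <= cai < edges[i]; the edges themselves are an assumption.
    """
    bins = {}
    for cai_value, fu in genes:
        bins.setdefault(bisect_right(edges, cai_value), []).append(fu)
    return {
        index: (
            median(values),
            100.0 * sum(fu > HIGH_FU for fu in values) / len(values),
        )
        for index, values in sorted(bins.items())
    }
```

Applied to the full data set, bin 0 would correspond to the group with CAI below 0.2, for which the text reports a median of 949 FU and 4 % of genes over 6000 FU.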
Table 1. w values for S. pneumoniae codons in the whole genome (GEN) and for ribosomal protein (RP) genes, together with the number of tRNA genes
Codon usage for the whole genome and tRNA gene data were downloaded from http://www.tigr.org.
Amino acid  Codon  GEN    RP     tRNA
Phe         UUU    1.000  0.612  0
Phe         UUC    0.471  1.000  2
Leu         UUA    0.459  0.044  2
Leu         UUG    1.000  0.625  1
Leu         CUU    0.487  1.000  1
Leu         CUC    0.271  0.040  0
Leu         CUA    0.244  0.044  2
Leu         CUG    0.244  0.001  0
Ile         AUU    1.000  0.386  0
Ile         AUC    0.526  1.000  2
Ile         AUA    0.115  0.011  0
Met         AUG    1.000  1.000  4
Val         GUU    1.000  1.000  0
Val         GUC    0.579  0.097  0
Val         GUA    0.521  0.559  3
Val         GUG    0.552  0.126  0
Ser         UCU    1.000  0.594  0
Ser         UCC    0.260  0.001  1
Ser         UCA    0.851  1.000  2
Ser         UCG    0.260  0.012  0
Pro         CCU    0.889  0.300  0
Pro         CCC    0.160  0.001  0
Pro         CCA    1.000  1.000  2
Pro         CCG    0.228  0.035  0
Thr         ACU    1.000  1.000  0
Thr         ACC    0.719  0.018  1
Thr         ACA    1.000  0.649  2
Thr         ACG    0.408  0.031  0
Ala         GCU    1.000  1.000  0
Ala         GCC    0.498  0.069  0
Ala         GCA    0.571  0.714  4
Ala         GCG    0.308  0.106  0
Tyr         UAU    1.000  0.263  0
Tyr         UAC    0.450  1.000  2
Stop        UAA    –      –      –
Stop        UAG    –      –      –
His         CAU    1.000  0.228  0
His         CAC    0.496  1.000  1
Gln         CAA    1.000  1.000  2
Gln         CAG    0.612  0.016  0
Asn         AAU    1.000  0.236  0
Asn         AAC    0.493  1.000  2
Lys         AAA    1.000  1.000  2
Lys         AAG    0.753  0.082  1
Asp         GAU    1.000  1.000  0
Asp         GAC    0.515  0.773  2
Glu         GAA    1.000  1.000  5
Glu         GAG    0.612  0.111  0
Cys         UGU    1.000  1.000  0
Cys         UGC    0.562  1.000  1
Stop        UGA    –      –      –
Trp         UGG    1.000  1.000  1
Arg         CGU    1.000  1.000  2
Arg         CGC    0.312  0.233  0
Arg         CGA    0.212  0.005  0
Arg         CGG    0.185  0.002  1
Ser         AGU    0.851  0.085  0
Ser         AGC    0.481  0.224  1
Arg         AGA    0.312  0.009  1
Arg         AGG    0.063  0.001  1
Gly         GGU    1.000  1.000  0
Gly         GGC    0.341  0.119  2
Gly         GGA    0.732  0.514  2
Gly         GGG    0.368  0.030  0
Fig. 2. Distribution of CAI values (A), and relationship between CAI and gene length (B). The regression line in (B) illustrates the lack
of association between CAI and gene length. Gene function symbols: n, glycolysis; ×, elongation factors; e, initiation factors/
aminoacyl tRNA synthetases/RNA polymerase subunits; +, chaperones; ., two-component systems; &, DNA transformation.
All other genes are indicated by dots.
Although a general relationship was observed between
the CAI and microarray fluorescence value when all genes
were considered (Fig. 4A), a low value of r² (0.09) was
obtained when both variables were compared. This low r²
value could be explained by the stability of the CAI
value (due to a long-term optimization to the fluctuating
environment in vivo) and the dynamic nature of the
amount of mRNA (taken from laboratory cultures growing
under defined conditions). However, there was a significant
relationship for genes of glycolysis (r²=0.46), of fatty acid
metabolism (r²=0.37), and proteases (r²=0.27). Genes
were classified in four categories: genes with CAI and
fluorescence values higher than 0.5 and 6000, respectively;
genes with CAI higher than 0.5; genes with fluorescence
values higher than 6000; and genes with CAI and fluorescence
values lower than the cut-off points. Most genes
(80.6 %, 1220 out of 1513) corresponded to the last
category. In order to rule out any possible effects of
variations in probe length on the selection of genes with
a relatively high amount of mRNA transcripts, the data
shown in Fig. 3 were recalculated using FU values corrected
for probe length. This did not appreciably affect the
profile of PHE genes, except in the case of the ribosomal
genes, due to their very short probe length (data not
shown). Furthermore, the use of these corrected fluorescence
values did not generate any perceptible changes to
Fig. 4.
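The r² values and the four-category classification discussed above can be illustrated with a short sketch. It assumes that r² is the squared Pearson correlation coefficient between CAI and fluorescence, which the text does not state explicitly; the function names and category labels are hypothetical.

```python
def r_squared(xs, ys):
    """Squared Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy * sxy / (sxx * syy)


def category(cai_value, fu, cai_cut=0.5, fu_cut=6000):
    """Assign a gene to one of the four categories described in the text."""
    high_cai = cai_value > cai_cut
    high_fu = fu > fu_cut
    if high_cai and high_fu:
        return "high CAI and high FU"
    if high_cai:
        return "high CAI only"
    if high_fu:
        return "high FU only"
    return "below both cut-offs"


# Share of genes below both cut-offs, from the counts given in the text.
share_below_both = round(100.0 * 1220 / 1513, 1)
```

The last line reproduces the 80.6 % reported for the category below both cut-off points.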
Table 2. S. pneumoniae PHE genes listed by role and subrole
ABCT, ATP-binding cassette transporter; B/D, biosynthesis and degradation; DH, dehydrogenase; DP, diphosphate; MT, methyl-transferase;
MTHPTG, methyltetrahydropteroyltriglutamate; NT, nucleotidyltransferase; P, phosphate; PEP, phosphoenol pyruvate; PMF, proton motive
force; PTS, phosphotransferase system; TF, transferase. The 25 genes with the highest CAI values are asterisked.
Role, then rows of: Subrole | Product | Gene | CAI
Amino acid metabolism
  Aspartate family | 5-MTHPTG-homocysteine MT | metE | 0.623
  Aspartate family | Aspartate-semialdehyde DH | asd | 0.573
  Glutamate family | NADP-specific glutamate DH | gdhA | 0.663
  Glutamate family | Glutamine synthetase | glnA | 0.507
  Pyruvate family | Ketol-acid reductoisomerase | ilvC* | 0.748
  Serine family | Cysteine synthase | cysM | 0.509
Cell envelope
  B/D surface lipo/polysaccharides | LysM domain protein | sp0107 | 0.651
  Unknown | Lipoprotein | sp0149* | 0.687
  Unknown | Lipoprotein | sp0845 | 0.641
  Unknown | Pneumococcal surface protein A | pspA | 0.505
Cell processes
  Adaptation to atypical conditions | General stress protein 24 kDa | sp1804* | 0.727
  Cell division | Cell division protein FtsZ | ftsZ | 0.516
  Detoxification | Mn-superoxide dismutase | sodA | 0.639
Central intermediary metabolism
  Phosphorus compounds | Mn-inorganic pyrophosphatase | ppaC | 0.580
DNA metabolism
  DNA binding proteins | Chromosome binding protein HU | hup* | 0.794
  DNA binding proteins | Single-strand binding protein | ssb | 0.514
Energetic metabolism
  Aerobic metabolism | Pyruvate oxidase | spxB* | 0.738
  Amino acids and amines | Ornithine carbamoyl TF | argF | 0.639
  Amino acids and amines | Arginine deaminase | arcA | 0.518
  ATP-PMF interconversion | ATPase F0F1 β subunit | atpD | 0.547
  B/D of polysaccharides | Glycogen phosphorylase | sp2106* | 0.706
  B/D of polysaccharides | Galactose-6-P isomerase A | lacA | 0.671
  B/D of polysaccharides | Tagatose-1,6-DP aldolase | lacD | 0.665
  B/D of polysaccharides | Galactose-6-P isomerase B | lacB | 0.665
  B/D of polysaccharides | Phosphoglucomutase | pgm | 0.648
  B/D of polysaccharides | 4-α-Glucanotransferase | malQ | 0.597
  B/D of polysaccharides | N-acetyl-neuraminate lyase | sp1329 | 0.561
  B/D of polysaccharides | 6-Phospho-β-galactosidase | lacG | 0.500
  Electron transport | Thioredoxin | trx* | 0.700
  Electron transport | NADH oxidase | nox | 0.669
  Electron transport | Flavodoxin | fld | 0.544
  Electron transport | Thioredoxin reductase | trxB | 0.510
  Fermentation | Lactate DH | ldh* | 0.782
  Fermentation | Formate acetyl transferase | pfl* | 0.748
  Fermentation | Fe-containing alcohol DH | sp2026 | 0.616
  Fermentation | Zn-containing alcohol DH | sp0285 | 0.563
  Fermentation | Acetoin DH complex E3 | sp1161 | 0.529
  Glycolysis/gluconeogenesis | Glyceraldehyde-3-P DH | gap* | 0.866
  Glycolysis/gluconeogenesis | Triosephosphate isomerase | tpi* | 0.830
  Glycolysis/gluconeogenesis | Fructose-bis-P aldolase | fba* | 0.824
  Glycolysis/gluconeogenesis | Phosphoglycerate mutase | gpmA* | 0.819
  Glycolysis/gluconeogenesis | Enolase | eno* | 0.815
  Glycolysis/gluconeogenesis | Pyruvate kinase | pyk* | 0.749
  Glycolysis/gluconeogenesis | Phosphoglycerate kinase | pgk* | 0.737
  Glycolysis/gluconeogenesis | Glucose-6-P isomerase | pgi* | 0.702
  Glycolysis/gluconeogenesis | 6-Phosphofructokinase | pfk | 0.634
  Glycolysis/gluconeogenesis | Glucokinase | gki | 0.576
  Pentose phosphate pathway | 6-Phosphogluconate DH | gnd | 0.640
  Pentose phosphate pathway | Transketolase | recP | 0.506
  Sugars | Fructokinase | scrK | 0.519
Hypothetical proteins
  Unknown | Conserved hypothetical | sp1197 | 0.667
  Unknown | Conserved hypothetical | sp0194 | 0.662
  Unknown | Conserved hypothetical | sp0095 | 0.571
  Unknown | Conserved hypothetical | sp2031 | 0.547
  Unknown | Conserved hypothetical | sp1882 | 0.541
  Unknown | Unassigned hypothetical | sp2093 | 0.535
  Unknown | Conserved hypothetical | sp1922 | 0.529
  Unknown | Conserved hypothetical | sp1473 | 0.520
  Unknown | Conserved hypothetical | sp1102 | 0.515
  Unknown | Hypothetical with conserved domain | sp1546 | 0.502
Nucleotide metabolism
  2′-Deoxyribonucleotide metabolism | Ribonucleoside-DP reductase 2 α | nrdE | 0.517
  Nucleotide/nucleoside conversion | Adenylate kinase | adk | 0.612
  Nucleotide/nucleoside conversion | Uridylate kinase | pyrH | 0.505
  Purine ribonucleotide synthesis | GMP synthase | guaA | 0.583
  Purine ribonucleotide synthesis | Inosine-5-mono-P DH | guaB | 0.552
  Purine ribonucleotide synthesis | Adenylosuccinate synthetase | purA | 0.517
  Salvage nucleotides/nucleosides | Uracil phosphoribosyl TF | upp | 0.625
  Salvage nucleotides/nucleosides | Adenine phosphoribosyl TF | apt | 0.536
Protein fate
  Degradation of polypeptides | ATP-dependent Clp protease, proteolytic subunit | clpP | 0.543
  Protein folding and stabilization | Trigger factor | tig* | 0.743
  Protein folding and stabilization | DnaK protein | dnaK* | 0.680
  Protein folding and stabilization | Heat-shock protein GrpE | grpE | 0.518
Protein synthesis
  Translation factors | Elongation factor Tu | tuf* | 0.793
  Translation factors | Elongation factor G | fusA* | 0.746
  Translation factors | Elongation factor Ts | tsf* | 0.720
  Translation factors | Elongation factor P | efp | 0.639
  Translation factors | Ribosome recycling factor | frr | 0.567
  tRNA aminoacylation | Thr-tRNA synthetase | thrS | 0.591
  tRNA aminoacylation | Ile-tRNA synthetase | ileS | 0.584
  tRNA aminoacylation | Lys-tRNA synthetase | lysS | 0.556
  tRNA aminoacylation | Gln-tRNA synthetase | gltX | 0.551
  tRNA aminoacylation | Asn-tRNA synthetase | asnS | 0.549
  tRNA aminoacylation | Ser-tRNA synthetase | serS | 0.542
  tRNA aminoacylation | Ala-tRNA synthetase | alaS | 0.517
  tRNA aminoacylation | Pro-tRNA synthetase | proS | 0.513
Transcription
  Transcription factors | RNA polymerase δ subunit | sp0493 | 0.590
  Transcription factors | N utilization substance protein A | nusA | 0.565
  DNA-directed RNA polymerase | RNA polymerase ω subunit | sp1737 | 0.562
  DNA-directed RNA polymerase | RNA polymerase β subunit | rpoB | 0.561
  DNA-directed RNA polymerase | RNA polymerase β′ subunit | rpoC | 0.530
  RNA degradation | Polyribonucleotide NT | pnp | 0.507
Transport
  Amino acids, peptides and amines | Branched-chain amino acid ABCT | livJ | 0.588
  Amino acids, peptides and amines | Amino acid ABCT | sp1241 | 0.505
  Sugars, organic alcohols and acids | Sugar ABCT | msmK* | 0.712
  Sugars, organic alcohols and acids | Putative sugar ABCT | sp0092* | 0.689
  Sugars, organic alcohols and acids | Maltose/maltodextrin ABCT | malX | 0.665
  Sugars, organic alcohols and acids | Sugar ABCT | sp1683 | 0.598
  Cations | Non-haem iron-containing ferritin | sp1572 | 0.665
  Cations | Manganese ABCT | psaA | 0.580
  Cations | Iron ABCT | sp0243 | 0.515
  Nucleotides | Uracil permease | uraA | 0.524
  PTS | Phosphocarrier protein HPr | ptsH* | 0.783
  PTS | PTS IIABC components | sp0758 | 0.677
  PTS | PTS IIB component | sp0646 | 0.660
  PTS | PEP-protein phosphotransferase | ptsI | 0.643
  PTS | PTS IID component | sp0063 | 0.635
  PTS | PTS IIC component | sp0647 | 0.633
  PTS | PTS IIABC components | sp1722 | 0.602
  PTS | PTS IIC component | sp0062 | 0.593
  PTS | Mannose PTS IID component | sp0282 | 0.586
  PTS | Fructose PTS IIABC components | sp0877 | 0.570
  PTS | Mannose PTS IIC component | manM | 0.555
  PTS | Mannose PTS IIAB components | manL | 0.531
  PTS | Lactose PTS IIBC components | lacE | 0.518
  Unknown substrate | ABCT | sp2197 | 0.550
  Unknown substrate | ABCT | sp0867 | 0.534
  Unknown substrate | ABCT | sp1796 | 0.527
  Unknown substrate | ABCT | sp1690 | 0.519
  Unknown substrate | ABCT | sp0148 | 0.517
  Unknown substrate | ABCT | sp2230 | 0.511
  Other | Bacteriocin transport accessory protein | bta | 0.629
  Other | Aquaporin | sp1778 | 0.556
  Other | MATE efflux family protein DinF | dinF | 0.556
  Other | Glycerol uptake facilitator protein | sp1491 | 0.523
Unknown functions
  Enzymes of unknown specificity | Oxidoreductase | sp1472 | 0.640
  Enzymes of unknown specificity | Oxidoreductase | sp1471 | 0.621
  Enzymes of unknown specificity | Oxidoreductase | sp1325 | 0.584
  Enzymes of unknown specificity | Oxidoreductase | sp1588 | 0.579
  General | Elongation protein Tu family | sp0681 | 0.595
  General | Secreted 45 kDa protein | usp45 | 0.529
  General | GTP-binding protein | sp0004 | 0.504
Energetic metabolism
S. pneumoniae, which has an anaerobic metabolism, lacks
the genes that encode functions of the tricarboxylic acid
cycle. Therefore, energetic metabolism relies on glycolysis
and fermentation. Accordingly, most genes of glycolytic
enzymes and two enzymes of fermentative metabolism
(ldh and pfl) were among the 25 genes with the highest
CAI values (Fig. 2B, Table 2), and also had high fluorescence
values (>5700 units). Likewise, genes for alternative
fermentation pathways, such as sp0285 and sp2026, encoding alcohol dehydrogenases, and sp1161, encoding a subunit
of the enzyme that converts pyruvate into acetyl-CoA, were
also PHE (Table 2).
Some of the genes involved in the complex pneumococcal
network of sugar conversions were also PHE, such as the
gene of the enzyme that cleaves lactose (lacG), genes of
enzymes that convert galactose into glycolytic intermediates
(lacA, lacB and lacD), and malQ, which encodes an enzyme
involved in the degradation of maltodextrins, the first
digestion product of starch. S. pneumoniae would be able
to obtain energy easily under starvation conditions from
glycogen, since the genes of glycogen phosphorylase (sp2106)
and phosphoglucomutase (pgm) were PHE (Table 2).
Additionally, the PHE gene sp1804 (Table 2) shows high
similarity (>70 %) with the Enterococcus hirae gls24 gene
that encodes a stress protein playing an important role
during glucose starvation (Giard et al., 2000). In addition
to the glycolytic and the two fermentation enzymes described above, malQ, sp2106 and pgm also showed high
mRNA amounts.
Fig. 3. (A) Distribution of microarray fluorescence values. (B)
Median microarray fluorescence values (hatched bars, left-hand
axis) and percentage of genes over 6000 FU (black bars, right-hand
axis) in CAI groups.
Transcription and protein synthesis
As expected, genes of translation elongation factors were
PHE genes (Fig. 2B, Table 2), as well as others involved in
translation and transcription, such as those of aminoacyl-tRNA
synthetases and RNA polymerase subunits (Fig. 2B,
Table 2). Most of these PHE genes also had high median
fluorescence values in the microarray experiments.
Aminoacyl-tRNA synthetases had a median FU value of
4762, whereas the value for RNA polymerase subunits was
12 506 FU. In contrast, the genes of proteolytic enzymes,
although they generally had very high fluorescence values
(median FU of 3853), were not PHE (Fig. 4B), suggesting
that proteolysis is enhanced under the laboratory culture
conditions. Among the genes encoding chaperones, only
dnaK and tig showed both high CAI and fluorescence
values, being the only chaperones included in the 25 genes
with the highest CAI values.
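The group-level summaries quoted above (a median FU per functional group, and the 6000 FU boundary used for "high" fluorescence) can be illustrated with a minimal sketch. The function name `summarize_group` and the toy values are hypothetical and are not taken from the study; only the 6000 FU cut-off comes from the text.

```python
from statistics import median

HIGH_FU = 6000  # fluorescence cut-off (FU) used in the text for "high" values


def summarize_group(fu_values, threshold=HIGH_FU):
    """Return (median FU, percentage of genes above threshold) for one gene group."""
    if not fu_values:
        raise ValueError("gene group is empty")
    over = sum(1 for value in fu_values if value > threshold)
    return median(fu_values), 100.0 * over / len(fu_values)


# Hypothetical fluorescence values (FU) for illustration; not data from the study.
toy_groups = {
    "group_a": [1200, 4800, 9100, 15000],
    "group_b": [300, 700, 900],
}

for name, values in toy_groups.items():
    med, pct = summarize_group(values)
    print(f"{name}: median={med:.0f} FU, {pct:.0f}% of genes over {HIGH_FU} FU")
```

The median is used rather than the mean because a few very highly transcribed genes would otherwise dominate a group's summary value.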
Fig. 4. Global comparison between CAI and fluorescence
microarray values in the whole genome (A), low- and middle-expressed
gene groups (B), and PHE gene groups (C). The
line corresponds to the linear regression analysis for the whole
genome and is also shown in (B) and (C). Gene function
symbols: ., two-component systems; &, DNA transformation;
+, fatty acid metabolism; #, proteases; n, glycolysis; 6, elongation
factors; e, aminoacyl-tRNA synthetases; *, RP genes.
On the other hand, most RP genes had fluorescence values
higher than the genome median but lower than 6000 FU
(median 3210), plotting in the lower-right quadrant of
Fig. 4C, indicating that codon bias might be a more
important factor than the amount of mRNA for the
general abundance of RP proteins. Genes involved in
amino-acid biosynthesis had quite homogeneous CAI
values (generally <0.400), in accordance with the general
tendency of the genome. However, much higher values
were found in specific genes (ilvC, gdhA, metE, asd, cysM
and glnA; Table 2), a feature that has been associated with
control-pathway enzyme genes (Karlin & Mrazek, 2000).
None of these genes showed high fluorescence values, which
may be related to the abundance of casein-derived amino
acids in THYE medium.
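The way genes are placed in the CAI-versus-fluorescence plane can be sketched as a simple quadrant assignment. This is only an illustration under stated assumptions: the 6000 FU boundary is the one used in the text, whereas the CAI boundary of 0.500 is an assumption made here for the example, and the function name `quadrant` is hypothetical.

```python
CAI_CUTOFF = 0.500  # ASSUMED boundary for "high CAI"; illustrative only
FU_CUTOFF = 6000    # fluorescence boundary (FU) used in the text


def quadrant(cai, fu, cai_cutoff=CAI_CUTOFF, fu_cutoff=FU_CUTOFF):
    """Name the quadrant of a gene on a plot with CAI on x and FU on y."""
    horizontal = "right" if cai >= cai_cutoff else "left"
    vertical = "upper" if fu >= fu_cutoff else "lower"
    return f"{vertical}-{horizontal}"


# spxB and recA values are those quoted in the text; the third point pairs
# the RP median fluorescence (3210 FU) with a hypothetical CAI of 0.600.
examples = {
    "spxB": (0.738, 11683),
    "recA": (0.489, 8286),
    "RP-like (hypothetical CAI)": (0.600, 3210),
}

for gene, (cai, fu) in examples.items():
    print(f"{gene}: {quadrant(cai, fu)}")
```

Under these assumed boundaries, a gene with high CAI but moderate fluorescence falls in the lower-right quadrant, which is the pattern described above for most RP genes.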
Transporters
S. pneumoniae has one of the highest proportions (30 %) of
sugar transporter genes among the prokaryotic genomes
(Tettelin et al., 2001), and seems to be highly adapted to
compete for sugar nutrients with other respiratory tract
micro-organisms. Several genes of sugar transporters and
phosphotransferase systems were PHE (Table 2). However,
under the rich and stable sugar environment of the THYE
medium, only a few of these genes showed high mRNA
amounts: ptsH, ptsI and sp0758 of the phosphotransferase
system, and maltosaccharide transporter malX.
In addition, some genes for Fe and Mn transporters were
also PHE (Table 2), possibly reflecting an adaptation to
pathogenicity, given the vital importance of the acquisition
of these elements inside the host (Jakubovics & Jenkinson,
2001). Among them, the psaABC operon encoding the Mn
transporter also had relatively high levels of transcripts.
Oxidative metabolism
Genes involved in oxidant species detoxification and other
redox reactions were PHE (trx, nox, sodA, fld and trxB;
Table 2). Likewise, four of the genes classified as oxidoreductases in the unknown-specificity enzyme group
(sp1325, sp1471, sp1472 and sp1588), and psaA, part of
an Mn transporter involved in anti-oxidative defence,
were also strongly PHE (Table 2). Taken together, these
data suggest that defence against oxidative species is highly
developed in S. pneumoniae, possibly as a consequence of
its ability to colonize and persist in the nasopharynx,
where partial oxygen pressure is high. Consistent with
this hypothesis, nox, sodA and psaA, which are essential for
infection (Auzat et al., 1999; Yesilkaya et al., 2000; Tseng
et al., 2002) also appeared to be transcribed at high levels
(11 200, 5630 and 12 987 FU, respectively). In spite of the
anaerobic metabolism of S. pneumoniae, one of the highest
CAI and fluorescence values (0.738 and 11 683 units,
respectively) corresponded to the pyruvate oxidase gene,
spxB, which is one of the more abundant polypeptides
of the transparent variants of S. pneumoniae (Overweg
et al., 2000). This enzyme is also essential for infection
(Spellerberg et al., 1996), and produces, in the presence of
oxygen, acetyl-phosphate and hydrogen peroxide. The latter
is an important pneumococcal virulence factor (Duane
et al., 2000), which additionally could cause an inhibitory
effect on the growth of competitive microbes in the upper
respiratory tract (Pericone et al., 2000).
Genes expressed at low levels
Low CAI values were calculated for genes with a putative
regulatory function, which included 27 genes of two-component
systems (TCS) (Fig. 2B) and 62 general
regulators, with mean CAI values of 0.247 and 0.281,
respectively. Additionally, low CAI values were also
calculated for the 35 genes involved in prosthetic group/
cofactor biosynthesis and the 19 genes of aromatic amino-acid
biosynthesis, with mean CAI values of 0.292 and
0.294, respectively. Some of these gene groups also had
low median fluorescence values: regulators (914 FU,
n=50), TCS (1399 FU, n=26) (Fig. 4B) and cofactor-vitamin
biosynthesis (1542 FU, n=34).
Low CAI values were also calculated for 24 competence
genes (mean CAI of 0.269; Fig. 2B), and most also had low
fluorescence values (median 868 FU, n=21) (Fig. 4B).
These genes localized in the central part of the COA
plot, with the exception of comD, comE and comF, which
localized in the right horn, and had G+C contents of
32.0 %, 30.7 % and 36.2 %, respectively. Consequently,
they could be recently acquired genes. It is worth
emphasizing that S. pneumoniae becomes naturally competent
for only a few minutes, resulting in rapid changes
in its protein profile (Morrison & Baker, 1979), and that
constitutive activation of the competence regulon could
be deleterious for the cell (Martin et al., 2000). Thus it is
possible that the presence of rare codons in competence
genes could be a mechanism that limits translation, thereby
minimizing adverse physiological stresses prior to induction
of competence-gene expression, as suggested in the
case of some E. coli regulatory genes (Kronigsberg &
Codson, 1983). In accordance with this hypothesis, other
mechanisms negatively controlling expression of competence
involve the cleavage of competence factors by the
ClpP protease (Chastanet et al., 2001), and the action of
the inhibitor of the competence-stimulator peptide (Berge
et al., 2001). In contrast, the recA gene had a moderately
high CAI (0.489) and a high fluorescence value (8286 FU),
being the only competence gene that appears in the left
horn of the COA, probably because it is involved in
multiple cellular processes.
Virulence factors
Virulence factors include capsule and cell-wall biosynthesis
enzymes, pneumolysin, autolysin, neuraminidase, IgA1
protease, and some surface proteins (Paton et al., 1993).
Nearly all these genes had CAI values of 0.250 to 0.350, and
could be considered medium-expressed genes. Nevertheless,
psaA was PHE. Most genes of capsule biosynthesis, as well
as nanB, pspA, iga, genes of choline-binding proteins (cbpC
and cbpF), and lytB appear in the right horn (Fig. 1C) of
the COA, suggesting a recent acquisition by horizontal
transfer. In agreement with this idea, the G+C contents of
the cps4EFGH capsular genes, nanB and pspA were 27.8 %
to 33.5 %, 33.4 %, and 35.0 %, respectively, which is lower
than that of the bulk of the genome coding sequences
(39.7 %).
Apparently there are two mechanisms that determine the
persistence/virulence of S. pneumoniae, operating on different
time scales. One is the optimization of codon usage,
as detected by CAI analysis for sugar-transporter and
oxidative-metabolism genes, possibly reflecting a long-term
progressive adaptation to persistence in carrier hosts. The
other is the recent acquisition of new virulence factors by
horizontal transfer, as detected by COA and G+C content.
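The G+C comparison used above as supporting evidence for horizontal transfer can be sketched as follows. This is a minimal illustration, not the procedure used in the study: the function names and the toy sequence are hypothetical, and only the reference value of 39.7 % for the bulk of the genome coding sequences is taken from the text.

```python
GENOME_CDS_GC = 39.7  # % G+C of the bulk of coding sequences, as reported in the text


def gc_percent(seq):
    """Percentage of G and C among the unambiguous bases (A, C, G, T) of a sequence."""
    counted = [base for base in seq.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("no unambiguous bases in sequence")
    gc = sum(1 for base in counted if base in "GC")
    return 100.0 * gc / len(counted)


def gc_deviation(seq, reference=GENOME_CDS_GC):
    """Difference, in percentage points, between a sequence's G+C and the reference."""
    return gc_percent(seq) - reference


# Hypothetical short sequence for illustration; not a real pneumococcal gene.
toy_gene = "ATGAAATTAGCTTTAAATGATAAATTAGGTTAA"
print(f"G+C = {gc_percent(toy_gene):.1f} %, "
      f"deviation = {gc_deviation(toy_gene):+.1f} points")
```

A clearly negative deviation is only suggestive on its own; in the text it is read together with the position of the gene in the COA plot.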
ACKNOWLEDGEMENTS
A. J. M.-G. gratefully acknowledges receipt of a fellowship from the
Comunidad Autónoma de Madrid, Spain. This study was supported by
grant BIO2002-01398 from the Ministerio de Ciencia y Tecnología.
J. M. W. acknowledges financial support for the microarray construction
from EC contract QLK2-CT-2000-00543. We wish to thank Karin
Overweg and Mark Reuter for advice and discussion concerning the
microarray work.
REFERENCES
Andersson, S. G. E. & Sharp, P. M. (1996). Codon usage in the
Mycobacterium tuberculosis complex. Microbiology 142, 915–925.
Auzat, I., Chapuy-Regaud, S., Le Bras, G., Dos Santos, D.,
Ogunniyi, A. D., Le Thomas, I., Garel, J. R., Paton, J. C. &
Trombe, M. C. (1999). The NADH oxidase of Streptococcus
pneumoniae: its involvement in competence and virulence. Mol
Microbiol 34, 1018–1028.
Berge, M., Garcia, P., Iannelli, F., Prere, M. F., Granadel, C., Polissi,
A. & Claverys, J. P. (2001). The puzzle of zmpB and extensive chain
formation, autolysis defect and non-translocation of choline-binding
proteins in Streptococcus pneumoniae. Mol Microbiol 39, 1651–1660.
Chastanet, A., Prudhomme, M., Claverys, J. P. & Msadek, T.
(2001). Regulation of Streptococcus pneumoniae clp genes and their
role in competence development and stress survival. J Bacteriol 183,
7295–7307.
Chavancy, G. & Garel, J. P. (1981). Does quantitative tRNA
adaptation to codon content in mRNA optimize the ribosomal
translation efficiency? Proposal for a translation system model.
Biochimie 63, 187–195.
Dagkessamanskaia, A., Moscoso, M., Hénard, V., Guiral, S.,
Overweg, K., Reuter, M., Wells, J. M. & Claverys, J. P. (2004).
Interconnection of competence, stress and CiaR regulons in
Streptococcus pneumoniae: competence triggers stationary phase
autolysis of ciaR mutant cells. Mol Microbiol 51, 1071–1086.
Dopazo, J., Mendoza, A., Herrero, J. & 13 other authors (2001).
Annotated draft genomic sequence from a Streptococcus pneumoniae
type 19F clinical isolate. Microb Drug Resist 7, 99–125.
Dos Reis, M., Wernisch, L. & Savva, R. (2003). Unexpected
correlations between gene expression and codon usage bias from
microarray data for the whole Escherichia coli K-12 genome. Nucleic
Acids Res 31, 6976–6985.
Duane, P. G., Rubins, J. B., Weisel, H. R. & Janoff, E. N. (2000).
Identification of hydrogen peroxide as a Streptococcus pneumoniae
toxin for rat alveolar epithelial cells. Infect Immun 61, 4392–4397.
Giard, J. C., Rince, A., Capiaux, H., Auffray, Y. & Hartke, A. (2000).
Inactivation of the stress- and starvation-inducible gls24 operon has
a pleiotropic effect on cell morphology, stress sensitivity, and gene
expression in Enterococcus faecalis. J Bacteriol 182, 4512–4520.
Grosjean, H., Sankoff, D., Jou, W. M., Fiers, W. & Cedergren, R. J.
(1978). Bacteriophage MS2 RNA: a correlation between the stability
of the codon : anticodon interaction and the choice of code words.
J Mol Evol 12, 113–119.
Hoskins, J., Alborn, W. E., Arnold, J. & 37 other authors (2001).
Genome of the bacterium Streptococcus pneumoniae strain R6.
J Bacteriol 183, 5709–5717.
Ikemura, T. (1981). Correlation between the abundance of
Escherichia coli transfer RNAs and the occurrence of the respective
codons in its protein genes. J Mol Biol 146, 1–21.
Jakubovics, N. S. & Jenkinson, H. F. (2001). Out of the iron age:
new insights into the critical role of manganese homeostasis in
bacteria. Microbiology 147, 1709–1718.
Karlin, S. & Mrazek, J. (2000). Predicted highly expressed genes of
diverse prokaryotic genomes. J Bacteriol 182, 5238–5250.
Karlin, S., Mrazek, J., Campbell, A. & Kaiser, D. (2001).
Characterization of highly expressed genes of four fast-growing bacteria.
J Bacteriol 183, 5025–5040.
Kronigsberg, W. & Codson, G. N. (1983). Evidence for use of rare
codons in the dnaG gene and other regulatory genes of Escherichia
coli. Proc Natl Acad Sci U S A 80, 687–691.
Kurland, C. G. (1991). Codon bias and gene expression. FEBS Lett
285, 165–169.
Martin, B., Prudhomme, M., Alloing, G., Granadel, C. & Claverys,
J. P. (2000). Cross-regulation of competence pheromone production
and export in the early control of transformation in Streptococcus
pneumoniae. Mol Microbiol 38, 867–878.
McInerney, J. O. (1998). GCUA (General Codon usage Analysis).
Bioinformatics 14, 372–373.
Médigue, C., Rouxel, T., Vigier, P., Hénaut, A. & Danchin, A. (1991).
Evidence for horizontal gene transfer in Escherichia coli speciation.
J Mol Biol 222, 851–856.
Morrison, D. A. & Baker, M. F. (1979). Competence for genetic
transformation in pneumococcus depends on synthesis of a small set
of proteins. Nature 282, 215–217.
Overweg, K., Pericone, C. D., Verhoef, G. G., Weiser, J. N.,
Meiring, H. D., De Jong, A. P., De Groot, R. & Hermans, P. W.
(2000). Differential protein expression in phenotypic variants of
Streptococcus pneumoniae. Infect Immun 68, 4604–4610.
Paton, J. C., Andrew, P. W., Boulnois, G. J. & Mitchell, T. J. (1993).
Molecular analysis of the pathogenicity of Streptococcus pneumoniae:
the role of pneumococcal proteins. Annu Rev Microbiol 47, 89–115.
Pericone, C. D., Overweg, K., Hermans, P. W. M. & Weiser, J. N.
(2000). Inhibitory and bactericidal effects of hydrogen peroxide
production by Streptococcus pneumoniae on other inhabitants of the
upper respiratory tract. Infect Immun 68, 3990–3997.
Sharp, P. M. (1991). Determinants of DNA sequence divergence
between Escherichia coli and Salmonella typhimurium: codon usage,
map position, and concerted evolution. J Mol Evol 33, 23–33.
Sharp, P. M. & Li, W. H. (1987a). The codon adaptation index - a
measure of directional synonymous codon usage bias, and its
potential applications. Nucleic Acids Res 15, 1281–1295.
Sharp, P. M. & Li, W. H. (1987b). The rate of synonymous
substitution in enterobacterial genes is inversely related to codon
usage bias. Mol Biol Evol 4, 222–230.
Spellerberg, B., Cundell, D. R., Sandros, J., Pearce, B. J.,
Idanpaan-Heikkila, I., Rosenow, C. & Masure, H. R. (1996). Pyruvate oxidase,
as a determinant of virulence in Streptococcus pneumoniae. Mol
Microbiol 19, 803–813.
Tettelin, H., Nelson, K. E., Paulsen, I. T. & 36 other authors (2001).
Complete genome sequence of a virulent isolate of Streptococcus
pneumoniae. Science 293, 498–506.
Tseng, H. J., McEwan, A. G., Paton, J. C. & Jennings, M. P. (2002).
Virulence of Streptococcus pneumoniae: PsaA mutants are hypersensitive
to oxidative stress. Infect Immun 70, 1635–1639.
Wilkins, J. C., Homer, K. A. & Beighton, D. (2002). Analysis of
Streptococcus mutans proteins modulated by culture under acidic
conditions. Appl Environ Microbiol 68, 2382–2390.
Wright, F. (1990). The ‘effective number of codons’ used in a gene.
Gene 87, 23–29.
Yesilkaya, H., Kadioglu, A., Gingles, N., Alexander, J. E.,
Mitchell, T. J. & Andrew, P. W. (2000). Role of manganese-containing
superoxide dismutase in oxidative stress and
virulence of Streptococcus pneumoniae. Infect Immun 68, 2819–2826.