REVIEW ARTICLE
The role of variable DNA tandem repeats in bacterial
adaptation
Kai Zhou, Abram Aertsen & Chris W. Michiels
Department of Microbial and Molecular Systems (M²S), Faculty of Bioscience Engineering, Laboratory of Food Microbiology and Leuven Food
Science and Nutrition Research Centre (LFoRCe), KU Leuven, Leuven, Belgium
Correspondence: Chris W. Michiels,
Laboratory of Food Microbiology, Kasteelpark
Arenberg 23, B-3001 Leuven, Belgium.
Tel.: +32 16 321578;
fax: +32 16 321960;
e-mail: [email protected]
Received 17 April 2013; revised 13 July
2013; accepted 26 July 2013. Final version
published online 28 August 2013.
DOI: 10.1111/1574-6976.12036
Editor: Grzegorz Wegrzyn
MICROBIOLOGY REVIEWS
Keywords
polymorphic DNA; contingency loci; host–
pathogen interaction; stress tolerance;
evolution; phase variation.
Abstract
DNA tandem repeats (TRs), also designated as satellite DNA, are inter- or
intragenic nucleotide sequences that are repeated two or more times in a head-to-tail manner. Because TR tracts are prone to strand-slippage replication and
recombination events that cause the TR copy number to increase or decrease,
loci containing TRs are hypermutable. An increasing number of examples illustrate that bacteria can exploit this instability of TRs to reversibly shut down or
modulate the function of specific genes, allowing them to adapt to changing
environments on short evolutionary time scales without an increased overall
mutation rate. In this review, we discuss the prevalence and distribution of
inter- and intragenic TRs in bacteria and the mechanisms of their instability.
In addition, we review evidence demonstrating a role of TR variations in bacterial adaptation strategies, ranging from immune evasion and tissue tropism to
the modulation of environmental stress tolerance. Nevertheless, while bioinformatic analysis reveals that most bacterial genomes contain from a few to several
dozen intra- and intergenic TRs, only a small fraction of these have been
functionally studied to date.
Introduction
To cope with rapidly changing environmental conditions
and ensure their survival, unicellular organisms have
evolved a plethora of adaptation strategies (Aertsen &
Michiels, 2004, 2005). Most of these strategies are based
on transient alterations in gene expression in response to
stressful conditions, and well-known examples include the
SOS response (controlled by RecA and LexA), the general
stress response (regulated by the sigma factor RpoS), the
stringent response (mediated by pppGpp and ppGpp),
and the heat shock response (mainly controlled by RpoH;
Massey & Buckling, 2002; Foster, 2005; Saint-Ruf &
Matic, 2006; Foster, 2007; Jolivet-Gougeon et al., 2011).
In addition, adaptation can also stem from the acquisition of stochastic mutations that alter the genotype,
which become positively selected and fixed in a population if they coincide with a beneficial phenotype (Rando
& Verstrepen, 2007). However, an important drawback of
FEMS Microbiol Rev 38 (2014) 119–141
the latter strategy is that random mutations are more
often deleterious than beneficial.
Interestingly, neither the type nor the frequency of
mutational events is randomly distributed over the
genome, and some DNA sequences have evolved to be
mutational hotspots that drive the variability of genes
whose activity can impact the adaptive potential of their
host. One type of such special sequences that is very
abundant in prokaryotic and eukaryotic genomes is
known as tandem repeats (TRs), a major class of direct
DNA repeats. While at first TRs were considered to be
junk DNA without any biological function, studies of the
human genome have revealed some of these repetitive
sequences to be hypermutable and the cause of diseases
such as fragile X syndrome, spinobulbar muscular atrophy, and Huntington disease (Hannan, 2010). In addition,
accumulating evidence points out the potential role of
TRs as engines of genetic variability and bacterial
adaptation. In this review, we therefore focus on the
© 2013 Federation of European Microbiological Societies.
Published by John Wiley & Sons Ltd. All rights reserved
identification and distribution of TRs in bacteria, the
mechanisms behind their variability, and their biological
significance in bacterial adaptation.
Identification and distribution of TRs in
bacterial genomes
Definition of TRs
TRs are nucleotide sequences that are directly repeated in
a head-to-tail manner. According to the conservation of
the repeated sequence, TRs are classified as identical/perfect TRs or degenerated/imperfect TRs, respectively
(Fig. 1). Furthermore, TRs are commonly classified into
three categories according to their repeat unit size,
although there is no consensus definition of these categories (Richard et al., 2008). Repeats with unit size varying
from 1 to 9, 10 to 100, and > 100 bp are termed microsatellites, minisatellites, and macrosatellites, respectively
(Lopes et al., 2006). The term ‘satellite DNA’ originally
refers to the very large arrays of tandemly repeated noncoding DNA (often hundreds of copies) that are characteristic of large eukaryotic genomes, but, in the context of
bacterial genomes, is also used to include small and intragenic TRs.
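The two classification schemes above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the size boundaries are those of Lopes et al. (2006) as quoted in the text, and the conservation measure (fraction of positions matching the per-column majority base, assuming equal-length units without indels) is one simple choice among several possible definitions.

```python
from collections import Counter
from typing import Sequence


def classify_by_unit_size(unit_size_bp: int) -> str:
    """Satellite class for a repeat unit size, using the 1-9 / 10-100 / >100 bp bins."""
    if unit_size_bp < 1:
        raise ValueError("unit size must be at least 1 bp")
    if unit_size_bp <= 9:
        return "microsatellite"
    if unit_size_bp <= 100:
        return "minisatellite"
    return "macrosatellite"


def unit_conservation(units: Sequence[str]) -> float:
    """Fraction of positions that match the majority base of their column.

    Assumption: all units have the same length (no insertions or deletions).
    A value of 1.0 corresponds to an identical/perfect TR.
    """
    if not units or len({len(u) for u in units}) != 1:
        raise ValueError("units must be non-empty and of equal length")
    matches = 0
    for column in zip(*units):
        # Count how many units carry the most frequent base at this position.
        matches += Counter(column).most_common(1)[0][1]
    return matches / (len(units) * len(units[0]))


perfect = ["AGCTG"] * 5
imperfect = ["AGCTG", "TGCTG", "AGGTG", "AGCTG", "AGCTC"]
print(classify_by_unit_size(len(perfect[0])))
print(unit_conservation(perfect), unit_conservation(imperfect))
```

The two example tracts are the ones drawn in Fig. 1a; under this particular measure the imperfect tract has 22 of 25 positions matching the consensus.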
In silico identification of TRs
The increasing availability of genome sequences and
specialized bioinformatics software greatly facilitates the
search and identification of TR loci on a genomewide
scale, which obviously is a prerequisite for understanding
their distribution, predicting their function, and tracking
their evolution. A variety of algorithms have been developed for detecting TRs, but it is important to be aware
that these may differ in their ability to detect different
types of TRs (Merkel & Gemmell, 2008; Treangen et al.,
2009; Kajava, 2012). Hence, the choice of search tool
should be determined by the TR type of interest, or the
parallel use of several algorithms is advisable when a wide
screen for TRs is performed. Furthermore, parameter settings (i.e. alignment weights, definition of repeats, and
threshold scores) can also strongly affect outcome in terms
of number and consensus motif of TRs (Lim et al., 2013).
In particular, problems are still commonly encountered in
detecting imperfect TRs (Leclercq et al., 2007; Schaper
et al., 2012). Several algorithms are freely available
online, such as Tandem Repeats Finder (Benson, 1999) and
IMEx (Mudunuri & Nagarajaram, 2007). In addition,
several databases of annotated TRs in prokaryotes have
been established, such as TRs DB (http://minisatellites.u-psud.fr),
PSSRDb (http://pssrdb.cdfd.org.in) and
MICAS (http://180.149.48.108/micas/index.php). In the
next section, we will review the major findings from some
recent in silico studies of the genomic distribution of TRs
in bacteria.
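To make the dependence on parameter settings concrete, the following is a minimal sketch of a perfect-TR scan based on a regular-expression backreference. It is not the algorithm used by any of the tools cited above; the function and class names are hypothetical, it detects only identical/perfect repeats, and because matches are non-overlapping it reports a single phase per tract. Changing `max_unit` or `min_copies` changes which loci are reported, which mirrors the sensitivity to thresholds noted in the text.

```python
import re
from typing import Iterator, NamedTuple


class TandemRepeat(NamedTuple):
    start: int   # 0-based position of the first repeat unit
    unit: str    # repeat unit (motif)
    copies: int  # number of consecutive copies


def is_primitive(unit: str) -> bool:
    """True if the unit is not itself a repetition of a shorter motif."""
    return unit not in (unit + unit)[1:-1]


def find_perfect_trs(seq: str, max_unit: int = 6,
                     min_copies: int = 3) -> Iterator[TandemRepeat]:
    """Yield perfect tandem repeats with unit sizes from 1 to max_unit bp."""
    if min_copies < 2:
        raise ValueError("a tandem repeat needs at least two copies")
    seq = seq.upper()
    for unit_len in range(1, max_unit + 1):
        # One unit, followed by at least (min_copies - 1) further identical copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_copies - 1))
        for match in pattern.finditer(seq):
            unit = match.group(1)
            # Skip e.g. 'AA' or 'ATAT', which are already found at a smaller unit size.
            if is_primitive(unit):
                yield TandemRepeat(match.start(), unit,
                                   len(match.group(0)) // unit_len)


for tr in find_perfect_trs("GGATATATCCAAAAAGT", max_unit=3, min_copies=3):
    print(tr)
```

Detecting degenerated/imperfect TRs requires alignment-based scoring rather than exact matching, which is where the dedicated tools differ most.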
The distribution of TRs in bacterial genomes
The analysis of TRs in bacterial genomes so far has
mainly focused on microsatellites with unit size 1–6 bp,
also termed ‘simple sequence repeats (SSRs)’. A number
of general observations regarding the distribution of SSRs
can be stated. First of all, the abundance of SSRs in bacteria is lower than that in eukaryotes (Schlotterer et al.,
2006). Nevertheless, the number of SSRs is orders of
Fig. 1. Schematic representation of different types of TRs. (a) Different conservation of repeat unit sequence: identical/perfect TRs
(unit sequence = 100% conserved; e.g. AGCTG AGCTG AGCTG AGCTG AGCTG) and degenerated/imperfect TRs (unit sequence < 100% conserved;
e.g. AGCTG TGCTG AGGTG AGCTG AGCTC). (b) Different sizes of repeat unit: microsatellite (unit size 1-9 bp), minisatellite
(unit size 10-100 bp), and macrosatellite (unit size > 100 bp).
Space between repeat units has only been introduced to improve visual clarity of the figure.
magnitude higher than that of other repeat types (i.e.
minisatellite and macrosatellite) in the genomes of most,
if not all bacteria. Of course, this is not unexpected
because this SSR count included even mononucleotide
trimers (e.g. AAA), which account for about 70% of the
total number of SSRs. While SSRs are generally believed
to contribute to genome polymorphism and adaptation
potential of bacteria (Kassai-Jager et al., 2008), the contribution of these very small SSRs like mononucleotide
trimers is probably limited. In fact, a rough threshold of
minimum TR unit number (4–9) has been noted, below
which an SSR is unlikely to mutate or be variable (Lai &
Sun, 2003; Dettman & Taylor, 2004; Kelkar et al., 2010).
Intriguingly, heptameric repeats were found to be overrepresented among these SSRs in most prokaryotes, and
it was hypothesized that the seemingly preferred 7 bp
length of a repeat unit might relate to the DNA segment
size that interacts with the active site of the DNA polymerase, thus facilitating the occurrence of polymerase
slippage (Mrazek et al., 2007).
A remarkable feature of SSRs is their widely diverse distribution across species, even closely related ones, and this
may indicate that they are subject to rapid evolutionary
change (Yang et al., 2003; Mrazek, 2006; Kassai-Jager et al.,
2008). Analysis of more than 300 prokaryotic genomes
showed that the distribution of SSRs varied with the bacterial species, genome size, and G + C content (Mrazek
et al., 2007). More specifically, SSRs with small motif
(1–4 bp) are more abundant in small genomes and particularly in host-adapted pathogens with reduced genomes
(< 2 Mb) and low G + C content (< 40%), such as Mycoplasma and Haemophilus spp. (Moxon et al., 2006; Treangen et al., 2009). In contrast, SSRs with a larger motif
(5–11 bp) are more frequent in nonpathogens and opportunistic pathogens with large genomes (> 4 Mb) and high
G + C content (> 60%), such as Burkholderia and Anabaena spp. Based on this observation, it was hypothesized that
the differential representations of SSRs in bacteria may correlate with pathogenicity, but more work is needed to corroborate this. Another interesting observation is that some
relatively large bacterial genomes (e.g. Pseudomonas aeruginosa, c. 5 Mb) have fewer SSRs than would be predicted
based on their genome properties, but harbor comparatively more two-component sensor transducers. In
contrast, some host-adapted pathogens with small genome
size (i.e. Haemophilus influenzae, Neisseria meningitidis,
and Helicobacter pylori) have comparatively more SSRs, but
fewer two-component sensor transducers (Moxon et al.,
2006). Thus, it seems that environmental adaptability in
host-adapted pathogens depends primarily on SSR variations, while in opportunistic pathogens with a more versatile lifestyle it depends primarily on two-component sensor
transducers.
Closer examination of the SSR distribution across the
genome shows significant differences in coding and noncoding regions. Because bacterial genomes are more compact than those of eukaryotes, they have comparatively
more intragenic than intergenic SSRs. For example, in
Escherichia coli K-12, 79.5% of SSRs locate in coding
regions (Gur-Arie et al., 2000), whereas in the genome
of the Japanese pufferfish (Fugu rubripes), only 11.6% of
SSRs are intragenic (Edwards et al., 1998). Generally, long
mono- and dinucleotide SSRs are excluded from coding
regions, probably because they have a higher probability
to rearrange and cause frameshift mutations in genes (Coenye & Vandamme, 2005; Ackermann & Chao, 2006; Orsi
et al., 2010; Lin & Kussell, 2012). In contrast, SSRs whose
unit size is a multiple of three nucleotides (3, 6, 9 …) are
overrepresented in open reading frames (ORFs) because
their expansion or contraction does not disrupt the reading frame (Mrazek et al., 2007). However, exceptions have
been reported. For example, tetranucleotide SSRs of H. influenzae are exclusively found in ORFs, which is consistent
with their role in phase variation (Power et al., 2009). An
interesting situation exists in the mycoplasmas: long
trinucleotide repeats are overrepresented in Mycoplasma
genitalium, Mycoplasma gallisepticum, and Mycoplasma
hyopneumoniae, but they occur mainly in intergenic regions in
the first two species and in coding regions in the last
one (Mrazek, 2006). This difference in distribution is also
reflected in different functional roles. In M. gallisepticum,
the most prominent trinucleotide TRs are the GAA
repeats in the 5′ untranslated region of the 42 to 70
vlpA adhesin gene paralogs that exist in each strain, which
regulate vlpA gene expression (Glew et al., 1998, 2000;
Liu et al., 2000; Papazisi et al., 2003). In contrast,
M. hyopneumoniae trinucleotide repeats are found mostly
within hypothetical ORFs, but also in some adhesins, and
their contraction or expansion results in variability of
amino acid repeats that are believed to play a role in protein–protein interaction or adhesion (Mrazek, 2006).
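The reading-frame argument above reduces to simple modular arithmetic: a change of k repeat units of size u adds or removes u × k nucleotides, and the frame is preserved only if that number is divisible by three. A minimal sketch (the function name is hypothetical):

```python
def preserves_reading_frame(unit_size_bp: int, copy_change: int) -> bool:
    """True if gaining or losing `copy_change` repeat units keeps the ORF in frame.

    `copy_change` is positive for an expansion and negative for a contraction.
    """
    return (unit_size_bp * copy_change) % 3 == 0


# Trinucleotide repeats never shift the frame; they add or remove whole codons.
print(preserves_reading_frame(3, +1), preserves_reading_frame(3, -2))
# A tetranucleotide repeat shifts the frame unless three units change at once.
print(preserves_reading_frame(4, +1), preserves_reading_frame(4, -3))
```

This is why a single-unit change in a tetranucleotide tract can act as an ON/OFF switch, whereas a change in a trinucleotide tract alters only the number of repeated amino acids.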
A more detailed study on the occurrence of intragenic
TRs in 44 bacteria and archaea revealed additional
features (Lin & Kussell, 2012). Intragenic SSRs were
found more frequently near the termini (5′ and 3′ ends)
of the ORF rather than in the middle, which most likely
stems from biophysical constraints of protein structure.
In addition, SSR-induced frameshifts at the 3′ end are less
harmful than at other parts, because most of the
upstream coding region will not be affected. Nevertheless,
an overrepresentation of SSRs was found in the 5′ end in
ORFs of pathogens, probably because this allows SSR-induced frameshifts to function as an ON/OFF switch for
these ORFs, which can be advantageous for pathogens
because it facilitates rapid adaptation of populations.
Similar observations had already been made earlier for
some intragenic mononucleotide repeats at the 5′ end of
genes (van Passel & Ochman, 2007; Janulczyk et al., 2010;
Orsi et al., 2010). However, it remains unclear and often
difficult to prove whether this type of distribution bias of
intragenic SSRs is linked to selection pressures in bacteria.
An argument in favor of such a link is that intragenic
SSRs show a preference for certain categories of genes. In
both Gram-negative (e.g. Haemophilus and Helicobacter)
and Gram-positive (e.g. Streptococcus) pathogens, SSR-associated genes frequently encode virulence factors, cell
surface components, and restriction–modification
enzymes (van Belkum, 1999; Moxon et al., 2006; Guo &
Mrazek, 2008; Power et al., 2009; Janulczyk et al., 2010).
On the other hand, several intragenic SSRs with numerous repeat copies and a unit size that is not a multiple of
three are also found in housekeeping genes whose products are essential for important cellular processes, such as
cell division, energy production, and DNA replication
and repair (Guo & Mrazek, 2008). Obviously, corresponding TR rearrangements leading to reading frame
disruption are anticipated to be detrimental or even lethal
for the cell, and it remains unclear why such TRs have
been maintained during evolution.
Intergenic SSRs also show a nonrandom distribution,
being found more frequently in the immediate vicinity of
genes than at distant positions. For example, intergenic
SSRs of E. coli K-12 concentrate in a region up to 200 bp
from the start codon, which contains proximal regulators
of gene expression (Gur-Arie et al., 2000). Another study
showed that in most cases, the intergenic SSRs with
numerous copies are located upstream of the first gene in
prokaryotic operons (Guo & Mrazek, 2008). Together,
both studies reflect the potential role of intergenic SSRs
in the regulation of gene expression.
The variability of TRs
The variability of TRs is thought to be one of the drivers
of genomic plasticity. The regions containing TRs are
potentially hypermutable by contraction (deletion) or
expansion (insertion) of TR units, and mutation frequencies up to 10⁻¹ have been reported in bacteria (Rando &
Verstrepen, 2007). Not surprisingly, the polymorphisms
found in TR loci provide a foundation for DNA genotyping approaches such as variable number TR-based typing
or multilocus variable repeat analysis, which are commonly applied for pathogen typing (Lindstedt, 2005,
2011; reviewed in Chiou, 2010).
The molecular mechanisms of TR variation
Based on extensive studies in both plasmid-based and
chromosomal systems, two nonexclusive mechanisms,
replication slippage and recombination, are currently
widely accepted to explain TR variation (Pearson et al.,
2005; Bichara et al., 2006; Gemayel et al., 2010). With
regard to the slippage mechanism, several models have
been proposed, and the most common one is strand-slippage mispairing (SSM), also called DNA slippage or
polymerase slippage (Kornberg et al., 1964; Streisinger
et al., 1966). This model proposes that TR rearrangements result from strand slippage during DNA replication. The process is initiated by the formation of a bulge
structure of unpaired repeats either on the template or
on the nascent strand. If the bulge is present on the template strand, it will result in a TR contraction (deletion)
in the newly synthesized DNA. In contrast, TR expansion
(insertion) will result when a bulge forms on the nascent
strand (Fig. 2). Currently, substantial evidence supports
the involvement of replication in TR instability in bacteria. In E. coli, for example, triplet nucleotide tracts in a
plasmid or in the chromosome were dramatically destabilized in a dnaQ49 mutant. The dnaQ gene encodes the
3′–5′ exonucleolytic ε-subunit of DNA polymerase III,
which is involved in proofreading, and it was therefore
suggested that this mutant failed to correctly remove
slipped structures in the TR tracts during replication (Iyer
et al., 2000; Zahra et al., 2007). Moreover, mutations in
the α subunit (encoded by dnaE) of DNA polymerase III
holoenzyme, γ and τ subunits of the clamp-loading complex (encoded by dnaX), and β clamp (encoded by dnaN)
have also been shown to increase instability of microsatellites and tandemly repeated DNA sequences (reviewed in
Bichara et al., 2006). In general, the effect of DNA replication on TR rearrangements provides evidence for the
replication slippage mechanism, because a prerequisite of
this mechanism is that DNA replication is stalled by the
secondary loop structures formed by TRs. Notably, the
SSM mechanism is not only widely accepted to account
for the majority of SSR variations, it is also invoked to
explain the genesis of TRs from unrepeated DNA (Levinson & Gutman, 1987; Waite et al., 2003; Lindbäck et al.,
2011).
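The outcome of the SSM model described above can be summarized as a small bookkeeping sketch. It is a deliberately simplified illustration, not a biophysical model: the function name is hypothetical, and the only rule encoded is the one stated in the text, namely that a bulge on the template strand yields a contraction in the newly synthesized strand and a bulge on the nascent strand yields an expansion.

```python
from typing import Optional


def nascent_copy_number(template_copies: int,
                        bulge_strand: Optional[str] = None,
                        bulged_units: int = 1) -> int:
    """Repeat copy number of the newly synthesized strand after one replication.

    bulge_strand: None (faithful replication), "template", or "nascent".
    bulged_units: number of repeat units looped out in the bulge.
    """
    if bulge_strand is None:
        return template_copies
    if bulged_units < 1:
        raise ValueError("a bulge must contain at least one repeat unit")
    if bulge_strand == "template":
        if bulged_units >= template_copies:
            raise ValueError("cannot loop out the entire tract")
        # Looped-out template units are skipped by the polymerase: contraction.
        return template_copies - bulged_units
    if bulge_strand == "nascent":
        # Looped-out nascent units are copied a second time: expansion.
        return template_copies + bulged_units
    raise ValueError("bulge_strand must be None, 'template', or 'nascent'")


unit = "CTG"
for strand in (None, "template", "nascent"):
    copies = nascent_copy_number(10, strand)
    print(strand, copies, len(unit * copies), "bp")
```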
Besides SSM, recombination is considered as another
mechanism of TR instability that could explain phenomena that cannot be explained by the slippage mechanism.
Generally, recombination is more important for the rearrangement of TRs with large unit size, whereas SSM is
the dominant mechanism underlying variation of TRs
with small unit size (Bi & Liu, 1996; Richard & Pâques,
2000; Bzymek & Lovett, 2001; Rocha, 2003; Gemayel
et al., 2010). Both homologous (RecA-dependent) and
illegitimate (RecA-independent) recombination can be
involved. Several models have been proposed for the
recombination mechanism, such as unequal crossover and
intramolecular recombination (Fig. 3), and evidence for
Fig. 2. Diagram illustrating the replication slippage mechanism of TR rearrangement (panel labels: replication, strand dissociation,
contraction, expansion). Repeat units are shown as blocks on nascent (light green)
and template (dark green) DNA strands. Shown is a partially replicated TR region undergoing transient dissociation and mispairing, resulting in a
bulge on the template or the nascent strand and leading to deletion (contraction) or insertion (expansion) of TRs, respectively. Space between repeat units has only been
introduced to improve visual clarity of the figure.
the involvement of recombination in TR variations is
accumulating. For example, it has been suggested that
double-strand DNA break (DSB) repair can induce TR
instability via homologous recombination in prokaryotes
and eukaryotes (Hebert & Wells, 2005; reviewed in Richard et al., 2008; Malkova & Haber, 2012), and mechanistic models explaining this process have been elaborated.
One such model is the DSB repair slippage model, which
combines the double Holliday junction intermediate
pathway with the strand-slippage model (Richard et al.,
2008). In an alternative explanation, the synthesis-dependent strand annealing pathway also contributes to the TR
rearrangements mediated by DSB repair (Pâques et al.,
1998, 2001; Richard et al., 1999; Richard & Pâques,
2000). This model is proposed for explaining TR rearrangements that are rarely associated with crossover
events (Pâques et al., 1998, 2001; Richard et al., 1999).
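The copy-number consequences of the two recombination models named above (unequal crossover and intramolecular recombination) can be sketched as follows. The function names and the 1-based indexing of repeat units are assumptions made for illustration; the sketch tracks only repeat counts and ignores the enzymology (RecA dependence, DSB repair intermediates) discussed in the text.

```python
from typing import Tuple


def unequal_crossover(copies_a: int, copies_b: int,
                      unit_a: int, unit_b: int) -> Tuple[int, int]:
    """Copy numbers of the two products when repeat unit `unit_a` of molecule A
    pairs with repeat unit `unit_b` of molecule B (1-based) and a crossover follows.
    """
    if not (1 <= unit_a <= copies_a and 1 <= unit_b <= copies_b):
        raise ValueError("crossover units must lie within the repeat tracts")
    # Product 1: left part of A (unit_a units) joined to the right part of B.
    product_1 = unit_a + (copies_b - unit_b)
    # Product 2: left part of B (unit_b units) joined to the right part of A.
    product_2 = unit_b + (copies_a - unit_a)
    return product_1, product_2


def intramolecular_recombination(copies: int, unit_i: int,
                                 unit_j: int) -> Tuple[int, int]:
    """Recombination between units i < j of one tract: returns
    (units retained in the molecule, units in the excised fragment)."""
    if not (1 <= unit_i < unit_j <= copies):
        raise ValueError("need 1 <= unit_i < unit_j <= copies")
    excised = unit_j - unit_i
    return copies - excised, excised


print(unequal_crossover(10, 10, 6, 4))          # misaligned tracts
print(unequal_crossover(10, 10, 5, 5))          # aligned tracts, no net change
print(intramolecular_recombination(10, 3, 7))
```

In the unequal crossover the total number of repeat units is conserved, so an expansion in one product is always matched by a contraction in the other.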
Factors influencing TR rearrangement
frequencies
While TRs generally are intrinsically prone to incur contraction or expansion, the actual frequency of these events
can vary widely depending on both intrinsic (structural)
and extrinsic (environmental) factors. Regarding intrinsic
TR features, a positive correlation has been established
between TR copy number and rearrangement frequency
in several studies. Through in silico genome analysis of
multiple strains of 42 fully sequenced prokaryotic species,
Lin & Kussell (2012) observed that the variability of three
types of SSRs (monomeric, dimeric, and trimeric repeats)
increased dramatically with the number of repeat units. In
another study, an exponential relation between the number of repeat units and rearrangement frequency was
observed in a comparison of 30 artificially constructed
TRs (unit lengths of 2, 10, and 20 nucleotides; number of
units between 2 and 50; sequence conservation between
62.5% and 100%; Legendre et al., 2007). This also
accounts for the fact that long TRs are relatively uncommon (Lai & Sun, 2003). Similar findings have been
reported in several other studies (Goldstein & Clark,
1995; Brinkmann et al., 1998; Lai & Sun, 2003; Vogler
et al., 2006). Likewise, a positive relationship is generally
found between TR mutation frequency and the size of the
repeat unit (Sia et al., 1997; Schug et al., 1998; Eckert &
Hile, 2009; Bayliss et al., 2012) as well as the degree of
Fig. 3. Diagram illustrating the recombination mechanism of TR rearrangement (panel labels: unequal crossover, yielding expansion and
contraction; intramolecular recombination, yielding contraction). DNA molecules with repeat units represented as blocks are
shown in different colors. (a) Model of unequal crossover. When repeat units of two different DNA molecules misalign, unequal crossover can
occur, resulting in repeat expansion in one crossover product and repeat contraction in the other. (b) Model of intramolecular recombination.
When recombination occurs between repeat units within a DNA molecule, two products are generated that have undergone a repeat
contraction.
conservation between repeats (Legendre et al., 2007). In
addition, the GC content of the TR is also an important
factor determining stability, because repeated sequences
are prone to form diverse non-B DNA structures (i.e.
hairpins and triplexes), which may cause pausing of the
DNA polymerase and replication fork collapse, and in
turn necessitate intervention of the repair and recombination machinery to reinitiate replication (Wells et al., 2005;
Choudhary & Trivedi, 2010). Besides, the orientation of
the repeats with respect to the direction of replication can
affect the mutation ratio as well, because TRs are more
prone to form secondary structures in one orientation
than in the other (Hebert et al., 2004), and replication
fidelity is not equal in the leading strand compared with
the lagging strand (Gawel et al., 2002). Intriguingly, based
on a theoretical model, Lai & Sun (2003) suggested that
expansion occurs more frequently for short microsatellites, while contraction occurs more frequently for long
ones, suggesting the rearrangement pattern might be
dependent on the repeat type. However, experimental evidence to validate this prediction is still lacking.
Aside from intrinsic factors, extrinsic environmental
conditions may also affect TR rearrangement frequencies.
For instance, it was shown that several TRs used in a
multilocus TR typing scheme of E. coli showed enhanced
variation at increased growth temperature and upon starvation in E. coli O157:H7, but not upon irradiation (Cooley
et al., 2010). Likewise, in sclB of Streptococcus pyogenes,
encoding a collagen-like surface protein, TR variations
occurred during growth in fresh human blood, but not in
medium (Rasmussen & Björk, 2001). However, the
underlying mechanisms by which environmental stresses
affect the TR mutation frequency are poorly understood.
The phenotypical impact of intergenic
TR variations in bacteria
Accumulating evidence indicates that rearrangements of intergenic TRs can confer transcriptional evolvability (Jansen
et al., 2012). More specifically, SSRs positioned as cis-regulatory elements around the promoter region can induce
phase variation (i.e. stochastic, high frequency, reversible
switching of genotype, and/or phenotype) by modulating
the transcription of the corresponding genes. Most studies
indicate that intergenic SSRs, except monomeric SSRs,
involved in phase variation tend to be A/T rich, which
makes them prone to melting and SSM. In this section, we
review examples of this mechanism of phase variation
according to the different positions of intergenic TRs relative to the transcriptional start site (Fig. 4; Table 1).
Fig. 4. Scheme showing possible positions of intergenic TRs in a
standard promoter region that can cause phase variation (diagram labels: UAS, RNA pol, −35, −10, +1, SD, ATG, ORF;
repeat regions A, B, and C). Indicated
are an ORF, a promoter with −10 and −35 signatures that are
recognized by RNA polymerase (RNA pol), an upstream regulatory
sequence to which activators or repressors can bind (UAS), the
transcription initiation site (+1), a Shine–Dalgarno sequence for
ribosome binding (SD), and a translation start codon (ATG). Repeats
in regions A and B (double-headed arrows) can modulate gene
expression by affecting transcription initiation (see text TRs upstream
of −35 site of promoter and TRs between −35 and −10 sites of
promoter), whereas repeats in region C operate via as yet
unidentified means (see text TRs between transcriptional start site and
ORF; adapted from van der Woude & Baumler, 2004).
TRs upstream of −35 site of promoter
When TRs are present upstream of the −35 site, the
repeat copy number is expected to affect the binding of
transcription factors and thus modulate gene expression.
This has been reported in the pathogen N. meningitidis,
where expression of the nadA gene (encoding a protein
that functions as invasin and adhesin) is regulated by the
number of tetrameric repeats (TAAA) within the
upstream region of the RNA polymerase binding site,
resulting in three significantly distinct transcriptional levels (high, intermediate, and low). The frequency of phase
variation between these levels has been estimated at
c. 4.4 × 10⁻⁴. Remarkably, the levels of nadA transcription show a periodic rather than a monotonic relation
to the number of repeat units. As such, the transcription
level varies in the order low–high–intermediate–low–high
for repeat copy numbers of 4–5–6–7–8 and again for
9–10–11–12–13 repeats. Further mechanistic studies indicated that variation of the tetrameric repeats affects the
binding of the integration host factor transcriptional
regulator protein to the nadA promoter (Martin et al.,
2003, 2005). More recently, Metruccio et al. (2009) discovered that depending on the number of TAAA repeats,
a novel repressor (NadR) prevents transcription of nadA
through binding of two operators flanking the variable
tetrameric repeat tract on both sides. As a result, it was
proposed that alteration of the spacing between these two
operators by variation of the number of TAAA repeats
may affect the ability of NadR to repress nadA expression
(Metruccio et al., 2009).
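The periodic relation between TAAA copy number and nadA transcription reported above can be written as a simple lookup. The sketch encodes only the pattern stated in the text for 4 to 13 repeats; the function name is hypothetical, and nothing is assumed about copy numbers outside that range, for which the function raises an error.

```python
# Pattern reported for 4-5-6-7-8 repeats, and again for 9-10-11-12-13 repeats.
_NADA_PATTERN = ("low", "high", "intermediate", "low", "high")
_UNIT_BP = 4  # TAAA is a tetrameric repeat


def nada_transcription_level(copies: int) -> str:
    """Reported nadA transcription level for a given number of TAAA repeats."""
    if not 4 <= copies <= 13:
        raise ValueError("pattern is only described for 4 to 13 repeats")
    return _NADA_PATTERN[(copies - 4) % len(_NADA_PATTERN)]


for n in range(4, 14):
    print(n, n * _UNIT_BP, "bp", nada_transcription_level(n))

# The pattern repeats every 5 units, i.e. every 20 bp of tract length.
print(len(_NADA_PATTERN) * _UNIT_BP)
```

A periodicity in tract length, rather than a steady increase or decrease, is what would be expected if the repeats act mainly by changing the spacing between the flanking binding sites, as proposed by Metruccio et al. (2009).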
TRs between −35 and −10 sites of promoter
The spacing between the −35 and −10 sites of a standard
promoter is critical for the binding efficiency of RNA
polymerase (Fig. 4), and the optimal distance is around
17 bp in bacteria. Consequently, when TRs locate in that
region, copy number changes are expected to modulate
the transcription level of the genes in the transcriptional
unit. An example of this is found in the host-adapted
pathogen H. influenzae, which adheres to human epithelial cells with the help of LKP fimbriae (also called long,
thick pili). These fimbriae are encoded by the hif gene
cluster and important for H. influenzae infection at different stages (van Ham et al., 1993). Phase variation in the
expression of these fimbriae is mediated by a string of
dinucleotide repeats between the −35 and −10 sites
within the overlapping, but divergent promoter regions of
hifA and hifB genes, with TR copy numbers of 9, 10, or
11, respectively, resulting in no, high, and low expression
of both genes.
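To make the geometric argument concrete, the following sketch computes how the promoter spacer length changes with TR copy number. The calibration values (a 16-bp spacer at 10 dinucleotide copies) are hypothetical and chosen only for illustration; they are not taken from the studies cited above, and the offset from the c. 17-bp optimum is not meant to predict expression levels by itself.

```python
OPTIMAL_SPACER_BP = 17  # approximate bacterial optimum stated in the text


def spacer_length(copies, unit_bp=2, ref_copies=10, ref_spacer_bp=16):
    """Spacer length (bp) between the -35 and -10 sites for a TR copy number.

    ref_copies and ref_spacer_bp are hypothetical calibration values:
    the spacer is assumed to be ref_spacer_bp long at ref_copies repeats.
    """
    return ref_spacer_bp + unit_bp * (copies - ref_copies)


def offset_from_optimum(copies, **kwargs):
    """Signed difference (bp) between the spacer and the c. 17-bp optimum."""
    return spacer_length(copies, **kwargs) - OPTIMAL_SPACER_BP


if __name__ == "__main__":
    for n in (9, 10, 11):
        print(n, spacer_length(n), offset_from_optimum(n))
```

Each gain or loss of one dinucleotide unit shifts the spacer by 2 bp, so a single slippage event is enough to move the promoter elements noticeably relative to each other.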
Another intriguing example is the FetA protein of Neisseria gonorrhoeae, an iron-repressible protein functioning
as ferric enterobactin receptor. The expression of FetA
exhibits extremely rapid phase variation (switching
frequency up to 1.3% per generation) correlating with
polymorphism of a poly-C tract between the −10 and
−35 regions of the fetA promoter. The various lengths of
the poly-C tract result in either high or low expression.
Table 1. Overview of mechanisms by which intergenic TRs can modulate gene expression
Location of TRs | Mechanisms | References
Upstream of −35 site | Affects transcription initiation by modifying binding affinity of regulatory proteins | Miller et al. (1987), Martin et al. (2003, 2005), Metruccio et al. (2009)
Between −35 and −10 sites | Affects transcription initiation by altering the distance of promoter elements | Willems et al. (1990), Yogev et al. (1991), van Ham et al. (1993), Sarkari et al. (1994), Carson et al. (2000), van der Ende et al. (2000)
Between transcriptional start and ORF | May modify binding affinity of regulatory proteins or mRNA stability | Lafontaine et al. (2001), Attia & Hansen (2006)
Between two separate transcription start sites | Unknown | Dawid et al. (1999)
ORF, open reading frame; TR, tandem repeat.
FEMS Microbiol Rev 38 (2014) 119–141
© 2013 Federation of European Microbiological Societies.
Published by John Wiley & Sons Ltd. All rights reserved
It was suggested that phase variation of FetA reflects a
balance between the advantages of iron scavenging, on
the one hand, and evasion of the host immune response
(based on FetA immunogenicity), on the other (Carson
et al., 2000).
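How quickly a switching frequency of this magnitude diversifies a population can be illustrated with a minimal two-state model. The sketch assumes symmetric ON-to-OFF and OFF-to-ON switching at the same per-generation rate and no selection; both are simplifying assumptions introduced here, not properties reported for FetA.

```python
def fraction_on(generations, switch_rate, initial_on=1.0):
    """Expected fraction of ON cells after a number of generations.

    Assumes a symmetric two-state model: each generation, a cell flips
    state with probability switch_rate, and there is no selection.
    """
    p = initial_on
    for _ in range(generations):
        # ON cells that stay ON plus OFF cells that switch ON.
        p = p * (1.0 - switch_rate) + (1.0 - p) * switch_rate
    return p


if __name__ == "__main__":
    # 0.013 corresponds to the upper bound of 1.3% per generation in the text.
    for g in (0, 10, 50, 200):
        print(g, round(fraction_on(g, 0.013), 4))
```

Under these assumptions the ON fraction relaxes toward 0.5 regardless of the starting state, which is why a clonal inoculum can become phenotypically mixed within a modest number of generations.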
A more complex situation is found in the PorA outer
membrane protein in N. meningitidis. Not only is expression of the porA gene stochastically modulated by two
variable homopolymeric tracts (poly-G and poly-T)
between the −35 and −10 sites of the promoter region,
there is also a variable poly-A tract within the porA
coding region. Both sites are believed to serve evasion of
the host immune response to this protein and explain the
poor efficacy observed for PorA-based vaccines (van der
Ende et al., 2000).
Further examples include the genes encoding the lipoprotein Vlps of Mycoplasma hyorhinis (Yogev et al.,
1991), the fimbrial subunits of Bordetella pertussis (Willems et al., 1990), and the outer membrane protein Opc
of N. meningitidis (Sarkari et al., 1994), and it can be
concluded that variable SSRs within the −35 and −10
region represent a widespread strategy of phase variation
by modulated gene expression in pathogens.
TRs between transcriptional start site and ORF
While intergenic TR-dependent phase variations in bacteria mostly belong to the two classes described above,
some cases are mediated by TRs located between the transcriptional start site and the ORF. In the Gram-negative
pathogen Moraxella catarrhalis, the UspA1 protein functions as an adhesin to mediate binding to human epithelial cells (Lafontaine et al., 2000). Its expression was
shown to be phase variable and correlated with adherence
capacity. Sequence analysis revealed a variable homopolymeric poly-G tract between the transcriptional start site
and the start codon to be responsible for the variability
in uspA1 expression. Stable mRNA and strong expression
of UspA1 was detected with a 10-bp G repeat tract, while
truncated mRNA and weak expression occurred with a 9-bp G tract. Based on these observations, it was proposed
that alterations in the poly-G tract affect the binding efficiency of transcriptional regulators and/or the stability of
the uspA1 mRNA (Lafontaine et al., 2001).
Interestingly, a similar case was recently uncovered in
another outer membrane protein of M. catarrhalis,
UspA2, which is involved in serum resistance and vitronectin binding (Attia & Hansen, 2006). It was observed
that a tetrameric (AGAT) repeat tract, located between
the uspA2 transcription start site and the start codon, is
highly variable in M. catarrhalis isolates. Moreover, the
expression level of UspA2 and, as a result, the serum
resistance and vitronectin binding capacity of the cells
were increased as the repeat copy number increased, possibly because the variable number of AGAT repeats may
affect the secondary structure of the UspA2 mRNA
transcript.
TRs between two separated transcription start sites
This special example of phase switching induced by
variable intergenic TRs has so far only been described in
H. influenzae (Dawid et al., 1999). HMW1 and HMW2
are two adhesins of H. influenzae that exhibit different
cellular binding specificities and are encoded by two separate chromosomal loci hmw1AC and hmw2AC, respectively. Both genes show 80% similarity, suggesting that
they can be considered as alleles (Barenkamp & Leininger,
1992; Ecevit et al., 2004). Promoter analysis revealed two
transcription start sites (P1 and P2) within the upstream
region of each gene and a heptameric (ATCTTTC) TR
array between P1 and P2 of both genes. The occurrence of
variations in the number of TR copies has been confirmed
in H. influenzae isolates from patients with chronic
obstructive pulmonary disease (Dawid et al., 1999; Cholon
et al., 2008). These variations can affect mRNA synthesis
and thereby influence the expression of corresponding proteins. For example, an increasing number of TR units
resulted in a gradual decrease in specific mRNA synthesis
and protein expression and vice versa. However, the
underlying molecular mechanism remains obscure.
The phenotypical impact of intragenic TRs in bacteria
There is abundant evidence that, besides intergenic TRs,
intragenic TRs can also trigger phase variation (Table 2).
However, the underlying mechanisms are dependent on
the nature of the TR. More specifically, if the TR unit size
is not a multiple of three, rearrangements are able to
induce frameshift mutations as the cause of ON–OFF
phase variation. In comparison, phase variation induced
by TRs whose unit size is a multiple of three is more
complex and is probably related to specific structural and
functional alterations of the corresponding proteins
(Gemayel et al., 2010). In this section, we review studies
on the phenotypical impact of intragenic TRs, grouped
according to their location in different functional classes
of genes.
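The distinction between repeat units whose size is or is not a multiple of three reduces to modular arithmetic. The sketch below is a minimal illustration with hypothetical function names; it only tracks the net shift of the reading frame and does not model whether the starting allele was in frame or what the altered protein does.

```python
def reading_frame_shift(unit_bp: int, copy_change: int) -> int:
    """Net reading-frame shift (0, 1 or 2 nt) after gain or loss of units.

    unit_bp is the repeat unit length; copy_change is positive for
    expansions and negative for contractions.
    """
    return (unit_bp * copy_change) % 3


def stays_in_frame(unit_bp: int, copy_change: int) -> bool:
    """True if the rearrangement leaves the downstream frame unchanged."""
    return reading_frame_shift(unit_bp, copy_change) == 0


if __name__ == "__main__":
    print(stays_in_frame(4, 1))    # tetranucleotide unit, one unit gained
    print(stays_in_frame(4, 3))    # tetranucleotide unit, three units gained
    print(stays_in_frame(21, -1))  # 21-bp unit, one unit lost
```

For units that are not a multiple of three, only changes of three units (or a multiple thereof) preserve the frame; for units that are a multiple of three, every rearrangement is in frame, so any phenotypic effect must arise from the altered protein itself.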
Cell surface structural genes with TRs
It has been noted that TRs are most abundant in genes
whose products are either exposed on the cell surface or
involved in the biogenesis of cellular surface structures,
Table 2. Examples of intragenic TRs causing phase switching in bacteria
Affected moiety | Bacterial species | Gene(s) or operon | Repeat motif* (5′–3′) | References
Adhesin | Haemophilus influenzae | cha | 56 bp | Sheets & St Geme (2011)
Adhesin | Helicobacter pylori | sabA | CT | Goodwin et al. (2008)
Adhesin | Neisseria gonorrhoeae | opa | CTCTT | Murphy et al. (1989)
Adhesin | Mycoplasma hominis | vaa | A | Zhang & Wise (1997)
Adhesin | Legionella pneumophila | lcl | 45 bp† | Vandersmissen et al. (2010)
Capsule | Streptococcus pneumoniae | cps15bM | TA | van Selm et al. (2003)
Capsule | Streptococcus pneumoniae | cap8E | 223 bp | Waite et al. (2003)
Capsule | Streptococcus pneumoniae | tts | 22 bp | Waite et al. (2003)
Capsule | Neisseria meningitidis | siaD | C | Hammerschmidt et al. (1996b)
Capsule | Escherichia coli | neuO | AAGACTC | Deszo et al. (2005)
Effector | Xanthomonas campestris | avrBs3 | 102 bp | Herbers et al. (1992), Kay et al. (2007)
Fe binding | Haemophilus influenzae | hgp | CCAA | Jin et al. (1999), Ren et al. (1999)
Fe binding | Neisseria meningitidis | hpuA, hmbR | G | Lewis et al. (1999), Richardson & Stojiljkovic (1999)
Flagellin | Campylobacter coli | flhA | T | Park et al. (2000)
Lipoprotein | Mycoplasma hyorhinis | vlp | 24/26 bp† | Rosengarten & Wise (1991)
Lipoprotein | Mycoplasma bovis | vspA | 18/24 bp† | Lysnyansky et al. (1996)
Lipoprotein | Mycoplasma pulmonis | vsa | 34 bp | Bhugra et al. (1995)
Lipopolysaccharides | Haemophilus influenzae | losA | CGAGCATA | Erwin et al. (2006)
Lipopolysaccharides | Haemophilus influenzae | lic1, lic2A, lic3A | CAAT | Hosking et al. (1999), Dixon et al. (2007)
Lipopolysaccharides | Haemophilus influenzae | lex2, oafA | GCAA | Griffin et al. (2003), Fox et al. (2005)
Lipopolysaccharides | Neisseria meningitidis | lgtA, C, D | G | Yang & Gotschlich (1996), Jennings et al. (1999)
Lipopolysaccharides | Helicobacter pylori | futA, futB | C & 21 bp‡ | Appelmelk et al. (1999), Nilsson et al. (2008)
Metabolism | Escherichia coli | xylB | G | Funchain et al. (2000)
Mismatch repair | Escherichia coli | mutL | CTGGCG | Shaver & Sniegowski (2003)
Mismatch repair | Salmonella Typhimurium | mutL | GCTGGC | Chen et al. (2010)
Outer membrane protein | Neisseria meningitidis | porA | G | van der Ende et al. (2000)
Outer membrane protein | Group B streptococci | bca | 246 bp | Gravekamp et al. (1996, 1997)
Outer membrane protein | Mycoplasma fermentans | p78 | A | Theiss & Wise (1997)
Inner membrane protein | Escherichia coli | tolA | 15–18 bp† | Zhou et al. (2012a, b)
Peroxiredoxin | Escherichia coli | ahpC | TCT | Ritz et al. (2001)
Pilus | Neisseria gonorrhoeae | pilC | G | Jonsson et al. (1991, 1992)
Pilus | Legionella pneumophila | fimV | 18 bp† | Coil & Anne (2010)
Regulator | Listeria monocytogenes | ctsR | GGT | Karatzas et al. (2003, 2005)
Regulator | Listeria monocytogenes | prfA | CAGGAGT | Lindbäck et al. (2011)
R-M§ system I | Haemophilus influenzae | hsdM | GACGA | Zaleski et al. (2005)
R-M§ system I | Neisseria gonorrhoeae | hsdS | G | Adamczyk-Poplawska et al. (2011)
R-M§ system III | Haemophilus influenzae | modA | AGTC | Srikhanta et al. (2005)
R-M§ system III | Neisseria spp. | modA | AGCC | Srikhanta et al. (2009)
R-M§ system III | Neisseria spp. | modB | CCCAA | Srikhanta et al. (2009)
R-M§ system III | Neisseria meningitidis | modD | ACCGA | Seib et al. (2011)
R-M§ system III | Helicobacter pylori | res | C | de Vries et al. (2002)
R-M§ system III | Helicobacter pylori | modH | G | Srikhanta et al. (2011)
R-M§ system III | Pasteurella haemolytica | mod | CACAG | Ryan & Lo (1999)
R-M§ system IV | Escherichia coli | mrr | G | Tesfazgi Mebrhatu et al. (2011)
TR, tandem repeat.
*The motif sequences of microsatellites (≤ 9 bp) are listed; for longer TRs, only the length is given.
†Different repeat unit lengths or sequences have been reported for this TR locus.
‡Two TR loci exist at different positions in the same gene.
§Restriction–modification system.
such as lipopolysaccharides (LPS), adhesins, pili, fimbriae,
and capsules (Moxon et al., 1994; Jordan et al., 2003;
Verstrepen et al., 2004; Gibbons & Rokas, 2009; Janulczyk
et al., 2010; Jerome et al., 2011). Extensive studies in different organisms and with different cell surface genes
support the notion that stochastic TR-based switching
contributes to the rapid generation of diversity in surface
structures, which in pathogens can serve as a mechanism
for escaping the immune response and/or for determining
tissue tropism. Some examples are reviewed in more detail below.
TRs within LPS biosynthesis genes
LPS is a complex macromolecule in Gram-negative bacteria that is composed of three distinct parts (i.e. lipid A,
core sugar, and O-antigen side chains), and a large number of genes are involved in the synthesis and export of
LPS. As one of the major cell surface antigens, LPS is
often implicated in cell adhesion and virulence. Furthermore, because LPS molecules are an essential structural
component of the outer membrane that forms the outer
shell of the Gram-negative cell, it also determines bacterial resistance to a variety of toxic chemicals, including
some antibiotics and xenobiotics. A notable feature of
LPS of some pathogens is the extensive intra- and interstrain heterogeneity of the glycoform structure (i.e. the
moieties comprising the core and O side chain sugars),
which is mainly due to incomplete biosynthesis during
the stepwise assembly of the sugar residues resulting from
the phase-variable expression of the corresponding
biosynthesis genes (Schweda et al., 2007). Typically, the
phase variability of these genes derives from the fact that
they contain nontrimeric TR tracts that exhibit stochastic
variation. In some pathogens, this type of stochastic variation occurs in more than one gene for LPS synthesis,
turning phase switching into a combinatorial process.
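The combinatorial character of multi-locus phase switching can be illustrated by enumerating ON/OFF states. The sketch below assumes that each locus switches independently and that every combination is viable; the gene names are borrowed from the H. influenzae lic loci discussed in this section, and the count should be read as an upper bound, since not every combination necessarily yields a distinct LPS structure.

```python
from itertools import product


def expression_states(genes):
    """List all ON/OFF combinations for independently switching genes."""
    return [
        dict(zip(genes, combo))
        for combo in product(("ON", "OFF"), repeat=len(genes))
    ]


if __name__ == "__main__":
    states = expression_states(["lic1", "lic2A", "lic3A"])
    print(len(states))  # number of combinations for three loci
    for state in states:
        print(state)
```

With n independently switching loci the number of expression states grows as 2^n, which is how a handful of repeat tracts can generate a broad repertoire of surface variants in a clonal population.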
Unlike that of most Gram-negative bacteria, the core
LPS of H. influenzae lacks the homopolymeric sugar units
comprising the O-antigen side chains. Therefore, phase
variation in LPS biosynthesis of H. influenzae occurs
mainly through reversible expression switching of a subset
of TR-containing genes, involved in the addition of core
sugars (i.e. glucose and sialic acid) to the conserved
tri-heptose backbone and in the addition of phosphorylcholine or acyl groups to these core sugars (Moxon
et al., 2006; Fig. 5a). These genes include lic1A, lic1B,
lic1C, and lic1D (collectively designated as lic1 locus), and
lic2A, lic3A, lgtC, lex2, and oafA, each of which contains a
variable tetrameric repeat tract (reviewed in Schweda
et al., 2007). Rearrangements in the tetrameric repeat
tracts of these genes can individually turn their expression
on or off, resulting in a modified LPS molecule at the
single cell level, and a repertoire of different LPS epitopes
throughout the population (Fig. 5b; Moxon et al., 1994;
Bayliss et al., 2001; Moxon et al., 2006). Remarkably, all
the phase-variable genes involved in LPS biosynthesis of
H. influenzae account for structural elements directly relevant to virulence as well (Schweda et al., 2007).
Interestingly, searching for genes harboring tetrameric
TRs has proven to be a successful strategy to identify
novel LPS biosynthesis genes in H. influenzae genomes
(Hood et al., 1996; Fox et al., 2005). Similar mechanisms
generating LPS diversity have also been described in
N. meningitidis, where the ON/OFF variation in expression of genes of the lgt gene family (encoding LPS biosynthesis functions) is mediated by stochastic expansions
or contractions in their intragenic homopolymeric tracts
and where the corresponding LPS structures have so far
been classified into 12 immunotypes (Jennings et al.,
1999; Berrington et al., 2002; Bayliss et al., 2008).
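The strategy of screening genomes for tetrameric repeat tracts can be sketched with a regular expression that uses a backreference. This is a minimal illustration that detects perfect repeats only; the function name, the thresholds, and the test sequence are hypothetical and do not correspond to the tools used in the cited studies.

```python
import re


def find_tandem_repeats(seq, unit_bp=4, min_copies=3):
    """Return (start, unit, copies) for perfect tandem repeats.

    Only units of exactly unit_bp nucleotides are reported. Units that
    are themselves built from a shorter motif (for example AAAA or ATAT)
    are skipped. The unit is reported in the phase first encountered
    when scanning from the 5' end.
    """
    pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_bp, min_copies - 1))
    hits = []
    for match in pattern.finditer(seq.upper()):
        unit = match.group(1)
        # Skip units that reduce to a shorter repeating motif.
        if any(
            unit == unit[:k] * (unit_bp // k)
            for k in range(1, unit_bp)
            if unit_bp % k == 0
        ):
            continue
        hits.append((match.start(), unit, len(match.group(0)) // unit_bp))
    return hits


if __name__ == "__main__":
    # Synthetic sequence for illustration only.
    print(find_tandem_repeats("GGCAATCAATCAATCAATGG"))
```

A real screen would additionally need to handle both strands, imperfect repeats, and gene annotation to decide whether a tract is intragenic.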
The O-chain of H. pylori LPS can be fucosylated, thereby
generating Lewis antigens that mimic human blood group
antigens and mediate immune evasion (reviewed in Kusters
et al., 2006). Lewis antigens of H. pylori are subject to
reversible, high-frequency phase variation, and one of the
mechanisms is the slipped-strand mispairing of poly-C
tracts within three fucosyltransferase genes (futA, futB, and
futC; Appelmelk et al., 1999; Wang et al., 1999). In addition, a unique TR region with a 21-bp unit size was recently
uncovered in the 3′ end of futA and futB, but not in futC.
Strikingly, although copy number variations of the 21-bp
TR tract in both futA and futB did not alter the reading
frame, they could affect the fucosyltransferase activity. In
fact, a correlation was observed between the copy number
of the 21-bp TRs and the number of O-antigen units being
fucosylated, and the addition of one repeat unit led to the
addition of an N-acetyl-β-lactosamine (LacNAc) unit in
the O-antigen polysaccharide (Nilsson et al., 2006, 2008).
These studies show that the variability of TRs in futA, futB,
and futC increases the antigen diversity and population
heterogeneity and thereby supports adaptability of
H. pylori to fluctuating conditions in the gastric mucosa.
TRs within capsule biosynthesis genes
As one of the most external structures on the bacterial
surface, capsules may completely conceal other antigenic
surface molecules or may be co-exposed with other antigens, which are thought to be important for pathogenicity
of bacteria. Production of a capsule provides some pathogenic bacteria with resistance to phagocytic and complement-mediated killing and, at the same time, affects
bacterial attachment to host cells. Consequently, the ability
to regulate capsule expression might confer a selective
advantage for pathogens to cope with host immune
responses during different stages of the infection process.
For example, it was found that acapsulate variants of
N. meningitidis serogroup B show much higher adherence
and invasion of epithelial cells than their capsulated progenitors. Such variants can be generated at high frequency
due to SSM of a poly-C tract in the polysialyltransferase
[Fig. 5 graphic: (a) LPS structure diagram showing the lic1A, lic2A, and lic3A attachment points; (b) ON/OFF states of lic3A, lic2A, and lic1 giving LPS Types I–IV.]
Fig. 5. Structural variations of LPS molecules induced by TR rearrangements in LPS biosynthesis genes. (a) Representation of one possible
structure of the LPS from Haemophilus influenzae strain Rd. The conserved tri-heptose backbone is highlighted in bold. Three phase-variable lic
genes involved in the addition of specific components to the backbone are shown. Kdo, 2-keto-3-deoxyoctulosonic acid; Hep, L-glycero-D-mannoheptose; Glc, D-glucose; Gal, D-galactose; NeuAc, N-acetylneuraminic acid; PEtn, phosphoethanolamine; PC, phosphorylcholine. (b) Scheme
showing some of the possible LPS types generated on the cell surface of H. influenzae Rd by the combinatorial effect of three LPS synthesis
genes with phase-variable expression. lic2A is responsible for adding a galactose (Gal); lic3A, for adding sialic acid (NeuAc); and lic1, for adding
phosphorylcholine (PC). The repeat tracts are shown by hatched boxes in the gray arrow representation of each gene. The TRs in each of these
genes are subject to variation, which can cause a frameshift and thus block expression (indicated by ON or OFF). Types I–IV represent microbial
cells with different LPS antigens depending on lic1, lic3A, and lic2A gene expression (adapted from Moxon et al., 2006).
gene siaD (Hammerschmidt et al., 1996a, b; Spinosa et al.,
2007). On the other hand, meningococcus isolates from
the blood of meningitis patients are almost always capsulated, while both capsulated and noncapsulated strains
typically coexist in the nose or throat of healthy carriers.
Because phase variation of siaD is reversible, it was therefore proposed that infection is initiated by acapsulate
strains, but that capsule biosynthesis is reactivated at a
later stage during infection, allowing N. meningitidis to
resist the host immune system and to cause disease (i.e.
sepsis and meningitis). However, other findings indicate
that the presence of a capsule does not necessarily preclude
invasiveness, and the role of the capsule and capsule phase
switching in different stages of meningococcal infection
may therefore be more subtle and strain dependent
(Spinosa et al., 2007; Bartley et al., 2013).
Besides binary ON/OFF switching, some bacteria can
also modulate the composition of their capsule by a
mechanism of TR-dependent phase switching. This is
exemplified by the polysialic acid capsule (K1 antigen) of
E. coli K1, which can be modified through phase-variable
expression of the sialyl O-acetylating activity, resulting in
an altered immunogenicity and susceptibility to glycosidases. The phase-variable acetylation is driven by a heptanucleotide TR tract (AAGACTC; copy number typically
14–39) within the O-acetyltransferase gene neuO (Deszo
et al., 2005). Loss or gain of a number of repeat units
that is not a multiple of three disrupts
the neuO reading frame and, consequently, NeuO expression
(Deszo et al., 2005). Functional analysis furthermore
revealed enhanced desiccation resistance, but reduced biofilm formation in E. coli K1 with active NeuO, suggesting
not only a role in host interaction, but also a more subtle
ecological impact of phase-variable neuO expression
(Mordhorst et al., 2009). Interestingly, each set of three
repeat units encodes a protein structure designated the
poly(ψ) motif, and NeuO enzymatic activity was found
to increase with the number of poly(ψ) motifs, supporting maintenance of high repeat copy numbers in the
population (Bergfeld et al., 2007).
Another intriguing aspect of this capsular acetylation is
that the neuO gene resides in a lambdoid prophage
termed ‘CUS-3’, and mitomycin treatment of E. coli K1
can induce release of this phage, which specifically infects
K1 antigen-expressing bacteria (Deszo et al., 2005; King
et al., 2007). This suggests that variant neuO alleles can
be redistributed among K1-encapsulated bacteria via horizontal transfer. Indeed, CUS-3/neuO has been found in
several serotypes, such as O18 and O45 (King et al.,
2007). On the other hand, although the receptor for
CUS-3 is polysialic acid, superinfection was not prevented
by NeuO-mediated acetylation (Vimr & Steenbergen,
2006; King et al., 2007). Notably, sialic acid is also known
as a modification of LPS in numerous pathogens, underscoring the vital role of O-acetylation in causing structure
variation of polysaccharide epitopes (also see TRs within
LPS biosynthesis genes).
Mini- and macrosatellites within capsule
biosynthesis genes of Gram-positive pathogens can also mediate capsule diversity. For example, noncapsulated serotype 3
Streptococcus pneumoniae strains were shown to carry an
out-of-frame perfect tandem duplication in one of the
capsule biosynthesis genes (i.e. cap3A). Interestingly,
based on the sequence and length of the duplication, at
least seven different cap3A alleles were found in different
nonencapsulated strains. Analysis of the phase reversion
frequency (OFF to ON) induced by TR contractions
revealed a positive correlation between the frequency of
reversion and the length of the duplication (Waite et al.,
2001). A similar mechanism of capsular phase variation
correlating with a 223-bp and a 22-bp perfect tandem
duplication in cap8E and tts was also demonstrated in
serotype 8 and serotype 37, respectively, of S. pneumoniae
(Waite et al., 2003).
TRs within adhesin-associated genes
Adhesins mediate bacterial attachment to and further
colonization of host tissues, but they also act as surface
antigens (Bayliss et al., 2001). The opacity proteins Opa
of Neisseria spp. constitute a family of closely related, but
size-variable outer membrane proteins, which enhance
the adherence to epithelial, leukocyte, and phagocytic cells
(reviewed in Sadarangani et al., 2011). They do not only
determine host and tissue specificity, but also facilitate
efficient cellular invasion (Carbonnelle et al., 2009).
Neisseria spp. strains have 3–11 opa genes whose phase-variable expression is modulated by a pentanucleotide TR
tract (CTCTT) in their coding regions or by intergenic
recombination. As a result, a vast array of Opa variants
can be generated to confer differential molecular specificities, allowing Neisseria both to alter its tissue tropism and
to escape the host immune system. In fact, it has been
argued that the occurrence of different arrays of Opa
variants in clinical (disease) and carriage (nondisease) isolates of N. meningitidis may be the result of immune
selection pressure (Callaghan et al., 2006, 2008; Sadarangani et al., 2011).
In the same vein, adhesins of H. pylori are categorized
into two main subgroups, Hop (Helicobacter outer membrane porins) and Hor (Hop-related). For some, but not
all of these adhesins, the corresponding host cell receptors
have been identified (reviewed in Backert et al., 2011).
For example, BabA binds to the Lewis B (Leb) antigen
expressed on the gastric mucosa (Ilver et al., 1998), and
SabA binds to glycosphingolipids of host cells that display
a sialyl-dimeric Lewis X (sialyl-Lex) antigen (Mahdavi
et al., 2002). Interestingly, expression of some of the
adhesins (i.e. BabB, HopZ, SabA, OipA) is subject to ON/
OFF phase variation due to the presence of variable dinucleotide (CT) tracts within the corresponding genes. As a
consequence, the expression patterns of adhesins generated by stochastic phase variation can not only affect the
adhesion efficiency, but also alter the tissue tropism
(Yamaoka et al., 2002; Backert et al., 2011). Additionally,
the resulting repertoire of phase-variable adhesins can be
advantageous for H. pylori to escape the host immune
system.
A final example in this category is the Eap protein of
the Gram-positive pathogen Staphylococcus aureus. Eap is
a multifunctional adhesin with a variable TR tract, in
which the size of each repeat unit is not identical (93–
110 aa). A minimum of two TRs in the eap gene are
required for Eap to cause agglutination, adherence and
cellular invasion by S. aureus. Furthermore, as the repeat
copy number increases from 2 to 5, those capacities are
significantly enhanced, suggesting that TR copy number
expansion in eap supports host adaptation of S. aureus
(Hussain et al., 2008).
TRs within iron (heme) acquisition genes
Free iron is typically limited in the host and is usually
sequestered by iron-binding proteins (e.g. hemoglobin,
transferrin, and lactoferrin). Iron acquisition mechanisms
are therefore considered indispensable for bacterial pathogenicity, and pathogens have evolved many different
strategies for adaptation to fluctuating iron concentrations. As such, pathogens are able to extract the iron
from heme groups of iron-binding proteins via surface
receptors. In H. influenzae, a family of hemoglobin-binding and hemoglobin–haptoglobin-binding proteins (Hgp)
is known to mediate heme scavenging (Ren et al., 1998;
Jin et al., 1999; Morton et al., 1999). Individual strains of
H. influenzae have 1–4 hgp genes, of which knockout
analysis has confirmed that they are indispensable
virulence determinants in invasive disease (Seale et al.,
2006). Interestingly, a CCAA tetranucleotide repeat tract
exists in each hgp gene, which is reminiscent of the LPS
biogenesis genes of H. influenzae harboring a CAAT
repeat tract (Moxon et al., 2006; see TRs within LPS
biosynthesis genes). Phase variation induced by repeat
rearrangements within the hgp genes has been observed;
however, its biological significance is still obscure.
A similar case exists in N. meningitidis where iron
acquisition is modulated by the variable poly-G tract in
the hpuA and hmbR genes that are involved in the biosynthesis of two hemoglobin receptors (Lewis et al., 1999;
Richardson & Stojiljkovic, 1999). More recently, it was
revealed that 91% of N. meningitidis pathogenic isolates,
but only 71% of commensal isolates have at least one
receptor in an ON state, suggesting expression of hemoglobin receptor(s) to be important for the systemic spread
of meningococci (Tauseef et al., 2011).
TRs within genes involved in restriction–modification systems
In addition to their role as drivers of genetic variation as
reviewed above, TRs can also drive epigenetic variation
when they affect restriction–modification (R-M) systems.
Currently, TR-dependent phase variations have been documented only in type I and III R-M systems. Type I R-M
systems are generally comprised of three subunits (S, M,
and R), which together form a holoenzyme that has both
methylation and restriction activity. In H. influenzae, the
type I R-M system HindI functions as the main defense
system against the entry of foreign DNA (Glover & Piekarowicz, 1972; Piekarowicz et al., 1974). Phase variation
of HindI is driven by a GACGA pentanucleotide repeat
tract in the methyltransferase subunit gene (hsdM), as
supported by the finding that an hsdM allele with four
repeat units encoding an HsdM protein of normal length
was associated with resistance to phage HP1c1 infection,
while gain or loss of one repeat unit resulted in phage
sensitivity (Zaleski et al., 2005). Interestingly, phase
switching of phage susceptibility could also independently
be conferred by LPS alterations induced by the variable
tetranucleotide repeat tract of the lic2A gene, which confirmed the role of Lic2A-modified LPS as the receptor of
HP1c1 phage (Zaleski et al., 2005; see also TRs within
LPS biosynthesis genes). Moreover, the frequencies of the
phase variations of phage susceptibility described above
are affected by Dam (deoxyadenosine methyltransferase)
activity, although the mechanism is not clear yet (Zaleski
et al., 2005).
Another example was found in the NgoAV type I R-M
system of N. gonorrhoeae, which is encoded by four genes:
hsdMNgoAV, hsdRNgoAV, hsdSNgoAV1, and hsdSNgoAV2. The
product of the hsdSNgoAV1 is responsible for the specific
recognition of the target site, whereas hsdSNgoAV2 is
nonfunctional. It was postulated that hsdSNgoAV1 and
hsdSNgoAV2 are actually truncated proteins derived from
an integral hsdS locus that has become interrupted by a
frameshift mutation resulting in the formation of a stop
codon between hsdSNgoAV1 and hsdSNgoAV2 (Piekarowicz
et al., 2001). Interestingly, a recent study uncovered a
variable poly-G tract within the 3′ end of hsdSNgoAV1 in
N. gonorrhoeae, and loss of a guanine in that tract
restores the fusion of the hsdSNgoAV1 and hsdSNgoAV2
genes, resulting in the generation of a new HsdSNgoAVD
protein, responsible for a novel NgoAV R-M system,
termed ‘NgoAVD’. The NgoAVD system has a modified
DNA recognition specificity, thereby conferring an altered
susceptibility to various phages (Adamczyk-Poplawska
et al., 2011).
Type III R-M systems only consist of two subunits, the
methyltransferase (encoded by a mod gene) and
restriction endonuclease (encoded by a res gene). In
host-adapted pathogens, ON/OFF phase variation of this
system has been reported due to variable TRs (with
repeat unit not being a multiple of three bases) within
either the mod or the res gene (Ryan & Lo, 1999; De Bolle
et al., 2000; de Vries et al., 2002; Srikhanta et al., 2005,
2009, 2011; also see review Srikhanta et al., 2010). Interestingly, TR-induced phase variation in the mod gene has
been shown to affect the expression of a number of genes,
referred to as a phase-variable regulon or phasevarion
(Srikhanta et al., 2005, 2010). In H. influenzae strain Rd,
when mod expression was switched off by a TR rearrangement in a tetrameric repeat tract, nine other genes were
down-regulated, and seven, up-regulated (Srikhanta et al.,
2005). In the same vein, N. gonorrhoeae formed significantly thicker biofilms and thus may indirectly benefit
from an increased resistance to external stresses, when its
mod was in the OFF state. Additionally, a mod-ON
phenotype resulted in an increased ability to associate
with human cervical epithelial (pex) cells, whereas the
mod-OFF configuration enhanced the ability to invade
and survive within pex cells following invasion (Srikhanta
et al., 2009). Further studies will be needed to clarify
whether these phenotypes directly relate to mod deficiency
itself or to altered expression of one or more members of
the mod phasevarion.
Interestingly, most N. meningitidis and N. gonorrhoeae
strains have a second phase-variable methyltransferase
gene (Srikhanta et al., 2009), and some, even a third
(Seib et al., 2011), and it has been anticipated that the
combinatorial use of different phasevarions may contribute to further phenotypic variability (Seib et al., 2011).
Furthermore, Fox et al. (2007) suggested that the
apparent evolution of this type III R-M system into an
epigenetic mechanism for controlling gene expression has
resulted in loss of the DNA restriction function in some
strains. In the latter context, it is also interesting to note
the existence of solitary type IV restriction endonucleases
that have no cognate methyltransferase. Because these
peculiar enzymes display a specificity for methylated
DNA, it has been anticipated that they might function to
ward off the lateral acquisition of methyltransferases that
might affect the cell’s epigenetic regulation (Fukuda et al.,
2008; Tesfazgi Mebrhatu et al., 2011). As such, the phase
variability of methyltransferase functions might serve to
prevent detection of their activity during their establishment in a new host (Tesfazgi Mebrhatu et al., 2011).
TRs within transcription activator-like effectors
Gram-negative plant-pathogenic bacteria of the genus
Xanthomonas can infect a broad spectrum of plants. A
common feature of their infection process is the injection
of virulence proteins (termed effectors) into host cells,
mostly by means of a type III secretion system. The type
III effectors are currently classified into nearly 40 groups
based on sequence similarity and biochemical activity,
and the largest group is the AvrBs3/PthA family, also
known as transcription activator-like effector (TALE)
family (reviewed in White et al., 2009). The most conspicuous feature of TALE genes is the variation of their
central domain, mostly consisting of 15.5–19.5 nearly
identical TRs with a 102-bp unit (Gürlebeck et al., 2006;
Mak et al., 2013; Fig. 6).
A recent study revealed that AvrBs3 can determine
bacterial fitness in plants during infection (Kay et al.,
2007). A set of upa (up-regulated by AvrBs3) genes, among them upa20, the key regulator of the plant cell hypertrophy phenotype, has been identified as targets of AvrBs3 in
pepper plants. AvrBs3 acts as a transcription factor by
binding to a conserved promoter element (UPA box) of
upa20, resulting in up-regulation. Binding was shown to be
mediated by the TR region of AvrBs3, suggesting that the
repeats act as a DNA-binding motif (Kay et al., 2007).
More specifically, it was found that the number of base
pairs of the UPA box closely matches the TR copy number
in AvrBs3.
[Figure 6, panel (a): schematic of AvrBs3 between its N and C termini, with labelled domains: TTSS secretion signal; DNA-binding TR domain (repeat units 0, 1 ... 17.5); nuclear localization signal; transcriptional activation domain. RVDs of the successive repeats: HD NG NS NG NI NI NI HD HD NG NS NS HD HD HD NG HD NG. Sequence of one 34-aa repeat, with the RVD at positions 12/13: LTPEQVVAIASHDGGKQALETVQRLLPVLCQAHG. Panel (b): UPA box TATATAAACCTNNCCCTCT, upstream of the UPA gene.]
Fig. 6. Domain organization and molecular function of Xanthomonas TALE AvrBs3. (a) The AvrBs3 functional domains shown include a type
three secretion system (TTSS) signal sequence (dark blue), a central DNA-binding TR domain, a nuclear localization signal (purple), and a
transcriptional activation domain (olive green). The TR domain in this case comprises 17.5 imperfect repeat units, each of which consists of 34
amino acids (aa). The unit numbered zero (red, dashed rectangle) is not a true repeat because it has a different aa sequence, but it also
contributes to DNA binding. Each repeat binds to a base in the target sequence, and the binding specificity of the repeats is determined by aa
12 and 13 (known as RVD) and displayed with repeat-specific colors. The complete aa sequence of one repeat (no. 9) is shown with its RVD
highlighted in green. (b) After injection into the plant cell by the bacterial type three secretion system, AvrBs3 is targeted to the nucleus and will
bind with its TR domain to a specific DNA sequence known as UPA box. The consensus UPA box matches closely the binding specificity of the TR
region of AvrBs3 as determined by the different TR units and their RVD (aa 12 and 13). AvrBs3 binding activates transcription of several UPA
genes (adapted from Mak et al., 2013).
An elegant study demonstrated that each repeat
unit of AvrBs3 binds to one base pair of the UPA box and
that the base recognition specificity of each repeat is determined by two hypervariable amino acids (the 12th and
13th amino acids of each repeat unit), known as repeat variable di-residues (RVDs; Boch et al., 2009). Moreover, a
minimum of 6.5 TRs in AvrBs3 are necessary for activating
upa gene expression (Boch et al., 2009; Moscou & Bogdanove, 2009; Scholze & Boch, 2011; Fig. 6). Most recently,
the structural basis for this kind of sequence-specific recognition has been uncovered by crystallographic analysis of
an artificially engineered TAL effector bound to its
target DNA (Deng et al., 2012). AvrBs3-dependent modulation of plant signaling pathways causes enlargement of
mesophyll cells and hypertrophy of the infected tissue,
which might help the bacteria to proliferate and escape
from infection sites to facilitate bacterial spreading (Kay
et al., 2007; Boch & Bonas, 2010). Consequently, TR variations in the avrBs3 gene preclude activation of the UPA
box, causing a failure to induce the hypersensitive
response in plants (Kay et al., 2007).
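The one-repeat-to-one-base recognition described above can be illustrated with a minimal sketch. It decodes the AvrBs3 RVD string from Fig. 6 into a predicted target and compares it with the consensus UPA box from the same figure. The RVD-to-base table is a simplified summary of commonly reported preferences (HD for C, NG for T, NI for A, NN for G, NS treated as non-specific). That table, the leading T assumed for repeat 0, and the helper names are assumptions of this example rather than statements from the text.

```python
# Simplified RVD-to-base preferences (assumption: one preferred base per RVD;
# "N" marks an RVD treated here as non-specific).
RVD_TO_BASE = {"HD": "C", "NG": "T", "NI": "A", "NN": "G", "NS": "N"}

# RVDs of AvrBs3 repeats 1 to 17.5 and the consensus UPA box, as given in Fig. 6.
AVRBS3_RVDS = "HD NG NS NG NI NI NI HD HD NG NS NS HD HD HD NG HD NG".split()
UPA_BOX = "TATATAAACCTNNCCCTCT"


def predict_target(rvds, leading_base="T"):
    """Return the predicted binding site: one base per repeat, preceded by
    the base assumed (hypothetically) to be contacted by repeat 0."""
    return leading_base + "".join(RVD_TO_BASE[rvd] for rvd in rvds)


def count_compatible(predicted, observed):
    """Count positions where the two sequences agree; 'N' agrees with anything."""
    return sum(p == o or "N" in (p, o) for p, o in zip(predicted, observed))


predicted = predict_target(AVRBS3_RVDS)
print(predicted)
print(count_compatible(predicted, UPA_BOX), "of", len(UPA_BOX), "positions compatible")
```

Under these assumptions the predicted site has the same length as the UPA box (19 positions) and is compatible with the consensus at 18 of them, in line with the close match between UPA box length and TR copy number noted above.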
While the above studies highlight the biological importance of TALEs as major virulence determinants, TALEs
also provide interesting avenues for biotechnological applications. As such, it will be possible to engineer broader
pathogen resistance in crops by combining several UPA
boxes into the promoter of plant resistance genes like Bs3,
which will render these transgenic plants resistant to infection by bacteria delivering matching TAL effectors (Boch
& Bonas, 2010; Scholze & Boch, 2011). Alternatively, the
TR region of TAL effectors can be tailored to recognize
and bind to a predefined DNA sequence, resulting in activation of the expression of downstream target genes (Morbitzer et al., 2010). Moreover, engineered TAL effectors
can be fused with endonuclease domains to generate TAL
effector nucleases that can introduce cuts or double-strand
breaks in or near specific sequences on the chromosome,
targeting these loci for mutagenesis or recombinational
repair and gene therapy (Bogdanove & Voytas, 2011;
Muñoz Bodnar et al., 2013).
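Tailoring a TR region to a predefined sequence amounts to reading the recognition code in reverse. The following is a minimal sketch, assuming one commonly used RVD per base (NI for A, HD for C, NN for G, NG for T). This table and the function name design_rvd_array are illustrative assumptions, not a published design protocol, and real designs face additional constraints that are not modelled here.

```python
# One RVD per target base (assumption: a simplified, commonly cited choice).
BASE_TO_RVD = {"A": "NI", "C": "HD", "G": "NN", "T": "NG"}


def design_rvd_array(target):
    """Return the list of RVDs, one per base, for a DNA target sequence."""
    target = target.upper()
    unknown = set(target) - set(BASE_TO_RVD)
    if not target or unknown:
        raise ValueError(f"target must be a non-empty A/C/G/T string, got {target!r}")
    return [BASE_TO_RVD[base] for base in target]


# Hypothetical 8-bp target, chosen only for illustration.
print(" ".join(design_rvd_array("CTAAACCT")))
```

Each RVD in the returned list corresponds to one repeat unit, so the length of the engineered TR region grows with the length of the chosen target.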
TRs within stress response genes
Regulation of stress response genes is one of the most
common strategies employed by bacteria to cope with
stresses. TRs have been identified in a number of stress
response genes (Rocha et al., 2002), and some studies
have addressed the role of these repeats in the modulation of a stress response. For example, the gene encoding
the CtsR regulator (class III stress gene repressor) of
Listeria monocytogenes carries a triplet repeat (GGT) tract
with three copies. Stochastic deletion of one triplet in the
ctsR gene results in an inactive CtsR repressor, leading to
expression of the clp genes (Karatzas et al., 2003, 2005).
This alteration confers increased resistance to high hydrostatic pressure, heat, acid, and H2O2, but attenuates virulence in L. monocytogenes (Karatzas & Bennik, 2002).
Another example is mutL, which is involved in mismatch repair in Salmonella Typhimurium and E. coli. The
functional allele of mutL carries a trimeric hexanucleotide
repeat in the region encoding the ATP-binding pocket of
the protein. Spontaneous loss of one repeat unit resulting
in a mutator phenotype due to MutL deficiency has been
observed in long-term cultures of both E. coli and
S. Typhimurium (Shaver & Sniegowski, 2003; Chen et al.,
2010). Expansion of this TR region from 3 to 4 units and
from 2 to 3 units has also been observed and, in the
latter case, caused reversion of the mutator phenotype
(Shaver & Sniegowski, 2003; Chen et al., 2010; Le Bars
et al., 2013). This genetic switching of MutL may serve as
a strategy to control the balance between genetic stability
and mutability and thus serve as an element controlling
bacterial evolution. Interestingly, not only mutL, but also
many other genes (i.e. mutT, mutY, mutS, dinJ, and ruvC)
involved in DNA repair of E. coli harbor SSRs, further
underscoring the putative regulatory role of TRs in stress
response (Rocha et al., 2002).
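Whether a change in repeat copy number preserves the reading frame follows from simple arithmetic on the unit length. The sketch below is an illustration under the assumption that the repeat lies within a coding sequence and that only whole units are gained or lost; the function name is hypothetical.

```python
def copy_number_change_effect(unit_len_bp, delta_units):
    """Classify a gain (+) or loss (-) of whole repeat units in a coding region.

    Returns ("unchanged", 0), ("in-frame", codons gained or lost) or
    ("frameshift", None).
    """
    delta_bp = unit_len_bp * delta_units
    if delta_bp == 0:
        return ("unchanged", 0)
    if delta_bp % 3 == 0:
        return ("in-frame", delta_bp // 3)
    return ("frameshift", None)


# Loss of one unit from a triplet repeat (as in ctsR) or from a
# hexanucleotide repeat (as in mutL) keeps the reading frame.
print(copy_number_change_effect(3, -1))
print(copy_number_change_effect(6, -1))
# A tetranucleotide repeat shifts the frame unless units change in multiples of 3.
print(copy_number_change_effect(4, +1))
print(copy_number_change_effect(4, +3))
```

On this arithmetic, the single-unit changes described for ctsR and mutL remove or add one or two codons without shifting the frame, whereas a single-unit change in a tetranucleotide tract does shift it.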
Some membrane proteins are essential for membrane
integrity and as such for tolerance to a variety of toxic
chemicals and other stresses. One such protein, which is present in many Gram-negative bacteria, is TolA, which harbors a variable TR tract composed of 8–16 imperfect copies
of a 15- to 18-bp repeat unit in E. coli (Levengood et al.,
1991; Zhou et al., 2012b). TR variations in TolA occurred
at a frequency of at least 6.9 × 10⁻⁵ in a clonal wild-type
population of E. coli MG1655 and were shown to modulate
stress tolerance, with the most pronounced TR-dependent
phenotype being deoxycholic acid tolerance (Zhou et al.,
2012a). However, the precise molecular mechanism underlying this phenotypic variation remains unclear.
A peculiar case is the triplet repeat (TCT) in E. coli
ahpC, where expansion of the TR tract by one unit converts the AhpC protein from a peroxidase into a disulfide
reductase, as demonstrated by the ability of this newly
acquired enzyme activity to restore normal growth of a
mutant lacking thioredoxin and glutathione reductase
(Ritz et al., 2001). To our knowledge, this is the only
example of an intragenic TR variation generating a truly
novel function in bacteria.
Conclusions and outlook: TRs and
bacterial evolution
The generation of mutations is the basis of evolution in
bacteria and all other living organisms. Under changing
environmental conditions, the evolution of better adapted
descendants may be essential for survival of the species,
and an increased mutation rate obviously could confer
higher survival chances. On the other hand, the generation of random mutations all over the genome also causes
an important burden because most mutations are deleterious. Therefore, to be successful, bacteria will have to
maintain a balance between genome plasticity and stability. One way to do this is to control the spatial distribution of mutations over the genome by evolving highly
mutable sequences that are associated with loci (also
known as contingency loci) that are needed for a flexible
response to environmental conditions or stresses, especially those that cannot be detected by conventional
bacterial sensors (e.g. phages or host receptors; Fonville
et al., 2011). TRs and their variability provide a paradigm
for this regulatory strategy of adaptability.
The formation of TRs is suggested to be a random process based on replication slippage (Levinson & Gutman,
1987); however, only some of the formed TRs are
believed to be maintained during evolution. Generally,
variable TRs tend to localize in flexible genes involved in
the biosynthesis of surface structures and with a function
in adhesion and (for pathogens) invasion, although they
are occasionally also found in genes encoding critical cellular functions like DNA replication (Moxon et al., 2006;
Guo & Mrazek, 2008). The association between TRs and
cell surface structures is suggested to allow populations to
anticipate changes in the environment in order to
enhance their survival rate (Moxon et al., 1994). This is
particularly common and critical for pathogens such as
H. influenzae and H. pylori with limited genetic information (i.e. reduced genome) to cope with complex environments (Razin et al., 1998).
As such, TR-dependent phase variation can be regarded
as a strategy for bacterial adaptation that is complementary to conventional mutations (i.e. single nucleotide
polymorphism), but has some distinct features. First, the
hypermutability (typical mutation rates of 10⁻²–10⁻⁵ per
generation) can be advantageous for adaptation on a short
time scale, and it has been demonstrated using a theoretical model that TRs mediating stochastic switching can
evolve and be maintained under a wide range of alternating selection regimens (Palmer et al., 2013). Second, TR
rearrangements, both local and combinatorial, can effectuate alterations at both the transcriptional and the
translational levels, resulting in either binary switching
(‘ON’ and ‘OFF’) or gradual control, and this facilitates
subtle adaptation of bacterial fitness under stress. Further,
TR-dependent mechanisms operate not only in classic
genetic, but also in epigenetic pathways and from the single locus to the more global level of phasevarions, suggesting the power of TR-mediated regulation in
bacterial adaptation.
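To give a feel for what switching rates of 10⁻² to 10⁻⁵ per generation imply, the following sketch computes the expected fraction of a clonal population that has switched state after a given number of generations. It assumes a two-state (ON/OFF) locus, equal forward and reverse rates, and no selection or growth differences. These simplifications, and the function names, are assumptions of this example and not the model of Palmer et al. (2013).

```python
def fraction_switched(mu, generations):
    """Closed form for a symmetric two-state chain that starts fully ON:
    expected fraction of OFF cells after the given number of generations."""
    return 0.5 * (1.0 - (1.0 - 2.0 * mu) ** generations)


def fraction_switched_iterative(mu, generations):
    """Same quantity, obtained by applying the per-generation update directly."""
    off = 0.0
    for _ in range(generations):
        # OFF cells that stay OFF plus ON cells that switch to OFF.
        off = off * (1.0 - mu) + (1.0 - off) * mu
    return off


for mu in (1e-2, 1e-5):
    for g in (10, 100, 1000):
        print(f"mu={mu:g}, generations={g}: {fraction_switched(mu, g):.6f}")
```

In this simplified setting a locus switching at the upper end of the range approaches an even ON/OFF mixture within a few hundred generations, whereas one at the lower end remains almost entirely in its starting state over the same period.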
Although rapid TR-dependent phase switching facilitates bacterial adaptation on a short time scale, it is probably not necessary for bacteria to maintain hypermutable
regions in the absence of selection pressures, because of
the associated costs in DNA replication fidelity and metabolic energy. An interesting, but mostly unanswered question is how bacteria optimize the mutation rate of TRs in
phase-variable genes in a fluctuating environment. Some
intrinsic features of the TR sequence (i.e. conservation
and copy number of the repeat unit) and cellular functions
(i.e. DNA replication, recombination, and repair) are
known to affect the frequency of TR-dependent phase
variation and may be used for this purpose (Bayliss,
2009). On the other hand, some studies have shown a
modulation of the TR rearrangement frequency by environmental stress (Kanbashi et al., 1997; Jackson & Loeb,
2000; Rasmussen & Björk, 2001; Srikhanta et al., 2009;
Cooley et al., 2010). However, in general, it remains
unclear how environmental signals are transduced to
modulate the switching rate of TR-containing genes.
In conclusion, TRs confer local sequence flexibility in
bacterial genomes, thereby allowing targeted mutation
and evolution. The genotypic and phenotypic variations
modulated by TRs are rapid and coordinative and support the generation of substantial biological diversity on a
short time scale. In spite of a certain metabolic cost, ‘prepared genomes’ (i.e. with TR-based contingency loci;
Caporale, 1999) have a higher adaptability and thus a fitness advantage in frequently fluctuating environments.
Acknowledgements
This work was supported by the Research Foundation –
Flanders (FWO – Vlaanderen; Research project G.0289.06)
and by the KU Leuven Research Fund (project METH/07/
03).
References
Ackermann M & Chao L (2006) DNA sequences shaped by
selection for stability. PLoS Genet 2: e22.
Adamczyk-Poplawska M, Lower M & Piekarowicz A (2011)
Deletion of one nucleotide within the homonucleotide tract
present in the hsdS gene alters the DNA sequence specificity
of type I restriction-modification system NgoAV. J Bacteriol
193: 6750–6759.
Aertsen A & Michiels CW (2004) Stress and how bacteria cope
with death and survival. Crit Rev Microbiol 30: 263–273.
Aertsen A & Michiels CW (2005) Diversify or die: generation
of diversity in response to stress. Crit Rev Microbiol 31:
69–78.
Appelmelk BJ, Martin SL, Monteiro MA et al. (1999) Phase
variation in Helicobacter pylori lipopolysaccharide due to
changes in the lengths of poly(C) tracts in
alpha3-fucosyltransferase genes. Infect Immun 67:
5361–5366.
Attia AS & Hansen EJ (2006) A conserved tetranucleotide
repeat is necessary for wild-type expression of the Moraxella
catarrhalis UspA2 protein. J Bacteriol 188: 7840–7852.
Backert S, Clyne M & Tegtmeyer N (2011) Molecular
mechanisms of gastric epithelial cell adhesion and injection
of CagA by Helicobacter pylori. Cell Commun Signal 9: 28.
Barenkamp SJ & Leininger E (1992) Cloning, expression,
and DNA sequence analysis of genes encoding
nontypeable Haemophilus influenzae high-molecular-weight
surface-exposed proteins related to filamentous
hemagglutinin of Bordetella pertussis. Infect Immun 60:
1302–1313.
Bartley SN, Tzeng YL, Heel K et al. (2013) Attachment and
invasion of Neisseria meningitidis to host cells is related to
surface hydrophobicity, bacterial cell size and capsule. PLoS
ONE 8: e55798.
Bayliss CD (2009) Determinants of phase variation rate and
the fitness implications of differing rates for bacterial
pathogens and commensals. FEMS Microbiol Rev 33:
504–520.
Bayliss CD, Field D & Moxon ER (2001) The simple sequence
contingency loci of Haemophilus influenzae and Neisseria
meningitidis. J Clin Invest 107: 657–662.
Bayliss CD, Hoe JC, Makepeace K, Martin P, Hood DW &
Moxon ER (2008) Neisseria meningitidis escape from the
bactericidal activity of a monoclonal antibody is mediated
by phase variation of lgtG and enhanced by a mutator
phenotype. Infect Immun 76: 5038–5048.
Bayliss CD, Bidmos FA, Anjum A et al. (2012) Phase variable
genes of Campylobacter jejuni exhibit high mutation rates
and specific mutational patterns but mutability is not the
major determinant of population structure during host
colonization. Nucleic Acids Res 40: 5876–5889.
Benson G (1999) Tandem repeats finder: a program to analyze
DNA sequences. Nucleic Acids Res 27: 573–580.
Bergfeld AK, Claus H, Vogel U & Mühlenhoff M (2007)
Biochemical characterization of the polysialic acid specific
O-acetyltransferase NeuO of Escherichia coli K1. J Biol Chem
282: 22217–22227.
Berrington AW, Tan YC, Srikhanta Y, Kuipers B, van der Ley
P, Peak IR & Jennings MP (2002) Phase variation in
meningococcal lipooligosaccharide biosynthesis genes. FEMS
Immunol Med Microbiol 34: 267–275.
Bhugra B, Voelker LL, Zou N, Yu H & Dybvig K (1995)
Mechanism of antigenic variation in Mycoplasma pulmonis:
interwoven, site-specific DNA inversions. Mol Microbiol 18:
703–714.
Bi X & Liu LF (1996) A replicational model for DNA
recombination between direct repeats. J Mol Biol 256:
849–858.
Bichara M, Wagner J & Lambert IB (2006) Mechanisms of
tandem repeat instability in bacteria. Mutat Res 598:
144–163.
Boch J & Bonas U (2010) Xanthomonas AvrBs3 family-type III
effectors: discovery and function. Annu Rev Phytopathol 48:
419–436.
Boch J, Scholze H, Schornack S, Landgraf A, Hahn S, Kay S,
Lahaye T, Nickstadt A & Bonas U (2009) Breaking the code
of DNA binding specificity of TAL-type III effectors. Science
326: 1509–1512.
Bogdanove AJ & Voytas DF (2011) TAL effectors:
customizable proteins for DNA targeting. Science 333:
1843–1846.
Brinkmann B, Klintschar M, Neuhuber F, Hühne J & Rolf B
(1998) Mutation rate in human microsatellites: influence of
the structure and length of the tandem repeat. Am J Hum
Genet 62: 1408–1415.
Bzymek M & Lovett ST (2001) Instability of repetitive DNA
sequences: the role of replication in multiple mechanisms. P
Natl Acad Sci USA 98: 8319–8325.
Callaghan MJ, Jolley KA & Maiden MC (2006)
Opacity-associated adhesin repertoire in hyperinvasive
Neisseria meningitidis. Infect Immun 74: 5085–5094.
Callaghan MJ, Buckee CO, Jolley KA, Kriz P, Maiden MC &
Gupta S (2008) The effect of immune selection on the
structure of the meningococcal Opa protein repertoire. PLoS
Pathog 4: e1000020.
Caporale LH (1999) Chance favors the prepared genome. Ann
NY Acad Sci 870: 1–21.
Carbonnelle E, Hill DJ, Morand P, Griffiths NJ, Bourdoulous
S, Murillo I, Nassif X & Virji M (2009) Meningococcal
interactions with the host. Vaccine 27(suppl 2): B78–B89.
Carson SD, Stone B, Beucher M, Fu J & Sparling PF (2000)
Phase variation of the gonococcal siderophore receptor
FetA. Mol Microbiol 36: 585–593.
Chen F, Liu WQ, Eisenstark A, Johnston RN, Liu GR & Liu
SL (2010) Multiple genetic switches spontaneously
modulating bacterial mutability. BMC Evol Biol 10: 277.
Chiou CS (2010) Multilocus variable-number tandem repeat
analysis as a molecular tool for subtyping and phylogenetic
analysis of bacterial pathogens. Expert Rev Mol Diagn 10:
5–7.
Cholon DM, Cutter D, Richardson SK, Sethi S, Murphy TF,
Look DC & St Geme III JW (2008) Serial isolates of
persistent Haemophilus influenzae in patients with chronic
obstructive pulmonary disease express diminishing
quantities of the HMW1 and HMW2 adhesins. Infect
Immun 76: 4463–4468.
Choudhary OP & Trivedi S (2010) Microsatellite or simple
sequence repeat (SSR) instability depends on repeat
characteristics during replication and repair. J Cell Mol Biol
8: 21–34.
Coenye T & Vandamme P (2005) Characterization of
mononucleotide repeats in sequenced prokaryotic genomes.
DNA Res 12: 221–233.
Coil DA & Anne J (2010) The role of fimV and the
importance of its tandem repeat copy number in twitching
motility, pigment production, and morphology in Legionella
pneumophila. Arch Microbiol 192: 625–631.
Cooley MB, Carychao D, Nguyen K, Whitehand L & Mandrell
R (2010) Effects of environmental stress on stability of
tandem repeats in Escherichia coli O157:H7. Appl Environ
Microbiol 76: 3398–3400.
Dawid S, Barenkamp SJ & St Geme III JW (1999) Variation in
expression of the Haemophilus influenzae HMW adhesins: a
prokaryotic system reminiscent of eukaryotes. P Natl Acad
Sci USA 96: 1077–1082.
De Bolle X, Bayliss CD, Field D, van de Ven T, Saunders NJ,
Hood DW & Moxon ER (2000) The length of a
tetranucleotide repeat tract in Haemophilus influenzae
determines the phase variation rate of a gene with
homology to type III DNA methyltransferases. Mol
Microbiol 35: 211–222.
de Vries N, Duinsbergen D, Kuipers EJ, Pot RG, Wiesenekker
P, Penn CW, van Vliet AH, Vandenbroucke-Grauls CM &
Kusters JG (2002) Transcriptional phase variation of a type
III restriction-modification system in Helicobacter pylori.
J Bacteriol 184: 6615–6623.
Deng D, Yan C, Pan X, Mahfouz M, Wang J, Zhu JK, Shi Y
& Yan N (2012) Structural basis for sequence-specific
recognition of DNA by TAL effectors. Science 335:
720–723.
Deszo EL, Steenbergen SM, Freedberg DI & Vimr ER (2005)
Escherichia coli K1 polysialic acid O-acetyltransferase gene,
neuO, and the mechanism of capsule form variation
involving a mobile contingency locus. P Natl Acad Sci USA
102: 5564–5569.
Dettman JR & Taylor JW (2004) Mutation and evolution of
microsatellite loci in Neurospora. Genetics 168: 1231–1248.
Dixon K, Bayliss CD, Makepeace K, Moxon ER & Hood DW
(2007) Identification of the functional initiation codons of a
phase-variable gene of Haemophilus influenzae, lic2A, with
the potential for differential expression. J Bacteriol 189:
511–521.
Ecevit IZ, McCrea KW, Pettigrew MM, Sen A, Marrs CF &
Gilsdorf JR (2004) Prevalence of the hifBC, hmw1A, hmw2A,
hmwC, and hia genes in Haemophilus influenzae isolates.
J Clin Microbiol 42: 3065–3072.
Eckert KA & Hile SE (2009) Every microsatellite is different:
intrinsic DNA features dictate mutagenesis of common
microsatellites present in the human genome. Mol Carcinog
48: 379–388.
Edwards YJ, Elgar G, Clark MS & Bishop MJ (1998) The
identification and characterization of microsatellites in the
compact genome of the Japanese pufferfish, Fugu rubripes:
perspectives in functional and comparative genomic
analyses. J Mol Biol 278: 843–854.
Erwin AL, Bonthuis PJ, Geelhood JL, Nelson KL, McCrea KW,
Gilsdorf JR & Smith AL (2006) Heterogeneity in tandem
octanucleotides within Haemophilus influenzae
lipopolysaccharide biosynthetic gene losA affects serum
resistance. Infect Immun 74: 3408–3414.
Fonville NC, Ward RM & Mittelman D (2011) Stress-induced
modulators of repeat instability and genome evolution.
J Mol Microbiol Biotechnol 21: 36–44.
Foster PL (2005) Stress responses and genetic variation in
bacteria. Mutat Res 569: 3–11.
Foster PL (2007) Stress-induced mutagenesis in bacteria. Crit
Rev Biochem Mol Biol 42: 373–397.
Fox KL, Yildirim HH, Deadman ME, Schweda EK, Moxon ER
& Hood DW (2005) Novel lipopolysaccharide biosynthetic
genes containing tetranucleotide repeats in Haemophilus
influenzae, identification of a gene for adding O-acetyl
groups. Mol Microbiol 58: 207–216.
Fox KL, Srikhanta YN & Jennings MP (2007) Phase variable
type III restriction-modification systems of host-adapted
bacterial pathogens. Mol Microbiol 65: 1375–1379.
Fukuda E, Kaminska KH, Bujnicki JM & Kobayashi I (2008)
Cell death upon epigenetic genome methylation: a novel
function of methyl-specific deoxyribonucleases. Genome Biol
9: R163.
Funchain P, Yeung A, Stewart JL, Lin R, Slupska MM & Miller
JH (2000) The consequences of growth of a mutator strain of
Escherichia coli as measured by loss of function among
multiple gene targets and loss of fitness. Genetics 154: 959–970.
Gawel D, Jonczyk P, Bialoskorska M, Schaaper RM &
Fijalkowska IJ (2002) Asymmetry of frameshift mutagenesis
during leading and lagging-strand replication in Escherichia
coli. Mutat Res 501: 129–136.
Gemayel R, Vinces MD, Legendre M & Verstrepen KJ (2010)
Variable tandem repeats accelerate evolution of coding and
regulatory sequences. Annu Rev Genet 44: 445–477.
Gibbons JG & Rokas A (2009) Comparative and functional
characterization of intragenic tandem repeats in 10
Aspergillus genomes. Mol Biol Evol 26: 591–602.
Glew MD, Baseggio N, Markham PF, Browning GF & Walker
ID (1998) Expression of the pMGA genes of Mycoplasma
gallisepticum is controlled by variation in the GAA
trinucleotide repeat lengths within the 5′ noncoding regions.
Infect Immun 66: 5833–5841.
Glew MD, Browning GF, Markham PF & Walker ID (2000)
pMGA phenotypic variation in Mycoplasma gallisepticum
occurs in vivo and is mediated by trinucleotide repeat length
variation. Infect Immun 68: 6027–6033.
Glover SW & Piekarowicz A (1972) Host specificity of DNA in
Haemophilus influenzae: restriction and modification in
strain Rd. Biochem Biophys Res Commun 46: 1610–1617.
Goldstein DB & Clark AG (1995) Microsatellite variation in
North American populations of Drosophila melanogaster.
Nucleic Acids Res 23: 3882–3886.
Goodwin AC, Weinberger DM, Ford CB, Nelson JC, Snider
JD, Hall JD, Paules CI, Peek Jr RM & Forsyth MH (2008)
Expression of the Helicobacter pylori adhesin SabA is
controlled via phase variation and the ArsRS signal
transduction system. Microbiology 154: 2231–2240.
Gravekamp C, Horensky DS, Michel JL & Madoff LC (1996)
Variation in repeat number within the alpha C protein of
group B streptococci alters antigenicity and protective
epitopes. Infect Immun 64: 3576–3583.
Gravekamp C, Kasper DL, Michel JL, Kling DE, Carey V &
Madoff LC (1997) Immunogenicity and protective efficacy
of the alpha C protein of group B streptococci are inversely
related to the number of repeats. Infect Immun 65: 5216–5221.
Griffin R, Cox AD, Makepeace K, Richards JC, Moxon ER &
Hood DW (2003) The role of lex2 in lipopolysaccharide
biosynthesis in Haemophilus influenzae strains RM7004 and
RM153. Microbiology 149: 3165–3175.
Guo X & Mrazek J (2008) Long simple sequence repeats in
host-adapted pathogens localize near genes encoding
antigens, housekeeping genes, and pseudogenes. J Mol Evol
67: 497–509.
Gur-Arie R, Cohen CJ, Eitan Y, Shelef L, Hallerman EM &
Kashi Y (2000) Simple sequence repeats in Escherichia coli:
abundance, distribution, composition, and polymorphism.
Genome Res 10: 62–71.
Gürlebeck D, Thieme F & Bonas U (2006) Type III effector
proteins from the plant pathogen Xanthomonas and their
role in the interaction with the host plant. J Plant Physiol
163: 233–255.
Hammerschmidt S, Hilse R, van Putten JP, Gerardy-Schahn R,
Unkmeir A & Frosch M (1996a) Modulation of cell surface
sialic acid expression in Neisseria meningitidis via a
transposable genetic element. EMBO J 15: 192–198.
Hammerschmidt S, Müller A, Sillmann H, Mühlenhoff M,
Borrow R, Fox A, van Putten J, Zollinger WD,
Gerardy-Schahn R & Frosch M (1996b) Capsule phase
variation in Neisseria meningitidis serogroup B by
slipped-strand mispairing in the polysialyltransferase gene
(siaD): correlation with bacterial invasion and the
outbreak of meningococcal disease. Mol Microbiol 20:
1211–1220.
Hannan A (2010) TRPing up the genome: tandem repeat
polymorphisms as dynamic sources of genetic variability in
health and disease. Discov Med 10: 314–321.
Hebert ML & Wells RD (2005) Roles of double-strand breaks,
nicks, and gaps in stimulating deletions of CTG.CAG
repeats by intramolecular DNA repair. J Mol Biol 353:
961–979.
Hebert ML, Spitz LA & Wells RD (2004) DNA double-strand
breaks induce deletion of CTG.CAG repeats in an
orientation-dependent manner in Escherichia coli. J Mol Biol
336: 655–672.
Herbers K, Conrads-Strauch J & Bonas U (1992)
Race-specificity of plant resistance to bacterial spot disease
determined by repetitive motifs in a bacterial avirulence
protein. Nature 356: 172–174.
Hood DW, Deadman ME, Jennings MP, Bisercic M,
Fleischmann RD, Venter JC & Moxon ER (1996) DNA
repeats identify novel virulence genes in Haemophilus
influenzae. P Natl Acad Sci USA 93: 11121–11125.
Hosking SL, Craig JE & High NJ (1999) Phase variation of
lic1A, lic2A and lic3A in colonization of the nasopharynx,
bloodstream and cerebrospinal fluid by Haemophilus
influenzae type b. Microbiology 145: 3005–3011.
Hussain M, Haggar A, Peters G, Chhatwal GS, Herrmann M,
Flock JI & Sinha B (2008) More than one tandem repeat
domain of the extracellular adherence protein of
Staphylococcus aureus is required for aggregation, adherence,
and host cell invasion but not for leukocyte activation.
Infect Immun 76: 5615–5623.
Ilver D, Arnqvist A, Ogren J et al. (1998) Helicobacter pylori
adhesin binding fucosylated histo-blood group antigens
revealed by retagging. Science 279: 373–377.
Iyer RR, Pluciennik A, Rosche WA, Sinden RR & Wells RD
(2000) DNA polymerase III proofreading mutants enhance
the expansion and deletion of triplet repeat sequences in
Escherichia coli. J Biol Chem 275: 2174–2184.
Jackson AL & Loeb LA (2000) Microsatellite instability
induced by hydrogen peroxide in Escherichia coli. Mutat Res
447: 187–198.
Jansen A, Gemayel R & Verstrepen KJ (2012) Unstable
microsatellite repeats facilitate rapid evolution of coding
and regulatory sequences. Genome Dyn 7: 108–125.
Janulczyk R, Masignani V, Maione D, Tettelin H, Grandi G &
Telford JL (2010) Simple sequence repeats and genome
plasticity in Streptococcus agalactiae. J Bacteriol 192:
3990–4000.
Jennings MP, Srikhanta YN, Moxon ER, Kramer M, Poolman
JT, Kuipers B & van der Ley P (1999) The genetic basis of
the phase variation repertoire of lipopolysaccharide
immunotypes in Neisseria meningitidis. Microbiology 145:
3013–3021.
Jerome JP, Bell JA, Plovanich-Jones AE, Barrick JE, Brown CT
& Mansfield LS (2011) Standing genetic variation in
contingency loci drives the rapid adaptation of
Campylobacter jejuni to a novel host. PLoS ONE 6: e16399.
Jin H, Ren Z, Whitby PW, Morton DJ & Stull TL (1999)
Characterization of hgpA, a gene encoding a haemoglobin/
haemoglobin-haptoglobin-binding protein of Haemophilus
influenzae. Microbiology 145: 905–914.
Jolivet-Gougeon A, Kovacs B, Le Gall-David S, Le Bars H,
Bousarghin L, Bonnaure-Mallet M, Lobel B, Guille F, Soussy
CJ & Tenke P (2011) Bacterial hypermutation: clinical
implications. J Med Microbiol 60: 563–573.
Jonsson AB, Nyberg G & Normark S (1991) Phase variation of
gonococcal pili by frameshift mutation in pilC, a novel gene
for pilus assembly. EMBO J 10: 477–488.
Jonsson AB, Pfeifer J & Normark S (1992) Neisseria
gonorrhoeae PilC expression provides a selective mechanism
for structural diversity of pili. P Natl Acad Sci USA 89:
3204–3208.
Jordan P, Snyder LA & Saunders NJ (2003) Diversity in
coding tandem repeats in related Neisseria spp. BMC
Microbiol 3: 23.
Kajava AV (2012) Tandem repeats in proteins: from sequence
to structure. J Struct Biol 179: 279–288.
Kanbashi K, Wang X, Komura J, Ono T & Yamamoto K
(1997) Frameshifts, base substitutions and minute deletions
constitute X ray-induced mutations in the endogenous tonB
gene of Escherichia coli K12. Mutat Res 385: 259–267.
Karatzas KA & Bennik MH (2002) Characterization of a
Listeria monocytogenes Scott A isolate with high tolerance
towards high hydrostatic pressure. Appl Environ Microbiol
68: 3183–3189.
Karatzas KA, Wouters JA, Gahan CG, Hill C, Abee T &
Bennik MH (2003) The CtsR regulator of Listeria
monocytogenes contains a variant glycine repeat region that
affects piezotolerance, stress resistance, motility and
virulence. Mol Microbiol 49: 1227–1238.
Karatzas KA, Valdramidis VP & Wells-Bennik MH (2005)
Contingency locus in ctsR of Listeria monocytogenes Scott A:
a strategy for occurrence of abundant piezotolerant isolates
within clonal populations. Appl Environ Microbiol 71:
8390–8396.
Kassai-Jager E, Ortutay C, Tóth G, Vellai T & Gaspari Z
(2008) Distribution and evolution of short tandem repeats
in closely related bacterial genomes. Gene 410: 18–25.
Kay S, Hahn S, Marois E, Hause G & Bonas U (2007) A
bacterial effector acts as a plant transcription factor and
induces a cell size regulator. Science 318: 648–651.
Kelkar YD, Strubczewski N, Hile SE, Chiaromonte F, Eckert
KA & Makova KD (2010) What is a microsatellite: a
computational and experimental definition based upon
repeat mutational behavior at A/T and GT/AC repeats.
Genome Biol Evol 2: 620–635.
King MR, Vimr RP, Steenbergen SM, Spanjaard L, Plunkett G
III, Blattner FR & Vimr ER (2007) Escherichia coli
K1-specific bacteriophage CUS-3 distribution and function
in phase-variable capsular polysialic acid O acetylation. J
Bacteriol 189: 6447–6456.
Kornberg A, Bertsch LL, Jackson JF & Khorana HG (1964)
Enzymatic synthesis of deoxyribonucleic acid, XVI.
Oligonucleotides as templates and the mechanism of their
replication. P Natl Acad Sci USA 51: 315–323.
Kusters JG, van Vliet AHM & Kuipers EJ (2006) Pathogenesis of
Helicobacter pylori infection. Clin Microbiol Rev 19: 449–490.
Lafontaine ER, Cope LD, Aebi C, Latimer JL, McCracken GH &
Hansen EJ (2000) The UspA1 protein and a second type of
UspA2 protein mediate adherence of Moraxella catarrhalis to
human epithelial cells in vitro. J Bacteriol 182: 1364–1373.
Lafontaine ER, Wagner NJ & Hansen EJ (2001) Expression of
the Moraxella catarrhalis UspA1 protein undergoes phase
variation and is regulated at the transcriptional level. J
Bacteriol 183: 1540–1551.
Lai Y & Sun F (2003) The relationship between microsatellite
slippage mutation rate and the number of repeat units. Mol
Biol Evol 20: 2123–2131.
Le Bars H, Bousarghin L, Bonnaure-Mallet M &
Jolivet-Gougeon A (2013) Role of a short tandem leucine/
arginine repeat in strong mutator phenotype acquisition in
a clinical isolate of Salmonella Typhimurium. FEMS
Microbiol Lett 338: 101–106.
Leclercq S, Rivals E & Jarne P (2007) Detecting microsatellites
within genomes: significant variation among algorithms.
BMC Bioinformatics 8: 125.
Legendre M, Pochet N, Pak T & Verstrepen KJ (2007)
Sequence-based estimation of minisatellite and microsatellite
repeat variability. Genome Res 17: 1787–1796.
© 2013 Federation of European Microbiological Societies.
Published by John Wiley & Sons Ltd. All rights reserved
K. Zhou et al.
Levengood SK, Beyer WF Jr & Webster RE (1991) TolA: a
membrane protein involved in colicin uptake contains an
extended helical region. P Natl Acad Sci USA 88: 5939–5943.
Levinson G & Gutman GA (1987) Slipped-strand mispairing: a
major mechanism for DNA sequence evolution. Mol Biol
Evol 4: 203–221.
Lewis LA, Gipson M, Hartman K, Ownbey T, Vaughn J &
Dyer DW (1999) Phase variation of HpuAB and HmbR,
two distinct haemoglobin receptors of Neisseria meningitidis
DNM2. Mol Microbiol 32: 977–989.
Lim KG, Kwoh CK, Hsu LY & Wirawan A (2013) Review of
tandem repeat search tools: a systematic approach to
evaluating algorithmic performance. Brief Bioinform 14:
67–81.
Lin WH & Kussell E (2012) Evolutionary pressures on simple
sequence repeats in prokaryotic coding regions. Nucleic
Acids Res 40: 2399–2413.
Lindbäck T, Secic I & Rørvik LM (2011) A contingency locus
in prfA in a Listeria monocytogenes subgroup allows
reactivation of the PrfA virulence regulator during infection
in mice. Appl Environ Microbiol 77: 3478–3483.
Lindstedt BA (2005) Multiple-locus variable number tandem
repeats analysis for genetic fingerprinting of pathogenic
bacteria. Electrophoresis 26: 2567–2582.
Lindstedt BA (2011) Genotyping of selected bacterial
enteropathogens in Norway. Int J Med Microbiol 301: 648–653.
Liu L, Dybvig K, Panangala VS, van Santen VL & French CT
(2000) GAA trinucleotide repeat region regulates M9/pMGA
gene expression in Mycoplasma gallisepticum. Infect Immun
68: 871–876.
Lopes J, Ribeyre C & Nicolas A (2006) Complex
minisatellite rearrangements generated in the total or
partial absence of Rad27/hFEN1 activity occur in a single
generation and are Rad51 and Rad52 dependent. Mol Cell
Biol 26: 6675–6689.
Lysnyansky I, Rosengarten R & Yogev D (1996) Phenotypic
switching of variable surface lipoproteins in Mycoplasma
bovis involves high-frequency chromosomal rearrangements.
J Bacteriol 178: 5395–5401.
Mahdavi J, Sonden B, Hurtig M et al. (2002) Helicobacter
pylori SabA adhesin in persistent infection and chronic
inflammation. Science 297: 573–578.
Mak ANS, Bradley P, Bogdanove AJ & Stoddard BL (2013) Tal
effectors: function, structure, engineering and applications.
Curr Opin Struct Biol 23: 93–99.
Malkova A & Haber JE (2012) Mutations arising during repair
of chromosome breaks. Annu Rev Genet 46: 455–473.
Martin P, van de Ven T, Mouchel N, Jeffries AC, Hood DW &
Moxon ER (2003) Experimentally revised repertoire of
putative contingency loci in Neisseria meningitidis strain
MC58: evidence for a novel mechanism of phase variation.
Mol Microbiol 50: 245–257.
Martin P, Makepeace K, Hill SA, Hood DW & Moxon ER
(2005) Microsatellite instability regulates transcription factor
binding and gene expression. P Natl Acad Sci USA 102:
3800–3804.
FEMS Microbiol Rev 38 (2014) 119–141
Variable tandem repeats and bacterial adaptation
Massey RC & Buckling A (2002) Environmental regulation of
mutation rates at specific sites. Trends Microbiol 10: 580–584.
Merkel A & Gemmell N (2008) Detecting short tandem
repeats from genome data: opening the software black box.
Brief Bioinform 9: 355–366.
Metruccio MM, Pigozzi E, Roncarati D, Berlanda Scorza F,
Norais N, Hill SA, Scarlato V & Delany I (2009) A novel
phase variation mechanism in the meningococcus driven by
a ligand-responsive repressor and differential spacing of
distal promoter elements. PLoS Pathog 5: e1000710.
Miller VL, Taylor RK & Mekalanos JJ (1987) Cholera toxin
transcriptional activator toxR is a transmembrane DNA
binding protein. Cell 48: 271–279.
Morbitzer R, Römer P, Boch J & Lahaye T (2010) Regulation
of selected genome loci using de novo-engineered
transcription activator-like effector (TALE)-type
transcription factors. P Natl Acad Sci USA 107:
21617–21622.
Mordhorst IL, Claus H, Ewers C et al. (2009)
O-acetyltransferase gene neuO is segregated according to
phylogenetic background and contributes to environmental
desiccation resistance in Escherichia coli K1. Environ
Microbiol 11: 3154–3165.
Morton DJ, Whitby PW, Jin H, Ren Z & Stull TL (1999)
Effect of multiple mutations in the hemoglobin- and
hemoglobin-haptoglobin-binding proteins, HgpA, HgpB,
and HgpC, of Haemophilus influenzae type b. Infect Immun
67: 2729–2739.
Moscou MJ & Bogdanove AJ (2009) A simple cipher governs
DNA recognition by TAL effectors. Science 326: 1501.
Moxon ER, Rainey PB, Nowak MA & Lenski RE (1994)
Adaptive evolution of highly mutable loci in pathogenic
bacteria. Curr Biol 4: 24–33.
Moxon R, Bayliss C & Hood D (2006) Bacterial contingency
loci: the role of simple sequence DNA repeats in bacterial
adaptation. Annu Rev Genet 40: 307–333.
Mrazek J (2006) Analysis of distribution indicates diverse
functions of simple sequence repeats in Mycoplasma
genomes. Mol Biol Evol 23: 1370–1385.
Mrazek J, Guo X & Shah A (2007) Simple sequence repeats in
prokaryotic genomes. P Natl Acad Sci USA 104: 8472–8477.
Mudunuri SB & Nagarajaram HA (2007) IMEx: imperfect
microsatellite extractor. Bioinformatics 23: 1181–1187.
Muñoz Bodnar A, Bernal A, Szurek B & López CE (2013) Tell
me a tale of TALEs. Mol Biotechnol 53: 228–235.
Murphy GL, Connell TD, Barritt DS, Koomey M & Cannon
JG (1989) Phase variation of gonococcal protein II:
regulation of gene expression by slipped-strand mispairing
of a repetitive DNA sequence. Cell 56: 539–547.
Nilsson C, Skoglund A, Moran AP, Annuk H, Engstrand L &
Normark S (2006) An enzymatic ruler modulates Lewis
antigen glycosylation of Helicobacter pylori LPS during
persistent infection. P Natl Acad Sci USA 103: 2863–2868.
Nilsson C, Skoglund A, Moran AP, Annuk H, Engstrand L &
Normark S (2008) Lipopolysaccharide diversity evolving in
Helicobacter pylori communities through genetic
modifications in fucosyltransferases. PLoS ONE 3: e3811.
Orsi RH, Bowen BM & Wiedmann M (2010) Homopolymeric
tracts represent a general regulatory mechanism in
prokaryotes. BMC Genomics 11: 102.
Palmer ME, Lipsitch M, Moxon ER & Bayliss CD (2013)
Broad conditions favor the evolution of phase-variable loci.
mBio 4: e00430–12.
Papazisi L, Gorton TS, Kutish G, Markham PF, Browning GF,
Nguyen DK, Swartzell S, Madan A, Mahairas G & Geary SJ
(2003) The complete genome sequence of the avian
pathogen Mycoplasma gallisepticum strain R(low).
Microbiology 149: 2307–2316.
Pâques F, Leung WY & Haber JE (1998) Expansions and
contractions in a tandem repeat induced by double-strand
break repair. Mol Cell Biol 18: 2045–2054.
Pâques F, Richard GF & Haber JE (2001) Expansions and
contractions in 36-bp minisatellites by gene conversion in
yeast. Genetics 158: 155–166.
Park SF, Purdy D & Leach S (2000) Localized reversible
frameshift mutation in the flhA gene confers phase
variability to flagellin gene expression in Campylobacter coli.
J Bacteriol 182: 207–210.
Pearson CE, Nichol Edamura K & Cleary JD (2005) Repeat
instability: mechanisms of dynamic mutations. Nat Rev
Genet 6: 729–742.
Piekarowicz A, Brzeziński R & Kauc L (1974) Host specificity
of DNA in Haemophilus influenzae: the in vivo action of the
restriction endonucleases on phage and bacterial DNA. Acta
Microbiol Pol A 27: 51–65.
Piekarowicz A, Klyz A, Kwiatek A & Stein DC (2001) Analysis
of type I restriction modification systems in the
Neisseriaceae: genetic organization and properties of the
gene products. Mol Microbiol 41: 1199–1210.
Power PM, Sweetman WA, Gallacher NJ, Woodhall MR,
Kumar GA, Moxon ER & Hood DW (2009) Simple
sequence repeats in Haemophilus influenzae. Infect Genet
Evol 9: 216–228.
Rando OJ & Verstrepen KJ (2007) Timescales of genetic and
epigenetic inheritance. Cell 128: 655–668.
Rasmussen M & Björk L (2001) Unique regulation of SclB – a
novel collagen-like surface protein of Streptococcus pyogenes.
Mol Microbiol 40: 1427–1438.
Razin S, Yogev D & Naot Y (1998) Molecular biology and
pathogenicity of mycoplasmas. Microbiol Mol Biol Rev 62:
1094–1156.
Ren Z, Jin H, Morton DJ & Stull TL (1998) hgpB, a gene
encoding a second Haemophilus influenzae hemoglobin- and
hemoglobin-haptoglobin-binding protein. Infect Immun 66:
4733–4741.
Ren Z, Jin H, Whitby PW, Morton DJ & Stull TL (1999)
Role of CCAA nucleotide repeats in regulation of
hemoglobin and hemoglobin-haptoglobin binding protein
genes of Haemophilus influenzae. J Bacteriol 181:
5865–5870.
Richard GF & Pâques F (2000) Mini- and microsatellite
expansions: the recombination connection. EMBO Rep 1:
122–126.
Richard GF, Dujon B & Haber JE (1999) Double-strand break
repair can lead to high frequencies of deletions within short
CAG/CTG trinucleotide repeats. Mol Gen Genet 261: 871–882.
Richard GF, Kerrest A & Dujon B (2008) Comparative
genomics and molecular dynamics of DNA repeats in
eukaryotes. Microbiol Mol Biol Rev 72: 686–727.
Richardson AR & Stojiljkovic I (1999) HmbR, a
hemoglobin-binding outer membrane protein of Neisseria
meningitidis, undergoes phase variation. J Bacteriol 181:
2067–2074.
Ritz D, Lim J, Reynolds CM, Poole LB & Beckwith J (2001)
Conversion of a peroxiredoxin into a disulfide reductase by
a triplet repeat expansion. Science 294: 158–160.
Rocha EP (2003) An appraisal of the potential for illegitimate
recombination in bacterial genomes and its consequences:
from duplications to genome reduction. Genome Res 13:
1123–1132.
Rocha EP, Matic I & Taddei F (2002) Over-representation of
repeats in stress response genes: a strategy to increase
versatility under stressful conditions? Nucleic Acids Res 30:
1886–1894.
Rosengarten R & Wise KS (1991) The Vlp system of
Mycoplasma hyorhinis: combinatorial expression of distinct
size variant lipoproteins generating high-frequency surface
antigenic variation. J Bacteriol 173: 4782–4793.
Ryan KA & Lo RY (1999) Characterization of a CACAG
pentanucleotide repeat in Pasteurella haemolytica and its
possible role in modulation of a novel type III
restriction-modification system. Nucleic Acids Res 27:
1505–1511.
Sadarangani M, Pollard AJ & Gray-Owen SD (2011) Opa
proteins and CEACAMs: pathways of immune engagement
for pathogenic Neisseria. FEMS Microbiol Rev 35: 498–514.
Saint-Ruf C & Matic I (2006) Environmental tuning of
mutation rates. Environ Microbiol 8: 193–199.
Sarkari J, Pandit N, Moxon ER & Achtman M (1994) Variable
expression of the Opc outer membrane protein in Neisseria
meningitidis is caused by size variation of a promoter
containing poly-cytidine. Mol Microbiol 13: 207–217.
Schaper E, Kajava AV, Hauser A & Anisimova M (2012)
Repeat or not repeat? Statistical validation of tandem repeat
prediction in genomic sequences. Nucleic Acids Res 40:
10005–10017.
Schlotterer C, Imhof M, Wang H, Nolte V & Harr B (2006)
Low abundance of Escherichia coli microsatellites is
associated with an extremely low mutation rate. J Evol Biol
19: 1671–1676.
Scholze H & Boch J (2011) TAL effectors are remote controls
for gene activation. Curr Opin Microbiol 14: 47–53.
Schug MD, Hutter CM, Wetterstrand KA, Gaudette MS,
Mackay TF & Aquadro CF (1998) The mutation rates of
di-, tri- and tetranucleotide repeats in Drosophila
melanogaster. Mol Biol Evol 15: 1751–1760.
Schweda EKH, Richards JC, Hood DW & Moxon ER (2007)
Expression and structural diversity of the lipopolysaccharide
of Haemophilus influenzae: implication in virulence. Int J
Med Microbiol 297: 297–306.
Seale TW, Morton DJ, Whitby PW, Wolf R, Kosanke SD,
VanWagoner TM & Stull TL (2006) Complex role of
hemoglobin and hemoglobin-haptoglobin binding proteins
in Haemophilus influenzae virulence in the infant rat model
of invasive infection. Infect Immun 74: 6213–6225.
Seib KL, Pigozzi E, Muzzi A, Gawthorne JA, Delany I,
Jennings MP & Rappuoli R (2011) A novel epigenetic
regulator associated with the hypervirulent Neisseria
meningitidis clonal complex 41/44. FASEB J 25: 3622–3633.
Shaver AC & Sniegowski PD (2003) Spontaneously arising
mutL mutators in evolving Escherichia coli populations are
the result of changes in repeat length. J Bacteriol 185:
6076–6082.
Sheets AJ & St Geme III JW (2011) Adhesive activity of the
Haemophilus cryptic genospecies Cha autotransporter is
modulated by variation in tandem peptide repeats. J
Bacteriol 193: 329–339.
Sia EA, Kokoska RJ, Dominska M, Greenwell P & Petes TD
(1997) Microsatellite instability in yeast: dependence on
repeat unit size and DNA mismatch repair genes. Mol Cell
Biol 17: 2851–2858.
Spinosa MR, Progida C, Tala A, Cogli L, Alifano P & Bucci C
(2007) The Neisseria meningitidis capsule is important for
intracellular survival in human cells. Infect Immun 75:
3594–3603.
Srikhanta YN, Maguire TL, Stacey KJ, Grimmond SM &
Jennings MP (2005) The phasevarion: a genetic system
controlling coordinated, random switching of expression of
multiple genes. P Natl Acad Sci USA 102: 5547–5551.
Srikhanta YN, Dowideit SJ, Edwards JL et al. (2009)
Phasevarions mediate random switching of gene expression
in pathogenic Neisseria. PLoS Pathog 5: e1000400.
Srikhanta YN, Fox KL & Jennings MP (2010) The phasevarion:
phase variation of type III DNA methyltransferases controls
coordinated switching in multiple genes. Nat Rev Microbiol
8: 196–206.
Srikhanta YN, Gorrell RJ, Steen JA, Gawthorne JA, Kwok T,
Grimmond SM, Robins-Browne RM & Jennings MP (2011)
Phasevarion mediated epigenetic gene regulation in
Helicobacter pylori. PLoS ONE 6: e27569.
Streisinger G, Okada Y, Emrich J, Newton J, Tsugita A,
Terzaghi E & Inouye M (1966) Frameshift mutations and
the genetic code. Cold Spring Harb Symp Quant Biol 31:
77–84.
Tauseef I, Harrison OB, Wooldridge KG et al. (2011)
Influence of the combination and phase variation status of
the haemoglobin receptors HmbR and HpuAB on
meningococcal virulence. Microbiology 157: 1446–1456.
Tesfazgi Mebrhatu M, Wywial E, Ghosh A, Michiels CW,
Lindner AB, Taddei F, Bujnicki JM, Van Melderen L &
Aertsen A (2011) Evidence for an evolutionary antagonism
between Mrr and Type III modification systems. Nucleic
Acids Res 39: 5991–6001.
Theiss P & Wise KS (1997) Localized frameshift mutation
generates selective, high-frequency phase variation of a
surface lipoprotein encoded by a mycoplasma ABC
transporter operon. J Bacteriol 179: 4013–4022.
Treangen TJ, Abraham AL, Touchon M & Rocha EP (2009)
Genesis, effects and fates of repeats in prokaryotic genomes.
FEMS Microbiol Rev 33: 539–571.
van Belkum A (1999) Short sequence repeats in microbial
pathogenesis and evolution. Cell Mol Life Sci 56: 729–734.
van der Ende A, Hopman CT & Dankert J (2000) Multiple
mechanisms of phase variation of PorA in Neisseria
meningitidis. Infect Immun 68: 6685–6690.
van der Woude MW & Bäumler AJ (2004) Phase and antigenic
variation in bacteria. Clin Microbiol Rev 17: 581–611.
van Ham SM, van Alphen L, Mooi FR & van Putten JP (1993)
Phase variation of H. influenzae fimbriae: transcriptional
control of two divergent genes through a variable combined
promoter region. Cell 73: 1187–1196.
van Passel MW & Ochman H (2007) Selection on the genic
location of disruptive elements. Trends Genet 23: 601–604.
van Selm S, van Cann LM, Kolkman MA, van der Zeijst BA &
van Putten JP (2003) Genetic basis for the structural
difference between Streptococcus pneumoniae serotype 15B
and 15C capsular polysaccharides. Infect Immun 71: 6192–6198.
Vandersmissen L, De Buck E, Saels V, Coil DA & Anne J
(2010) A Legionella pneumophila collagen-like protein
encoded by a gene with a variable number of tandem
repeats is involved in the adherence and invasion of host
cells. FEMS Microbiol Lett 306: 168–176.
Verstrepen KJ, Reynolds TB & Fink GR (2004) Origins of
variation in the fungal cell surface. Nat Rev Microbiol 2:
533–540.
Vimr E & Steenbergen SM (2006) Mobile contingency locus
controlling Escherichia coli K1 polysialic acid capsule
acetylation. Mol Microbiol 60: 828–837.
Vogler AJ, Keys C, Nemoto Y, Colman RE, Jay Z & Keim P
(2006) Effect of repeat copy number on variable-number
tandem repeat mutations in Escherichia coli O157:H7. J
Bacteriol 188: 4253–4263.
Waite RD, Struthers JK & Dowson CG (2001) Spontaneous
sequence duplication within an open reading frame of the
pneumococcal type 3 capsule locus causes high-frequency
phase variation. Mol Microbiol 42: 1223–1232.
Waite RD, Penfold DW, Struthers JK & Dowson CG (2003)
Spontaneous sequence duplications within capsule genes
cap8E and tts control phase variation in Streptococcus
pneumoniae serotypes 8 and 37. Microbiology 149: 497–504.
Wang G, Rasko DA, Sherburne R & Taylor DE (1999)
Molecular genetic basis for the variable expression of
Lewis Y antigen in Helicobacter pylori: analysis of the
alpha (1,2) fucosyltransferase gene. Mol Microbiol 31:
1265–1274.
Wells RD, Dere R, Hebert ML, Napierala M & Son LS (2005)
Advances in mechanisms of genetic instability related to
hereditary neurological diseases. Nucleic Acids Res 33:
3785–3798.
White FF, Potnis N, Jones JB & Koebnik R (2009) The type III
effectors of Xanthomonas. Mol Plant Pathol 10: 749–766.
Willems R, Paul A, van der Heide HG, ter Avest AR & Mooi
FR (1990) Fimbrial phase variation in Bordetella pertussis: a
novel mechanism for transcriptional regulation. EMBO J 9:
2803–2809.
Yamaoka Y, Kita M, Kodama T, Imamura S, Ohno T, Sawai
N, Ishimaru A, Imanishi J & Graham DY (2002)
Helicobacter pylori infection in mice: role of outer
membrane proteins in colonization and inflammation.
Gastroenterology 123: 1992–2004.
Yang QL & Gotschlich EC (1996) Variation of gonococcal
lipooligosaccharide structure is due to alterations in poly-G
tracts in lgt genes encoding glycosyl transferases. J Exp Med
183: 323–327.
Yang J, Wang J, Chen L, Yu J, Dong J, Yao ZJ, Shen Y, Jin Q
& Chen R (2003) Identification and characterization of
simple sequence repeats in the genomes of Shigella species.
Gene 322: 85–92.
Yogev D, Rosengarten R, Watson-McKown R & Wise KS
(1991) Molecular basis of Mycoplasma surface antigenic
variation: a novel set of divergent genes undergo
spontaneous mutation of periodic coding regions and 5′
regulatory sequences. EMBO J 10: 4069–4079.
Zahra R, Blackwood JK, Sales J & Leach DR (2007)
Proofreading and secondary structure processing determine
the orientation dependence of CAG·CTG trinucleotide
repeat instability in Escherichia coli. Genetics 176: 27–41.
Zaleski P, Wojciechowski M & Piekarowicz A (2005) The role
of Dam methylation in phase variation of Haemophilus
influenzae genes involved in defence against phage infection.
Microbiology 151: 3361–3369.
Zhang Q & Wise KS (1997) Localized reversible frameshift
mutation in an adhesin gene confers a phase-variable
adherence phenotype in mycoplasma. Mol Microbiol 25:
859–869.
Zhou K, Michiels CW & Aertsen A (2012a) Variation of
intragenic tandem repeat tract of tolA modulates Escherichia
coli stress tolerance. PLoS ONE 7: e47766.
Zhou K, Vanoirbeek K, Aertsen A & Michiels CW (2012b)
Variability of the tandem repeat region of the Escherichia
coli tolA gene. Res Microbiol 163: 316–322.