http://www.unc.edu/courses/2009spring/envr/740/001 slide 1
Begin 02/10/09
In the case of lac, transcription cannot begin even when the polymerase has made a tight
complex at the promoter. That is because a tetrameric product of a regulator gene, the lacI gene,
is normally bound to the DNA, at the downstream edge of the promoter, blocking the start point.
That region is shown in detail on the next overhead:
[OH; binding of repressor at operator]
The protein is called a repressor, and the site of binding is the operator. The operator is an
example of a cis-acting control element, while the repressor protein LacI is an example of a
trans-acting product. In effect, then, the normal state of the lac operon is that it is repressed. In fact,
transcription is activated only in the presence of β-galactosides. The question of how activation
is accomplished is answered in the following overhead, which is a cartoon of the control cycle of
the lac operon:
[OH; cartoon of control cycle of lac genes]
The top panel shows the normal situation, with the lacI repressor occupying the operator. However,
as the overhead indicates, the repressor contains a site at which a small molecule called an
inducer can bind. In this example, the β-galactosides themselves serve as the inducers: the
system works because when β-galactosides are in the cell environment, minimal amounts may
enter the cell by diffusion through the plasma membrane without the help of the lacY product.
When a β-galactoside binds to the repressor, the repressor changes conformation in the region of
the operator recognition sequence, loses its affinity for the operator and dissociates,
leaving the polymerase free to initiate transcription. This mechanism of repressor regulation is
called allosteric control.
[OH; model structure of lac repressor]
In this cartoon, the repressor is represented in its normal tetrameric form, with two of the four
units bound. This is actually an accurate representation; what is not shown is that there are two
additional, weaker binding sites, one 410 bp downstream and one 83 bp upstream from the start point.
Additional binding at one of these secondary sites forces the DNA into a loop and results in full
repression. Interestingly, the polymerase binding to the repressor-bound DNA is enhanced, but
as long as the repressor is present, transcription cannot commence. That has the effect of
“storing” the polymerase at the promoter site so that transcription can begin immediately in the
presence of an inducer. The next overhead shows the structure of a monomeric unit of the lac
repressor. This is a composite, because there is not a structure of a complete repressor in the
PDB. All structures in the PDB are truncated at either the oligomerization end or the DNA
binding end. The N-terminal end is the DNA-binding domain, which contains the HTH motif; a hinge domain connects it to
two core domains. Between the core domains is the binding site of the inducer. Finally, the C-terminal end contains a helix that serves to mediate the association of the monomeric proteins into
a functioning tetramer.
[OH; crystal structure of lac repressor]
The next overhead shows a ribbon diagram generated from an actual crystal structure of the
repressor, with no inducer and with an inducer analogue bound. Note the color-coded structural
features: α-helices in red and β-sheets in yellow. The use of an inducer analogue, rather than a
natural galactoside inducer, is necessary in order to make a stable complex. The structure of the inducer
analogue is shown on the next overhead:
[OH; structure of IPTG; isopropylthiogalactoside]
The compound is referred to by the acronym IPTG, which stands for “isopropylthiogalactoside”.
The DNA-binding domain is missing, so this is a version of the repressor “truncated” at the
hinge. What is shown in these structures is the C-terminal helix by which the repressors
associate as tetramers:
[OH; repeat]
the inducer binding site and the hinge-point of attachment of the “headpiece”, which we don’t
see but which contains the DNA binding motif at the N-terminus, where the protein has been
truncated. The next overhead shows the repressor associated as a tetramer by the C-terminal
helices, which illustrates how the association works.
[OH; truncated repressors associated as tetramer]
The points of truncation are shown on just two of the tetrameric units for clarity. The left hand
panel shows the unbound repressor and the right panel shows the protein with the inducer
analogues bound. The next overhead shows the same structure in the same orientation (as closely
as I could orient it) with two of the four units selected (i.e., two of the units have been removed
by computer) to make the truncation points clearer.
[OH; 2 units selected to illustrate truncation points] use cursor to draw loop of DNA.
I was able to find a structure of a repressor dimer bound to the operator. This dimer is complexed
with “anti-inducers” – that is two ligands bound at the inducer site that freeze the repressor into a
DNA-binding conformation.
The anti-inducer is pictured as an inset: o-nitrophenylfucoside (acronym: ONPF). This structure
illustrates the hinge and the DNA-binding region, which consists of two α-helices separated by a
turn, which is a common DNA-binding motif called helix-turn-helix or referred to by the
acronym HTH motif. (Remember the use of motif. In the figure, there is a small third helical
segment just before the N-terminus.) The α-helices fit into the major groove of DNA, where they
make contacts with specific bases. Binding of the inducer in the cleft has a very subtle effect on
repressor core conformation, but it is likely that the inducer acts by causing the hinge
to swing to an extent that the HTH DNA-binding region no longer associates snugly with the
target sequence. The next overhead shows a comparison of monomeric units of the operator-bound repressor + anti-inducer with C-terminal truncation and the inducer-bound repressor with
the N-terminal truncation. I have generated the slide with the repressors as closely as possible in
the same orientation and as you can see, it is very hard to see any difference in the two core
domains.
An important and interesting question is how synthesis of the repressor is controlled. Ultimately,
the cascade of control mechanisms cannot be infinite. In fact, there is no control over the lacI
gene; however, initiation is relatively inefficient, so that there are never many excess repressor
molecules present in the cell, and inactivation of all the repressor present in the cell can be
accomplished by low concentrations of β-galactosides.
Since the lac genes would be transcribed in the absence of repressor, lac is said to be under
negative control. Many prokaryotic genes and virtually all eukaryotic genes are under positive
control; i.e., the binding of some ancillary protein at the promoter site is required before
transcription can begin. The lac operon also serves as a paradigm for positive control, because a
further condition imposed on transcription is that glucose, the preferred source of energy for E.
coli, be unavailable. If glucose is unavailable, levels of cyclic AMP (cAMP), the 3′,5′-phosphodiester of adenosine:
[OH; structure of cAMP]
become elevated, and the cAMP complexes with a protein called catabolite activator protein
(CAP) or cAMP receptor protein (CRP), which binds 16 bases upstream of the -35 consensus
sequence and increases the rate at which transcription is initiated. The CAP protein, in addition
to binding DNA, also contacts the RNA polymerase and promotes formation of the closed
complex which precedes the formation of an open complex and initiation of transcription. The
next slide illustrates the CAP-cAMP-DNA complex.
[CAP-cAMP-DNA complex] Good illustration of the mechanics of positive control and initiation
CAP is a dimer, associated at an α-helix (a frequent motif, as in the lac repressor). The DNA-binding
domain, an HTH motif, is highlighted in blue, and the bound cAMP is in stick form. The next
overhead summarizes mechanisms of both positive and negative regulation of gene expression.
[OH; Genes, positive and negative gene regulation]
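The two layers of control described for lac can be summarized as a small truth table. What follows is only a qualitative sketch, not a kinetic model: the function name and the labels "off", "low" and "high" are assumptions made for illustration, and "low" simply stands for the reduced rate of initiation when CAP-cAMP is not bound.

```python
def lac_transcription(inducer_present: bool, glucose_present: bool) -> str:
    """Qualitative state of lac transcription in a simplified on/off picture."""
    # Negative control: the repressor sits on the operator unless an
    # inducer binds it and lowers its affinity for the operator.
    repressor_on_operator = not inducer_present

    # Positive control: cAMP is elevated only when glucose is unavailable,
    # and CAP needs cAMP to bind upstream of the promoter.
    cap_bound = not glucose_present

    if repressor_on_operator:
        return "off"  # polymerase may be "stored" at the promoter but cannot start
    return "high" if cap_bound else "low"


if __name__ == "__main__":
    for inducer in (False, True):
        for glucose in (False, True):
            state = lac_transcription(inducer, glucose)
            print(f"inducer={inducer!s:5} glucose={glucose!s:5} -> {state}")
```

Only the combination "inducer present, glucose absent" gives the "high" state, which is the sense in which lac is under both negative and positive control.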
While the general picture described for E. coli is applicable to eukaryotic transcription, the
eukaryotic process, as expected, is much more complex and differs significantly with respect to
details, and we have background now to appreciate the contrast. As I have already indicated in
describing termination, there are three eukaryotic RNA polymerases: RNA polymerase I
synthesizes rRNA; RNA polymerase III synthesizes tRNA; and RNA polymerase II synthesizes
hnRNA, which is the transcript from which the mature mRNA is derived. Different types of
promoter are involved in initiation by the different polymerases and in contrast to prokaryotes,
the polymerases themselves are not directly involved in the initial recognition event, but depend
on the prior assembly of proteins called transcription factors at the promoter sites. The following
overhead gives some idea of what is involved.
[OH; assembly of initiation complex for pol II]
Since our interest lies with mRNA, and because of time, we will outline the initiation process
for RNA pol II only. The significance of TATA and the sequence at the start point we will mention
after the next overhead. A more detailed description of this process is given in Ch. 25 of Genes IX.
Polymerase II has more than 10 subunits, which are summarized on the next overhead.
[OH; RNA pol II subunits]
The two largest subunits correspond in function to the β and β´ units of the prokaryotic polymerase,
which bind DNA and make up the catalytic core. At its C-terminal domain (acronym CTD), the largest
subunit contains the consensus sequence of seven amino acids YSPTSPS (single letter code: Y =
Tyr, S = Ser, T = Thr, P = Pro), which may be present in up to about 50 repeats in mammals. The presence
of the repeats is critical – deletion of half is a lethal mutation. Phosphorylation of the serine (S)
and threonine (T) residues is involved in the initiation reaction. Phosphorylation is a frequent
mechanism that acts as a switch or trigger via change in conformation, either to initiate or
terminate activity. Eukaryotic promoters generally contain a consensus sequence similar to the -10 sequence of prokaryotes, called, in eukaryotes, the TATA box because of its consensus,
TATAAA, which is located about 25 bp upstream of the start point. The TATA box appears to function to
locate the transcription complex correctly with respect to the start point. The TATA sequence,
that is, the exact composition of the box, is more variable in eukaryotes than the -10 consensus sequence
in prokaryotes, but this element is always present in eukaryotic promoters and its position
relative to the start point is relatively closely conserved. Eukaryotic start points have the general
form Py2CAPy5 with A at the +1 position. The TATA box is recognized by a complex of
proteins called transcription factor IID (acronym TFIID). Key in the TFIID complex is a small
protein called the TATA-binding protein, usually referred to by the acronym TBP. The
remaining components of TFIID are referred to in aggregate as the TBP-associated factors, or
TAFs. The assembly procedure is complex, and involves numerous factors, as illustrated in the
overhead, and variation in the TAFs confers specificity of the complex for different promoters.
The next overhead shows a crystal structure of the TATA-binding protein complexed to ~1 turn
of DNA. The panel on the bottom is a stick representation of a DNA strand showing that the
consensus sequence is really there (it's always interesting to check to see if the published crystal
structure obeys the rules).
[OH; structure of TBP]
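Checking whether a sequence "obeys the rules" is easy to do by computer. The sketch below scans a promoter for a TATA-like element near -25 and tests for the Py2CAPy5 start-point pattern with A at +1. The promoter sequence is a made-up toy, and the function names, the loose pattern TATA[AT]A and the -35 to -20 search window are assumptions chosen for illustration, not values from the overheads.

```python
import re

# Assumed loose TATA-box pattern (A or T allowed at the fifth position).
TATA_PATTERN = re.compile(r"TATA[AT]A")
# Py2CAPy5: two pyrimidines, C, A, then five pyrimidines.
INR_PATTERN = re.compile(r"[CT]{2}CA[CT]{5}")


def find_tata(seq, start_index, window=(-35, -20)):
    """Positions (relative to +1) where a TATA-like match begins inside the window.

    start_index is the 0-based index of the +1 base; upstream bases are
    numbered -1, -2, ... so an upstream index i has position i - start_index.
    """
    hits = []
    for match in TATA_PATTERN.finditer(seq):
        position = match.start() - start_index
        if window[0] <= position <= window[1]:
            hits.append(position)
    return hits


def has_initiator(seq, start_index):
    """True if Py2CAPy5 lies across the start point with its A at +1."""
    if start_index < 3:
        return False
    # The A is the fourth base of the pattern, so the match spans -3 to +6.
    return INR_PATTERN.fullmatch(seq[start_index - 3:start_index + 6]) is not None


# Toy promoter (hypothetical): TATAAA placed at -25, initiator at the start point.
toy_promoter = "GCGCG" + "TATAAA" + "GC" * 8 + "CTC" + "A" + "TTCTC" + "GGAGG"
toy_start = 30  # 0-based index of the +1 A

if __name__ == "__main__":
    print("TATA-like matches at:", find_tata(toy_promoter, toy_start))
    print("Initiator at start point:", has_initiator(toy_promoter, toy_start))
```

On a real promoter the same check would be run on the deposited DNA sequence from the crystal structure or on the annotated gene, with the start point taken from the annotation.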
The take-home lesson is that polymerase II does not become involved until considerable protein
scaffolding has already been constructed. Finally, the consensus repeats in the CTD of the largest subunit
are phosphorylated, initiating transcription.
[OH; detail of the initiation by pol II]
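To make the CTD repeats concrete, the following sketch counts back-to-back copies of the heptad YSPTSPS in a protein sequence and lists the serine and threonine positions that are candidates for phosphorylation. The function names and the toy sequences are assumptions for illustration; a real CTD also contains imperfect repeats, which this exact-match count would miss.

```python
HEPTAD = "YSPTSPS"  # consensus repeat in the CTD, single letter code


def count_tandem_heptads(ctd, heptad=HEPTAD):
    """Length of the longest run of exact, back-to-back copies of the heptad."""
    best = 0
    i = 0
    while i < len(ctd):
        run = 0
        # Extend the run while another exact copy follows immediately.
        while ctd.startswith(heptad, i + run * len(heptad)):
            run += 1
        best = max(best, run)
        # Skip past the run just counted, or move on by one residue.
        i += max(1, run * len(heptad))
    return best


def phospho_candidates(ctd):
    """1-based positions of serine (S) and threonine (T) residues."""
    return [(i + 1, aa) for i, aa in enumerate(ctd) if aa in "ST"]


if __name__ == "__main__":
    toy_ctd = "MK" + HEPTAD * 3 + "A"  # hypothetical fragment, not a real protein
    print("tandem heptads:", count_tandem_heptads(toy_ctd))
    print("S/T positions in one heptad:", phospho_candidates(HEPTAD))
```

Each heptad carries three serines and one threonine, so a CTD with tens of repeats offers a large number of potential phosphorylation sites.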
In addition to being involved in initiation of transcription, the phosphorylated CTD also appears
to be an anchor for the enzymes involved in capping, splicing and other hnRNA-processing
steps. As the polymerase leaves the promoter region, most of the transcription factors
dissociate from the complex. There is a reason for being generally aware of the functions and
number of transcription factors involved in initiation of hnRNA transcription – genes coding for
several of the transcription factors turn out to be oncogenes, and as we mentioned at the outset of
the course, one of the formidable challenges in chemical carcinogenesis is to correlate
chemically induced changes in structure and function that transform proto-oncogenes into
oncogenes. In addition to the TATA box, other elements in the eukaryotic promoter have been
identified that modulate efficiency of transcription. In the β-globin gene in the top panel of the
next overhead, as an example, the elements are the CAAT box, centered at -75, and the GC box
at -90 from the start point. The consensus sequences were identified by saturation mutation
experiments (as for the ARS sequence in eukaryotic replication origins). In general, these
elements function in either orientation and are located upstream from the start point, but their
location with respect to the start point is highly variable, and they may also occur in varying
multiplicities and combinations, with not all elements required to be present. The CAAT box
seems to control efficiency of transcription, but not specificity. Both specificity and efficiency
seem to be conferred by upstream elements recognized by a set of factors called activators.
These elements are generally upstream from the promoter. One such consensus sequence is the
octamer.
[OH; diagram of promoter elements]
The bottom panel shows examples of the combinations of elements that may be present in a
promoter: TATA box, two CAAT boxes; or a TATA box, two GC boxes and a CAAT box.
Notice that TATA is always present, and the distance between TATA and the start point is
constant. These elements all bind various factors, which become part of the assemblage of
proteins that covers the promoter prior to initiation.
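Because these upstream elements function in either orientation, a search for them has to look at both strands. The sketch below scans the upstream region of a toy promoter for CAAT and GC boxes on either strand. The core motifs CCAAT and GGGCGG are textbook sequences assumed here, since they are not spelled out in this lecture, and the promoter, the function names and the start index are hypothetical.

```python
# Assumed core motifs for illustration (not given on the overheads).
ELEMENTS = {"CAAT box": "CCAAT", "GC box": "GGGCGG"}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Sequence of the opposite strand, read 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_upstream_elements(seq, start_index):
    """List (name, strand, position) for each motif found upstream of +1.

    Position is that of the first matched base on the given top strand,
    numbered relative to the start point (upstream bases are negative).
    """
    hits = []
    for name, motif in ELEMENTS.items():
        for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
            idx = seq.find(pattern)
            while idx != -1:
                if idx < start_index:  # report upstream matches only
                    hits.append((name, strand, idx - start_index))
                idx = seq.find(pattern, idx + 1)
    return sorted(hits, key=lambda hit: hit[2])


# Toy promoter (hypothetical): an inverted GC box at -90 and a CAAT box at -75.
toy_upstream = "A" * 10 + "CCGCCC" + "A" * 9 + "CCAAT" + "A" * 70 + "ACGT"
toy_start_index = 100  # 0-based index of the +1 base

if __name__ == "__main__":
    for name, strand, position in find_upstream_elements(toy_upstream, toy_start_index):
        print(f"{name} on {strand} strand at {position}")
```

The GC box in the toy sequence is found only through its reverse complement, which is the computational counterpart of an element working in either orientation.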