http://www.unc.edu/courses/2009spring/envr/740/001 slide 1 Begin 02/10/09
In the case of lac, transcription cannot begin even when the polymerase has formed a tight complex at the promoter. That is because the tetrameric product of a regulator gene, the lacI gene, is normally bound to the DNA at the downstream edge of the promoter, blocking the start point. That region is shown in detail on the next overhead:
[OH; binding of repressor at operator]
The protein is called a repressor, and its binding site is the operator. The operator is an example of a cis-acting control element, while the repressor protein LacI is an example of a trans-acting product. In effect, then, the normal state of the lac operon is that it is repressed. In fact, transcription is activated only in the presence of β-galactosides. How activation is accomplished is shown on the following overhead, which is a cartoon of the control cycle of the lac operon:
[OH; cartoon of control cycle of lac genes]
The top panel shows the normal situation, with the lacI repressor occupying the operator. However, as the overhead indicates, the repressor contains a site at which a small molecule called an inducer can bind. In this example, the β-galactosides themselves serve as the inducers. The system works because, when β-galactosides are present in the cell's environment, small amounts can enter the cell by diffusion through the plasma membrane without the help of the lacY product. When a β-galactoside binds to the repressor, the repressor changes conformation in the region of the operator recognition sequence, loses its affinity for the operator and dissociates, leaving the polymerase free to initiate transcription. This mechanism of repressor regulation is called allosteric control.
[OH; model structure of lac repressor]
In this cartoon, the repressor is represented in its normal tetrameric form, with two of the four units bound.
This is actually an accurate representation. What is not shown is that there are two additional, weaker binding sites, one 410 bp downstream and one 83 bp upstream of the start point. Additional binding at one of these secondary sites forces the DNA into a loop and results in full repression. Interestingly, polymerase binding to the repressor-bound DNA is enhanced, but as long as the repressor is present, transcription cannot commence. That has the effect of “storing” the polymerase at the promoter so that transcription can begin immediately in the presence of an inducer.
The next overhead shows the structure of a monomeric unit of the lac repressor. This is a composite, because there is no structure of a complete repressor in the PDB: all structures in the PDB are truncated at either the oligomerization end or the DNA-binding end. The N-terminus is the DNA-binding domain, an HTH motif, with a hinge connecting it to two core domains. Between the core domains is the binding site of the inducer. Finally, the C-terminal end contains a helix that mediates the association of the monomers into a functioning tetramer.
[OH; crystal structure of lac repressor]
The next overhead shows ribbon diagrams generated from actual crystal structures of the repressor, with no inducer and with an inducer analogue bound. Note the color-coded structural features: α-helices in red and β-sheets in yellow. An inducer analogue, rather than a galactoside inducer, is needed in order to make a stable complex. The structure of the inducer analogue is shown on the next overhead:
[OH; structure of IPTG; isopropylthiogalactoside]
The compound is referred to by the acronym IPTG, which stands for “isopropylthiogalactoside”. The DNA-binding domain is missing, so this is a version of the repressor “truncated” at the hinge.
[OH; repeat]
What is shown in these structures is the C-terminal helix by which the repressors associate as tetramers, the inducer binding site, and the hinge, which is the point of attachment of the “headpiece”. We do not see the headpiece, which contains the DNA-binding motif at the N-terminus, because the protein has been truncated there. The next overhead shows the repressor associated as a tetramer through the C-terminal helices, which illustrates how the association works.
[OH; truncated repressors associated as tetramer]
The points of truncation are shown on just two of the four units for clarity. The left-hand panel shows the unbound repressor and the right-hand panel shows the protein with the inducer analogues bound. The next overhead shows the same structure in the same orientation (as closely as I could orient it) with two of the four units selected (i.e., two of the units have been removed by computer) to make the truncation points clearer.
[OH; 2 units selected to illustrate truncation points] use cursor to draw loop of DNA.
I was able to find a structure of a repressor dimer bound to the operator. This dimer is complexed with “anti-inducers”, that is, two ligands bound at the inducer site that freeze the repressor in a DNA-binding conformation. The anti-inducer is pictured as an inset: o-nitrophenylfucoside (acronym: ONPF). This structure illustrates the hinge and the DNA-binding region, which consists of two α-helices separated by a turn, a common DNA-binding motif called helix-turn-helix, or HTH for short. (Remember the use of motif. In the figure, there is a small third helical segment just before the N-terminal.) The α-helices fit into the major groove of DNA, where they make contacts with specific bases.
Binding of the inducer in the cleft has a very subtle effect on the conformation of the repressor core, but the likely mechanism of inducer function is that it causes the hinge to swing far enough that the HTH DNA-binding region no longer fits snugly against the target sequence. The next overhead compares monomeric units of the operator-bound repressor + anti-inducer, with the C-terminal truncation, and the inducer-bound repressor, with the N-terminal truncation. I have generated the slide with the repressors as nearly as possible in the same orientation and, as you can see, it is very hard to see any difference between the two core domains.
An important and interesting question is how synthesis of the repressor is controlled. Ultimately, the cascade of control mechanisms cannot be infinite. In fact, there is no control over the lacI gene; however, initiation is relatively inefficient, so that there are never many excess repressor molecules in the cell, and all the repressor present can be inactivated by low concentrations of β-galactosides.
Since the lac genes would be transcribed in the absence of repressor, lac is said to be under negative control. Many prokaryotic genes and virtually all eukaryotic genes are under positive control; i.e., the binding of some ancillary protein at the promoter site is required before transcription can begin. The lac operon also serves as a paradigm for positive control, because a further condition imposed on transcription is that glucose, the preferred source of energy for E. coli, be unavailable. If glucose is unavailable, levels of cyclic AMP (cAMP), the 3′,5′-phosphodiester of adenosine, become elevated.
[OH; structure of cAMP]
The cAMP complexes with a protein called catabolite activator protein (CAP) or cAMP receptor protein (CRP), which binds 16 bases upstream of the -35 consensus sequence and increases the rate at which transcription is initiated.
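The two conditions described above (an inducer must release the repressor, and glucose must be unavailable for CAP-cAMP to bind) can be written as a small truth table. This is only a qualitative sketch: the function name and the labels "off", "low" and "high" are my own illustrative choices, and the model ignores any leaky basal transcription in the repressed state.

```python
def lac_transcription(lactose_present: bool, glucose_present: bool) -> str:
    """Return a qualitative transcription level for the lac genes.

    The labels "off", "low" and "high" are illustrative, not measured rates.
    """
    # Negative control: without a beta-galactoside inducer the repressor
    # stays on the operator and blocks the start point.
    repressor_on_operator = not lactose_present
    # Positive control: when glucose is unavailable, cAMP rises and the
    # CAP-cAMP complex binds upstream of the promoter.
    cap_bound = not glucose_present

    if repressor_on_operator:
        return "off"
    return "high" if cap_bound else "low"


if __name__ == "__main__":
    # Print the full truth table for the two sugars.
    for lactose in (False, True):
        for glucose in (False, True):
            level = lac_transcription(lactose, glucose)
            print(f"lactose={lactose!s:5} glucose={glucose!s:5} -> {level}")
```

The repressor check comes first because, as noted above, transcription cannot commence while the repressor is bound, whatever the state of CAP.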
The CAP protein, in addition to binding DNA, also contacts the RNA polymerase and promotes formation of the closed complex, which precedes formation of the open complex and initiation of transcription. The next slide illustrates the CAP-cAMP-DNA complex.
[CAP-cAMP-DNA complex]
This is a good illustration of the mechanics of positive control and initiation. CAP is a dimer, associated at an α-helix (a frequent motif, as in the lac repressor); the DNA-binding domain, an HTH motif, is highlighted in blue, and the bound cAMP is in stick form. The next overhead summarizes mechanisms of both positive and negative regulation of gene expression.
[OH; Genes, positive and negative gene regulation]
While the general picture described for E. coli is applicable to eukaryotic transcription, the eukaryotic process, as expected, is much more complex and differs significantly in its details, and we now have the background to appreciate the contrast. As I have already indicated in describing termination, there are three eukaryotic RNA polymerases: RNA polymerase I synthesizes rRNA; RNA polymerase III synthesizes tRNA; and RNA polymerase II synthesizes hnRNA, the transcript from which the mature mRNA is derived. Different types of promoter are involved in initiation by the different polymerases and, in contrast to prokaryotes, the polymerases themselves are not directly involved in the initial recognition event, but depend on the prior assembly of proteins called transcription factors at the promoter sites. The following overhead gives some idea of what is involved.
[OH; assembly of initiation complex for pol II]
Since our interest lies with mRNA, and because of time, we will outline the initiation process for RNA pol II only. The significance of TATA and the sequence at the start point we will mention after the next overhead. A more detailed description of this process is given in Ch. 25 of Genes IX. Polymerase II has more than 10 subunits, which are summarized on the next overhead.
[OH; RNA pol II subunits]
The two largest subunits correspond in function to the β and β′ subunits of the prokaryotic polymerase, which bind DNA and make up the catalytic core. In its C-terminal domain (acronym CTD), the largest subunit contains a consensus sequence of seven amino acids, YSPTSPS (single-letter code: Y = Tyr, S = Ser, T = Thr, P = Pro), which may be present in up to 50 repeats in mammals. The presence of the repeats is critical: deletion of half of them is a lethal mutation. Phosphorylation of the serine (S) and threonine (T) residues is involved in the initiation reaction. Phosphorylation is a frequent mechanism that acts as a switch or trigger via a change in conformation, either to initiate or to terminate activity.
Eukaryotic promoters generally contain a consensus sequence similar to the -10 sequence of prokaryotes, called, in eukaryotes, the TATA box because of its consensus, TATAAA, which is located about 25 bp upstream of the start point. The TATA box appears to function to position the transcription complex correctly with respect to the start point. The exact composition of the TATA box is more variable in eukaryotes than the -10 consensus sequence is in prokaryotes, but this element is always present in eukaryotic promoters and its position relative to the start point is closely conserved. Eukaryotic start points have the general form Py2CAPy5, with A at the +1 position.
The TATA box is recognized by a complex of proteins called transcription factor IID (acronym TFIID). The key component of TFIID is a small protein called the TATA-binding protein, usually referred to by the acronym TBP. The remaining components of TFIID are referred to in aggregate as the TBP-associated factors, or TAFs. The assembly procedure is complex and involves numerous factors, as illustrated in the overhead, and variation in the TAFs confers specificity of the complex for different promoters.
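To make the CTD repeat structure described above concrete, here is a minimal sketch that counts exact copies of the consensus heptad in a tail sequence and lists the serine and threonine positions that are candidates for phosphorylation. The function names and the three-repeat toy sequence are hypothetical; real CTD repeats often deviate from the consensus, which this exact-match count would miss.

```python
import re

HEPTAD = "YSPTSPS"  # consensus heptad quoted in the notes


def count_consensus_heptads(ctd: str) -> int:
    """Count non-overlapping exact copies of the consensus heptad."""
    return len(re.findall(HEPTAD, ctd.upper()))


def ser_thr_positions(ctd: str) -> list:
    """Return 0-based indices of Ser (S) and Thr (T) residues."""
    return [i for i, residue in enumerate(ctd.upper()) if residue in "ST"]


if __name__ == "__main__":
    toy_ctd = HEPTAD * 3  # hypothetical three-repeat tail, not a real sequence
    print(count_consensus_heptads(toy_ctd))
    print(ser_thr_positions(HEPTAD))
```

Within a single heptad, four of the seven residues are Ser or Thr, which gives some feel for how heavily a tail of many repeats can be phosphorylated.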
The next overhead shows a crystal structure of the TATA-binding protein complexed with ~1 turn of DNA. The bottom panel is a stick representation of a DNA strand showing that the consensus sequence is really there (it's always interesting to check whether a published crystal structure obeys the rules).
[OH; structure of TBP]
The take-home lesson is that polymerase II does not become involved until considerable protein scaffolding has already been constructed. Finally, the consensus repeats in the CTD of the largest subunit are phosphorylated, initiating transcription.
[OH; detail of the initiation by pol II]
In addition to being involved in initiation of transcription, the phosphorylated CTD also appears to be an anchor for enzymes involved in capping, splicing and other hnRNA processing. As the polymerase leaves the promoter region, most of the transcription factors dissociate from the complex.
There is a reason for being generally aware of the functions and number of transcription factors involved in initiation of hnRNA transcription: genes coding for several of the transcription factors turn out to be oncogenes and, as we mentioned at the outset of the course, one of the formidable challenges in chemical carcinogenesis is to correlate chemically induced changes in structure and function that transform proto-oncogenes into oncogenes.
In addition to the TATA box, other elements in the eukaryotic promoter have been identified that modulate the efficiency of transcription. In the β-globin gene in the top panel of the next overhead, for example, the elements are the CAAT box, centered at -75, and the GC box, at -90 from the start point. The consensus sequences were identified by saturation mutation experiments (as for the ARS sequence in eukaryotic replication origins).
In general, these elements function in either orientation and are located upstream of the start point, but their location with respect to the start point is highly variable, and they may occur in varying multiplicities and combinations, with not all elements required to be present. The CAAT box seems to control the efficiency of transcription, but not its specificity. Both specificity and efficiency seem to be conferred by upstream elements recognized by a set of factors called activators. These elements are generally upstream of the promoter. One such consensus sequence is the octamer.
[OH; diagram of promoter elements]
The bottom panel shows examples of the combinations of elements that may be present in a promoter: a TATA box and two CAAT boxes; or a TATA box, two GC boxes and a CAAT box. Notice that TATA is always present, and the distance between TATA and the start point is constant. These elements all bind various factors, which become part of the assemblage of proteins that covers the promoter prior to initiation.
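The layout of promoter elements described above can be explored with a short scan of the sequence upstream of a start point. This is a sketch under stated assumptions: the motif strings (TATAAA, CCAAT, GGGCGG) are commonly quoted textbook cores that I have supplied rather than sequences given in these notes, the function names are hypothetical, and the toy sequence is invented for illustration. Because the CAAT and GC boxes work in either orientation, they are searched on both strands; the TATA box is searched on the top strand only.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")

# name -> (consensus core, scan both orientations?)
# These cores are assumed textbook values, not taken from the notes.
ELEMENTS = {
    "TATA box": ("TATAAA", False),
    "CAAT box": ("CCAAT", True),
    "GC box": ("GGGCGG", True),
}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_elements(upstream: str) -> list:
    """Find promoter elements in the top strand upstream of a start point.

    `upstream` is read 5'->3' and its last base is position -1.
    Returns (name, position of the motif's 5' end, strand) tuples,
    sorted from the most distal hit to the most proximal.
    """
    upstream = upstream.upper()
    n = len(upstream)
    hits = []
    for name, (motif, both) in ELEMENTS.items():
        patterns = [("+", motif)]
        if both:
            patterns.append(("-", reverse_complement(motif)))
        for strand, pattern in patterns:
            # A lookahead reports overlapping occurrences as well.
            for match in re.finditer(f"(?={pattern})", upstream):
                hits.append((name, match.start() - n, strand))
    return sorted(hits, key=lambda hit: hit[1])


if __name__ == "__main__":
    # Invented toy promoter: inverted GC box, CAAT box, then a TATA box.
    toy = ("ACAC" + "CCGCCC" + "ACACAC" + "CCAAT"
           + "AC" * 4 + "TATAAA" + "AC" * 9 + "A")
    for hit in find_elements(toy):
        print(hit)
```

In the toy sequence the TATA box is placed so that its 5′ end falls at -25, while the other two elements sit further upstream, one of them in the inverted orientation.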