http://www.unc.edu/courses/2009spring/envr/740/001 slide 60 Begin 02/05/09

Eukaryotic mRNA is more stable than prokaryotic mRNA, and the half-life of any specific transcript seems to be a function of the presence of certain destabilizing elements. In some instances, physical structures such as stem loops seem to function as the control elements. Stem loops can form when complementary sequences are present on a strand, as the next slide shows:

[OH; formation of stem loop]

This type of control of gene expression comes under the heading of post-transcriptional control. mRNA stability is clearly another potential problem point, since mutations that make mRNA too stable can result in over-expression of the encoded protein, while highly unstable mRNA may result in insufficient quantities of a critical protein.

TRANSCRIPTIONAL CONTROL

Implicit in our discussion of the transcription process, and of the need to establish a reading frame for mRNA, is the idea that most transcription, like replication, does not occur randomly either in time or in location on DNA. Therefore, two critical features of transcription are recognition of the appropriate starting point by RNA polymerases, and modulation of the frequency of the initiation event. The concept of control of transcription is intuitively related to malignant transformation, because among the various pathways leading to transformation are the over- or under-production of critical proteins.

For prokaryotes, a single polymerase is responsible for all RNA synthesis. The complete polymerase, or holoenzyme, is a complex of four types of subunit, described on the next overhead.
[OH, table of prokaryotic polymerase components]

Subunit (molecular weight)    Function
2 x α (40 kD)                 enzyme assembly, promoter recognition
β (155 kD)                    catalytic center
β′ (160 kD)                   catalytic center
σ (32-90 kD)                  promoter specificity

(β and β′ together form the catalytic unit.)

The α subunits are responsible for binding the components of the polymerase together, and also have a role in promoter recognition. The combined β and β′ subunits have the catalytic capability. The σ subunit confers recognition of specific promoter sites on DNA. As we shall see, promoters are binding sites that help to locate transcription start points. There is an all-purpose σ-factor, with a molecular weight of 70 kD, which serves for recognition of a variety of promoters. However, there are alternative σ-factors which are involved in regulating expression of a few specific proteins. Without the σ-factor, the RNA polymerase binds randomly to DNA, and so initiates transcription very inefficiently. The polymerase·σ-factor complex has a high affinity for the promoter, and so transcription is efficient. The events leading to transcription are diagrammed on the next slide.

[OH, steps preceding transcription]

The polymerase·σ-factor complex first binds to the double-stranded DNA in a loose complex, which is closed, meaning that the helix remains intact. The complex then converts to a tight complex and the polymerase·σ-factor melts the DNA (i.e., the strands are separated), giving an open complex. This is followed by initiation of RNA synthesis. Initiation is subject to premature release of the nascent RNA and may be repeated a number of times until a sequence of more than 9 nucleotides has successfully been added, so there are often a number of abortive starts. Once initiation has proceeded beyond this point, the σ-factor is released and elongation of the RNA proceeds.

Promoters were detected and characterized by a technique called footprinting.
In this process, the polymerase/duplex DNA complex (without the σ-factor), in which one strand of DNA is radiolabeled at the 5′ end with 32P-phosphate, is subjected to partial hydrolysis under conditions in which any accessible phosphodiester linkage can be cleaved. The hydrolysis is allowed to proceed only to the extent that an average of about one cleavage per DNA complex occurs. This results in a series of bands when the reaction mixture is separated by gel electrophoresis and visualized by autoradiography. The pattern of bands produced in this manner can be compared with the pattern generated by the same treatment of uncomplexed DNA, which has one band corresponding to hydrolysis of every phosphodiester bond and gives a pattern resembling a “ladder”. The results are displayed and compared on the next overhead:

[OH, footprinting gel]

The left panel shows where the labeled strand of the duplex·polymerase complex can be cut, and the resulting fragments of different lengths, which therefore migrate different distances on a gel. The DNA that is complexed with the polymerase is protected in the region of complex formation, and therefore bands corresponding to strand breaks in the complexed region are missing. In this manner, it has been determined that a region of ~60 bp is protected.

To put this recognition region into some physical context, the following convention has been applied. Using the initially transcribed RNA base as a reference point, called the start point, numbers are assigned to the DNA bases on the coding strand. The start point is assigned the number +1. Nucleotides in the 5′ direction are defined to be upstream of the start point and are numbered consecutively with negative numbers of increasing absolute value (-1, -2, -3, etc.), while nucleotides in the 3′ direction are defined as downstream and are assigned positive numbers of increasing value. Thus it has been determined by footprinting that the polymerase covers the +20 to -35 region of DNA.
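The numbering convention is easy to get wrong because there is no position 0: the start point is +1 and the base immediately upstream is -1. The following is a minimal Python sketch of the bookkeeping, not part of the original lecture. The function names and the 0-based `start_index` argument are hypothetical, chosen only for illustration.

```python
def promoter_coordinate(index, start_index):
    """Convert a 0-based index on the coding strand to start-point numbering.

    start_index is the 0-based index of the first transcribed base (+1).
    Upstream bases get -1, -2, ...; downstream bases get +2, +3, ...
    """
    offset = index - start_index
    return offset + 1 if offset >= 0 else offset


def index_from_coordinate(coord, start_index):
    """Inverse of promoter_coordinate; position 0 does not exist."""
    if coord == 0:
        raise ValueError("there is no position 0 in this convention")
    return start_index + coord - 1 if coord > 0 else start_index + coord


def span_length(upstream, downstream):
    """Count the nucleotides from a negative to a positive coordinate, inclusive."""
    if upstream >= 0 or downstream <= 0:
        raise ValueError("expected upstream < 0 < downstream")
    # |upstream| bases before the start point plus `downstream` bases from +1 on.
    return abs(upstream) + downstream


if __name__ == "__main__":
    start = 40  # hypothetical position of the start point in some sequence
    print(promoter_coordinate(39, start), promoter_coordinate(40, start))
    print(span_length(-35, 20))
```

With these definitions the -35 to +20 window contains 35 + 20 = 55 nucleotides, in line with the rough figure of ~60 bp quoted for the protected region.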
While molecular biologists initially expected that the entire protected area would be homologous in a large number of promoters, sequencing revealed that there was almost no homology between the protected regions, except for two hexameric regions, one centered 10 bp upstream of the start point (-10) and a second centered 35 bp upstream (-35). Both of these are called consensus sequences; that is, the hexamers found in individual promoters all correspond to a particular composition, with the exception of 1 or at most 2 bases.

[OH; cartoon of consensus regions of prokaryotic promoter]

The -10 sequence has the composition:

T80 A95 T45 A60 A50 T96

The -35 sequence has the composition:

T82 T84 G78 A65 C54 A45

The number after each base is the percentage of promoters in which that base occurs at that position; remember that with 4 bases, random occurrence would be 25%.

From the effects of mutations within the consensus sequences, it appears that the -35 sequence is a site of initial recognition and formation of a loose complex, while the -10 sequence is the site of formation of a tight complex, which leads to the open complex and initiation of transcription. The higher content of AT bases in the -10 sequence is consistent with this description, since A·T pairs are more easily separated than G·C pairs.

An additional feature that appeared to be conserved among the bacterial promoters was the 16-19 bp separation between the -10 and -35 consensus sequences. The significance of this conserved distance is that it corresponds to the distance between the regions of the polymerase that are responsible for establishing the required contacts with DNA. Another characteristic of the polymerase·DNA complex is that the contact points between the polymerase and DNA are primarily on one face of the double helix, meaning that one strand, which is the sense strand, or coding strand, has most of the contact points. The next overhead shows a phage polymerase complexed to a promoter region extending from -1 to -17.
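As an aside before turning to the structure: the consensus hexamers and the spacing rule just described can be turned into a rough search procedure. The sketch below is illustrative Python, not a method from the lecture. It takes the most frequent base at each position as the consensus (TTGACA for -35, TATAAT for -10), allows up to 2 mismatches per hexamer, and assumes the 16-19 bp separation means the gap between the end of the -35 hexamer and the start of the -10 hexamer. The function names and the demo sequence are made up for the example.

```python
MINUS_35 = "TTGACA"  # most frequent base at each -35 position
MINUS_10 = "TATAAT"  # most frequent base at each -10 position


def mismatches(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def find_promoter_candidates(seq, max_mismatch=2, gap_range=(16, 19)):
    """Return (i35, i10, gap, total_mismatches) for each candidate pair.

    i35 and i10 are 0-based start indices of the two hexamers on the
    coding strand; gap is the number of bases between them (an assumed
    reading of the 16-19 bp separation).
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 5):
        m35 = mismatches(seq[i:i + 6], MINUS_35)
        if m35 > max_mismatch:
            continue
        for gap in range(gap_range[0], gap_range[1] + 1):
            j = i + 6 + gap
            if j + 6 > len(seq):
                break  # larger gaps would run off the end as well
            m10 = mismatches(seq[j:j + 6], MINUS_10)
            if m10 <= max_mismatch:
                hits.append((i, j, gap, m35 + m10))
    return hits


if __name__ == "__main__":
    # Hypothetical test sequence: perfect hexamers separated by a 17-base spacer.
    demo = "CCGG" + MINUS_35 + "GC" * 8 + "G" + MINUS_10 + "GCGCCGGC"
    best = min(find_promoter_candidates(demo), key=lambda h: h[3])
    print(best)
```

A search this permissive will report many false positives on a real genome; it is meant only to make the "two hexamers at a fixed distance" idea concrete.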
A phage is a virus that infects prokaryotes, so its genome organization is not necessarily like that of bacteria. Apart from containing the region where the double helix has started melting, the promoter for the phage polymerase has no homology to the prokaryotic promoters we have just been describing. However, it is the only actual RNA polymerase·promoter complex structure I could find, and it illustrates the manner in which the polymerase grips the DNA. This structure will also be helpful because the organization of the polymerase like a hand, with a “thumb”, “fingers” and “palm”, is conserved in all polymerases, both DNA and RNA, prokaryotic and eukaryotic. This includes the specialized eukaryotic DNA by-pass polymerases, where the structural alterations that allow their specialized functions are described in these terms.

Just as initiation is a critical event in transcription, accurate termination is also necessary. In prokaryotes, two classes of termination features have been characterized. Terminators of the first class depend only on the sequence of DNA at the point of termination and are called intrinsic terminators. The RNA synthesized in the terminator region is palindromic: it can form a hairpin loop.

[OH; intrinsic terminator]

At the base of the hairpin there is a run of 6 Us at the 3′ end. It is hypothesized that the formation of a loop by the newly synthesized RNA causes the polymerase to stall, and during this time the U·dA pairings in the RNA·DNA hybrid region, which are the weakest of the possible pairings, allow the RNA strand to dissociate. It is important to note that the secondary structure of the synthesized RNA, and not that of the DNA, is responsible for termination at the intrinsic termination sites. The next slide shows some examples of intrinsic prokaryotic terminators.
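Before looking at those examples, the two features of an intrinsic terminator described above (a palindromic stretch that can fold into a hairpin, followed by a run of Us) can be sketched as a simple search over an RNA sequence. This is an illustrative Python sketch under stated assumptions, not an established terminator-prediction method: the stem length, the allowed loop sizes and the function names are hypothetical, only perfect Watson-Crick stems are accepted (no G·U pairs, no bulges), and no ∆G is computed.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Reverse complement of an RNA string; unknown bases map to 'N'."""
    return "".join(COMPLEMENT.get(b, "N") for b in reversed(rna))


def find_intrinsic_terminators(rna, stem_len=6, loop_range=(3, 8), u_run=6):
    """Return (stem_start, loop_len, u_start) for each hairpin + U-run hit.

    A hit is a stretch of stem_len bases, a loop of loop_range bases,
    the reverse complement of the first stretch, and then u_run Us.
    """
    rna = rna.upper()
    hits = []
    for i in range(len(rna) - stem_len + 1):
        left = rna[i:i + stem_len]
        for loop in range(loop_range[0], loop_range[1] + 1):
            r = i + stem_len + loop          # start of the right arm
            right = rna[r:r + stem_len]
            if len(right) < stem_len:
                break                        # ran off the 3' end
            tail = rna[r + stem_len:r + stem_len + u_run]
            if right == reverse_complement(left) and tail == "U" * u_run:
                hits.append((i, loop, r + stem_len))
    return hits


if __name__ == "__main__":
    # Hypothetical transcript: GC-rich stem, 4-base loop, then six Us.
    demo = "AAGC" + "GCCGCC" + "UUCG" + "GGCGGC" + "UUUUUU" + "AG"
    print(find_intrinsic_terminators(demo))
```

Note that the search is run on the RNA, not the DNA, which mirrors the point made above: it is the secondary structure of the transcript that matters.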
This table illustrates the importance of thermodynamics in understanding mechanism: the terminators differ in efficiency, and this is reflected in the strength of duplex formation as measured by ∆G, so the most stable loops (∆G < -20 kcal/mole) are the most efficient.

[OH with examples of intrinsic prokaryotic terminators, efficiencies and ∆G of formation of duplex region of hairpin]

The second type of terminator requires a protein called a rho factor, as well as energy input in the form of ATP, and is called a rho-dependent terminator. In vitro, in the absence of polymerase, the rho factor has a helicase activity associated with ATP consumption. This feature is presumably important in unwinding the hybrid RNA·DNA duplex to release the RNA. The rho-dependent terminators have sequences 50-90 bases in length upstream of the terminator which are rich in C and poor in G (only 14%) in the coding strand, as shown on the overhead. This feature is necessary, but the mechanism involved in termination is unknown. The current explanation of rho-dependent termination is that rho complexes with RNA at a specific recognition site near the terminator and moves along the strand, using ATP and moving faster than the polymerase. The polymerase slows down in the termination region, rho overtakes it and releases the RNA from the transcription bubble, as is illustrated in the overhead:

[OH; rho-dependent termination]

The cause of the slowing of the polymerase is not definitively explained, but it may result from the presence of C·dG pairs in the upstream sequence. Remember, the C·G pair is more strongly associated than the A·dT or U·dA pair. (In support of this explanation, the efficiency of termination appears to correlate with the length of the C-rich upstream sequences.)

At this point, we will just mention highlights of eukaryotic initiation and termination for the sake of symmetry, and return to initiation of eukaryotic transcription later.
For now, I will say that there are three eukaryotic RNA polymerases: Pol I for rRNA, Pol II for mRNA and Pol III for tRNA, and different mechanisms appear to be involved for each. The eukaryotic RNA polymerases are not to be confused with the prokaryotic DNA polymerases, which are also named by Roman numerals.

Termination in eukaryotes is more complex and not well defined. In the case of Pol I, rRNA is cleaved at a discrete site ~1 kb downstream from the 3′ end of the mature rRNA. For Pol III, termination seems to work like the prokaryotic intrinsic terminator. For Pol II, whose termination is the most interesting from our point of view since it involves mRNA, the event is even less clearly defined, but it appears to depend on recognition of a signal with the sequence AAUAAA in the RNA by a complex containing an RNA endonuclease, which generates the 3′ end by cleavage. This event is followed by polyadenylation. Meanwhile, Pol II can continue transcription, and it is not known what events result in dissociation of Pol II.

REGULATION OF TRANSCRIPTION

The paradigm for description of both initiation and regulation of transcription has been, and still is, the lac operon of E. coli. An operon and some other important terms relating to gene expression are defined in the left-hand panel on the next overhead.

[OH; trans-acting regulators]

(1) An operon is the coding region of structural genes plus the elements that control their expression.
(2) Genes are the elements of DNA that code for diffusible products, whether the products are used directly by the cell, for example rRNA and tRNA, or whether they are intermediates in other processes, like mRNA. Diffusible means exactly what it implies: the product acts only after diffusing from one site in the cell to another.
(3) Trans-acting elements code for diffusible regulatory products that act at sites distant from the sites of transcription.
(4) Control elements that act only on coding sequences directly downstream are said to be cis-acting. Cis-acting sites do not code for proteins, but are binding sites for factors (trans-acting products) that mediate expression of the DNA immediately downstream of the cis-acting site.
(5) Structural genes are any genes that code for proteins.
(6) Regulator genes code for products that are involved in regulating the expression of other genes.

The generalization can be made that regulatory proteins are trans-acting factors that recognize and bind to cis-acting elements. The recognition and binding of regulator gene products results in the activation or repression of the regulated regions. This is illustrated in the right-hand panel of the slide.

Top panel: the default mode is transcription. In the next panel, binding of a protein called a repressor at a site called an “operator” blocks transcription, and the gene is said to be “repressed”. In this situation the gene is defined to be under negative control.

Bottom panel: the default mode is no transcription. In the next panel, binding of a protein called a transcription factor in the promoter region is necessary before the polymerase associates with the promoter and initiates transcription. This situation is described as positive control.

Two important conventions in citing the names of genes and their products are: (1) the genes themselves are written in italicized lower case, and (2) the proteins that are the gene products are written in normal text with the first letter in upper case, and sometimes with the entire name in upper case. The lac operon is shown on the next slide:

[OH; lac operon]

The lac genes are structural, and code for proteins that allow E. coli to utilize β-galactosides as energy sources.
The next overhead is an example of a specific galactoside: lactose, a disaccharide composed of galactose and glucose.

[OH; galactoside structure from above]

β-Galactosides are galactose derivatives coupled through a β-glycosidic linkage. The lacZ gene product catalyzes the hydrolysis of the glycosidic linkage to give the sugar monomers. LacY is a membrane-bound protein that transports the β-galactosides from the external environment through the plasma membrane into the cell, and LacA is an acetylase, which transfers an acetyl group, CH3(C=O)-, to β-galactosides. The function of LacA does not seem to be crucial for the utilization of β-galactosides.