Biochemistry 401G Lecture 36 ANDRES

Lecture Summary: Introduce RNA polymers. Know the chemistry of these macromolecules and why different classes are present. Define a gene. Understand the nomenclature and the definition of a promoter (why are they necessary?). Understand the RNA polymerase reaction. How is it similar to, and different from, that of DNA polymerases? What is sigma factor? Know the consensus promoter sequence. Understand the process of transcription. How is this process terminated in prokaryotes?

Central Dogma of Molecular Biology:

How does the sequence of a strand of DNA correspond to the amino acid sequence of a protein? This concept is explained by the central dogma of molecular biology, which states that:

Flow of genetic information in normal cells:

                 Transcription              Translation
    DNA ---------------------> RNA -------------------> Protein
    (Replication: DNA -> DNA)

Why would an organism want to have an intermediate between DNA and the protein it encodes?

•DNA can stay pristine and protected, away from the caustic chemistry of the cytoplasm.
•Genetic information can be amplified by having many copies of RNA made from one copy of DNA.
•Regulation of gene expression can be effected by having specific controls at each element of the pathway between DNA and proteins. The more elements there are in the pathway, the more opportunities there are to control the process in response to different circumstances.

Transcription:

Information stored in the sequence of DNA is converted to RNA. Mechanistically, transcription is similar to DNA replication: both use nucleoside triphosphates and template-directed synthesis in the 5' to 3' direction.

2 major differences:
1) Only one DNA strand is transcribed (a single-stranded RNA chain is synthesized).
2) Only a small fraction of the total genetic potential of an organism is used in any one cell.
The reaction is thermodynamically favorable: hydrolysis of the terminal phosphoanhydride bond of a nucleoside triphosphate yields 13 kJ/mol more energy than is necessary for formation of a phosphodiester linkage within the RNA backbone (remember back to our discussion of DNA synthesis in earlier lectures).

Structural features of RNA:

1. Similar to DNA except that it contains a 2' hydroxyl group (which makes the phosphodiester bond more labile than in DNA).
2. Thymine in DNA is replaced by uracil in RNA.
3. RNAs can adopt regular three-dimensional structures that allow them to function in the process of genetic expression (i.e. the production of proteins). This ability to adopt defined three-dimensional structures that impart functionality places RNA in a unique class, somewhat akin to proteins and different from DNA. For example, certain RNA molecules, when folded, exhibit catalytic capacities (e.g. the cleavage of RNA molecules).

The majority of RNA in cells is found in complex with proteins. The most common example is the ribosome (involved in protein synthesis).

Classes of RNA:

1) Messenger RNA (mRNA): The carrier of genetic information on the primary structure of proteins from DNA, along with special features that allow it to attach to ribosomes and function in protein synthesis. Its size depends on the size of the protein for which it codes. It tends to be relatively short-lived, and its lifetime varies from one mRNA species to another (depending to a great extent on the biological role of the protein it encodes). 3% of total RNA in bacteria is mRNA.

2) Ribosomal RNA (rRNA): Forms the ribosome, the site of protein synthesis; one rRNA is the catalyst for formation of the peptide bond. The various rRNA species range in size from about 120 bases to 4700 bases. Eukaryotic and prokaryotic rRNAs are distinctly different. rRNA is long-lived (stable). 83% of total bacterial RNA.
3) Transfer RNA (tRNA): A small (65-110 nucleotide) molecule designed to carry activated amino acids to the site of protein synthesis, the ribosome. It is long-lived (stable). 14% of total bacterial RNA.

Define a new term: GENE

A gene is the entire nucleic acid sequence that is necessary for the synthesis of a functional RNA molecule (this includes mRNAs that would allow the production of active polypeptides). Genes can be transcribed into any of the classes of RNA that we discussed above. Thus, a gene contains additional sequence information beyond that which codes for the amino acids in a protein or the nucleotides in an RNA molecule. The gene also contains the DNA necessary to get a particular transcript made.

Terminology and numbering of Gene Sequences:

1). DNA is written in a 5' to 3' direction along its top (or coding) strand and 3' to 5' along the bottom (TEMPLATE or noncoding) strand.

5'---------------------------------------3' Coding strand
3'---------------------------------------5' Template strand

If this DNA sequence is capable of being transcribed to RNA, the sequence would be termed a "gene" and the RNA would be written as the 5' to 3' TOP or CODING strand sequence.

Coding Strand: Identical in sequence to the RNA transcript (with T in place of U).
Template Strand: Serves as the template for making the RNA transcript and is complementary to the RNA transcript.

2). Numbering system

Transcription Start Site: The nucleotide in the DNA coding strand corresponding to the first nucleotide of the transcribed RNA is numbered +1. Nucleotides to the right of the start site (+1), toward the 3' end of the coding strand, are indicated by increasing positive numbers (+2, +3, +4, +5, etc.). The nucleotide directly to the left of the +1 nucleotide (start site) is defined as -1, and the next are -2, -3, etc. There is no zero between -1 and +1.

3). Promoter Sequences: Each gene has sequences that are important for controlling its expression. These are termed "promoter sequences."
Promoter sequences are usually found at the 5' end of the gene, relative to the coding strand. In the numbering system, these promoter sequences have negative numbers.

Enzymology of RNA Synthesis/Transcription: RNA POLYMERASE

RNA polymerase can initiate the synthesis of a new nucleic acid strand given a template. This means that a primer is not necessary! A single RNA polymerase functions in bacteria. In eukaryotes, three distinct RNA polymerases are each responsible for synthesizing different classes of RNA.

DNA and RNA polymerases Catalyze Similar Reactions:

Vmax DNA pol III       500-1000 nucleotides/sec
Vmax RNA polymerase    50 nucleotides/sec

There are about 10 molecules of DNA polymerase per cell and 3000 molecules of RNA polymerase (~50% involved in making RNA at any one time). DNA replication is fast but initiates at a few sites, while RNA transcription is slow but occurs at many sites of initiation, so RNA accumulates to high levels.

RNA polymerase is highly processive (like DNA pol.), so once initiated, it will not dissociate until a specific termination signal is received. Another difference is that RNA polymerase is much less accurate.

RNA Polymerase is an Oligomeric Protein:

Five kinds of protein subunit make up RNA polymerase in bacteria: 2 copies of α and one each of β, β', σ, and ω. A separate function has been ascribed to the different subunits:

1. alpha (2 copies) -- initiation.
2. beta -- phosphodiester bond formation.
3. beta' -- binds the DNA template.
These four subunits are the core enzyme; they alone carry out transcription, but cannot initiate rapidly at specific sites.
4. sigma -- recognizes the promoter and provides binding specificity. The core enzyme plus sigma factor is called the holoenzyme.
5. omega -- unknown function.

The sigma subunit (σ) can be removed from the RNA polymerase core while leaving the rest of the complex intact.
Using these two complexes, scientists tested the binding affinity of the entire complex and of the core complex (lacking sigma) for general DNA and for "promoter" DNA (which contains the -10 and -35 consensus sequences, see below).

Kassoc values for:            Any DNA            Promoter DNA Sequence
RNA polymerase (- sigma)      1 x 10^10 M^-1     1 x 10^10 M^-1
RNA polymerase (+ sigma)      5 x 10^6 M^-1      2 x 10^11 M^-1

Sigma factor does two things:
1). Decreases the affinity of RNA polymerase for general DNA (by a factor of about 2000, i.e. between 3 and 4 orders of magnitude).
2). Increases the affinity of RNA polymerase for promoter DNA sites (by a factor of 20, i.e. slightly more than 1 order of magnitude).

The function of sigma is to interact with the -10 and -35 consensus sequences (the promoter region) so that RNA polymerase can find genes by finding their promoter regions, bind there, and initiate RNA synthesis.

STEPS OF TRANSCRIPTION:

1). Binding of RNA polymerase to Promoter Sequences:

In E. coli there are two regions that are similar in all promoters. One sequence is centered at -10 and the other at -35 relative to the transcriptional start site at +1. The -10 and -35 sequences are used to identify the location of genes. They are called "Consensus Sequences". A consensus sequence is an idealized sequence of bases whose real counterparts appear in various places in a polynucleotide and perform the same function in each, but with minor deviations of the real sequence from the ideal.

For the -10 region (or Pribnow box) the consensus sequence is 5' TATAAT 3', often called the "TATA" box for this reason.
For the -35 region the consensus sequence is 5' TTGACA 3'.

The nucleotide at the transcriptional start site is almost ALWAYS A PURINE (A or G), most often an adenine.

Promoter recognition is a critical step in transcription because it is the rate-limiting step.
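The idea of a real promoter deviating from an idealized consensus can be made concrete with a small sketch. The example below simply counts how many positions of a -35 and a -10 hexamer agree with the consensus sequences quoted above. The function names, the 0-12 score, and the example hexamers are illustrative assumptions, not part of the lecture; real promoter strength also depends on factors (such as the spacing between the two regions) that this sketch ignores.

```python
# Consensus hexamers quoted in the notes (5' to 3' on the coding strand).
MINUS_35_CONSENSUS = "TTGACA"
MINUS_10_CONSENSUS = "TATAAT"


def consensus_matches(site: str, consensus: str) -> int:
    """Count the positions at which a real site agrees with the consensus."""
    if len(site) != len(consensus):
        raise ValueError("site and consensus must have the same length")
    return sum(a == b for a, b in zip(site.upper(), consensus))


def promoter_score(minus_35: str, minus_10: str) -> int:
    """Hypothetical score: total matches over both hexamers (0 to 12)."""
    return (consensus_matches(minus_35, MINUS_35_CONSENSUS)
            + consensus_matches(minus_10, MINUS_10_CONSENSUS))


if __name__ == "__main__":
    # Made-up hexamers for illustration, not taken from any real gene.
    print(promoter_score("TTGACA", "TATAAT"))  # perfect consensus: 12
    print(promoter_score("TTTACA", "TATGTT"))  # three mismatches: 9
```

Under this simplified view, a higher score corresponds to a promoter that resembles the consensus more closely.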
Because the same protein complex in bacteria transcribes all genes, differences in promoter structure are largely responsible for differences in the frequency of initiation (from as often as once per 10 sec to as rarely as once per generation [30-60 min]). The notion of a consensus sequence represents relative (as opposed to absolute) specificity for a nucleotide sequence. The more closely a real promoter (-10, -35 region) resembles the consensus, the better it performs as a promoter (the more often it is recognized by the sigma factor-containing complex). Therefore, in prokaryotes, the more closely the promoter region of a given gene resembles the perfect consensus sequence, the more often the gene will be transcribed.

How does the RNA polymerase find a promoter in DNA?

RNA polymerase binds to DNA at random sites and moves quickly along the DNA while the sigma factor scans for promoter regions. Once a promoter is located, the sigma subunit binds to the promoter sequences with high affinity and prevents the polymerase from scanning any further.

Why use a scanning mechanism? Because it is much faster than a random association/dissociation search, which is diffusion controlled and therefore a second-order reaction (maximum rate 10^8 M^-1 s^-1). The scanning scheme is essentially first order and has a rate constant of 10^10 M^-1 s^-1. This is two orders of magnitude faster than a bind/release search.

2). Initiation of Transcription.

A). RNA polymerase associates with promoter sequences near the +1 transcription start site. This is called a "CLOSED PROMOTER COMPLEX" because the DNA at the transcription start site is still double stranded.

B). The RNA polymerase complex then unwinds the DNA at the transcription start site to make it single-stranded. This complex is termed the "Open Complex" because the DNA is single stranded within the RNA polymerase active site. 17 base-pairs of DNA are unwound, forming a "Transcription Bubble". RNA polymerase now starts to synthesize the RNA transcript.
RNA polymerase has two binding sites for ribonucleoside triphosphates. The FIRST is used during elongation and binds all 4 common ribonucleoside triphosphates with a half-saturating concentration of 10 µM. The SECOND, used only during initiation, binds ATP and GTP preferentially, at 100 µM. Thus, most RNA molecules have a purine at their 5' end. The binding of a purine at this site is a critical difference between DNA and RNA polymerases: binding an initiating nucleotide allows the RNA polymerase complex to begin chain synthesis without a primer.

Chain growth begins with binding of the template-specified rNTP at the initiation site, followed by binding of the next nucleotide at the elongation site. Next, nucleophilic attack by the 3' hydroxyl of the first nucleotide on the α (inner) phosphorus of the second nucleotide generates the first phosphodiester bond and leaves an intact triphosphate at the 5' position of the first nucleotide.

RNA polymerase moves in the 5' to 3' direction (relative to the coding strand) and continues synthesizing RNA off the DNA template strand. The "Transcription Bubble" moves down the DNA helix in concert with the new synthesis. Within the "Bubble" only 12 nucleotides of the DNA template strand are base-paired with the RNA strand at any time. This is called the "RNA:DNA hybrid". As each new ribonucleotide is incorporated, one base-pair of the RNA:DNA hybrid at the other end of the transcription bubble has to dissociate.

3). Termination of Transcription.

RNA transcripts are not infinitely long. There are two ways in which termination of transcription is known to occur in prokaryotes.

First, let's talk about pausing: RNA polymerase can pause during transcription. Pausing occurs at sequences rich in G/C base-pairs, because it is difficult to disrupt stable G/C base-pairs to allow formation of the transcription bubble and to release the RNA:DNA hybrid. Pausing can last from 10 seconds to 30 minutes.
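The relationship between the coding strand, the template strand, and the transcript, together with the +1/-1 numbering convention introduced earlier, can be sketched in a few lines. This is a minimal illustration under stated assumptions: the function names and the example sequence are made up for this sketch, and the sequence is treated as starting exactly at the transcription start site.

```python
DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
# Template DNA base -> RNA base that pairs with it (U replaces T in RNA).
RNA_PAIRING = {"A": "U", "T": "A", "G": "C", "C": "G"}


def template_strand(coding_5to3: str) -> str:
    """Template strand written 3' to 5', aligned under the coding strand."""
    return "".join(DNA_COMPLEMENT[b] for b in coding_5to3.upper())


def transcribe(template_3to5: str) -> str:
    """RNA written 5' to 3', built by pairing with the template read 3' to 5'."""
    return "".join(RNA_PAIRING[b] for b in template_3to5.upper())


def position_label(index: int, start_index: int) -> str:
    """Label a 0-based coding-strand index relative to the start site.

    The start site itself is +1 and the base before it is -1; no zero exists.
    """
    offset = index - start_index
    return f"+{offset + 1}" if offset >= 0 else str(offset)


if __name__ == "__main__":
    coding = "ATGGCA"                    # made-up coding strand, 5' to 3'
    template = template_strand(coding)   # "TACCGT", written 3' to 5'
    rna = transcribe(template)           # "AUGGCA", written 5' to 3'
    print(template, rna)
    print(rna == coding.replace("T", "U"))  # transcript matches coding strand
    print(position_label(4, 5), position_label(5, 5))  # -1 +1
```

The check at the end reflects the rule stated in the notes: the transcript has the same sequence as the coding strand, with U in place of T.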
Two Major Mechanisms of Transcription Termination: Simple and Rho-dependent.

1). SIMPLE (Rho-independent): These termination sites share two structural features:

A). Two symmetrical G/C-rich sequences that, in the transcript, have the potential to form a stem-loop structure.
B). A downstream run of four to eight A residues.

RNA polymerase pauses at the first G/C-rich region. This allows the second G/C-rich region of the RNA transcript to base-pair with the first region, forming an RNA:RNA stem-loop duplex and eliminating some of the base-pairing between the DNA template and the RNA transcript. Further weakening, leading to dissociation, occurs when the A-rich region is transcribed to give a series of very weak A-U base-pairs.

2). Rho Mediated: Factor-dependent termination is rarer. The Rho protein is necessary for the termination of these genes. 3 steps:

1). The RNA polymerase complex pauses.
2). Rho protein recognizes and binds to a specific RNA sequence in the nascent RNA transcript.
3). The Rho protein terminates transcription in an ATP-dependent process by migrating toward the 3' end of the RNA transcript, displacing the RNA polymerase and disrupting the RNA:DNA hybrid.

Differences in RNA transcription between eukaryotes and prokaryotes:

1). There is only one RNA polymerase in E. coli. There are three RNA polymerases in eukaryotes.

2). In eukaryotes, most promoters direct transcription of only one gene. In bacteria, several genes are often transcribed from a single promoter. As we will discuss, this type of transcriptional unit is called an "Operon".

             Gene A          Gene B                       Gene C
5'----------[--------]-----[---------------]----------[------------------------]----------3'
3'----------[--------]-----[---------------]----------[------------------------]----------5'

3). Eukaryotic RNA polymerases require additional protein factors (Transcription Factors) to bind to a promoter and initiate transcription.
We will discuss these proteins when we discuss eukaryotic gene expression.

4). Eukaryotic RNA polymerases must pass through nucleosomes (which are found on all chromatin) during transcription.

5). Eukaryotic RNA polymerases do not have terminator signals; rather, they proceed well past the coding region and into the 3' noncoding region of genes. Additional enzymes then process the RNA molecule extensively in a series of reactions that we will discuss (capping, splicing, editing).
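Returning to the simple (Rho-independent) prokaryotic terminator described above, its two features (a G/C-rich stem-loop followed by a run of weak A-U pairs, which appear as U residues in the transcript) can be searched for with a short sketch. Everything here is an illustrative assumption rather than a validated predictor: the function names, the fixed stem length, the loop-size range, the G/C threshold, and the made-up test sequence. Only perfectly paired stems are considered.

```python
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Sequence that would pair with `rna` to form a perfect stem."""
    return "".join(RNA_COMPLEMENT[b] for b in reversed(rna))


def gc_fraction(seq: str) -> float:
    """Fraction of G or C bases in a non-empty sequence."""
    return sum(b in "GC" for b in seq) / len(seq)


def find_simple_terminators(rna, stem_len=6, loop_range=(3, 8),
                            min_gc=0.6, min_u_run=4):
    """Return (stem_start, loop_len) for each G/C-rich stem-loop + U run."""
    hits = []
    shortest = 2 * stem_len + loop_range[0] + min_u_run
    for start in range(len(rna) - shortest + 1):
        left = rna[start:start + stem_len]
        if gc_fraction(left) < min_gc:
            continue  # first half of the stem must be G/C rich
        for loop_len in range(loop_range[0], loop_range[1] + 1):
            right_start = start + stem_len + loop_len
            right = rna[right_start:right_start + stem_len]
            tail_start = right_start + stem_len
            tail = rna[tail_start:tail_start + min_u_run]
            if len(tail) < min_u_run:
                break  # ran off the end of the transcript
            if right == reverse_complement(left) and tail == "U" * min_u_run:
                hits.append((start, loop_len))
    return hits


if __name__ == "__main__":
    # Made-up transcript: stem GCCGCC, loop UUCG, stem GGCGGC, then a U run.
    transcript = "GG" + "GCCGCC" + "UUCG" + "GGCGGC" + "UUUUUU" + "AA"
    print(find_simple_terminators(transcript))  # [(2, 4)]
```

Requiring an exact reverse complement keeps the sketch short; real stems tolerate mismatches and vary in length.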