In the name of God, the Most Gracious, the Most Merciful

Prepared by: Ali Ghanbari, M.Sc. student in Agricultural Biotechnology
Supervisor: Dr. Babaeian
March 2006

Genetic Code

DNA -> (transcription) -> mRNA -> (translation) -> protein
Genetic code: the means of converting a DNA sequence into a protein sequence.
The original question has always been how 4 nucleotide bases can specify 20 types of amino acids.

In the 1940s, Beadle and Tatum began studying the bread mold Neurospora and isolated mutants (i.e. strains of the mold with damaged genes) that could not grow on minimal nutrients but survived when complete, or rich, nutrients were provided.
Beadle and Tatum identified many mutants for various products: amino acids, vitamins, etc.

They already knew that synthesizing a particular product takes multiple steps, e.g. perhaps 5 genes for synthesizing isoleucine.
By substituting different intermediates into minimal growth conditions, they could infer which steps (enzymes) were defective.

A -W-> B -X-> C -Y-> D -Z-> E

Beadle and Tatum isolated various strains that required E for growth.
By adding A, B, C, or D to minimal media they could work out which step was defective.
For example, if a strain grows when C, D, or E is added to minimal media but NOT when A or B is added, then enzyme Y can convert C to D and enzyme Z can convert D to E. However, B cannot be made into C, so the defect is in enzyme X.

One gene, one enzyme hypothesis: each gene that they mutated coded for exactly one enzyme, so there is now a connection between mutations and enzyme function.
We now know this is slightly off: one gene codes for one protein, and some enzymes contain 2 or more proteins (e.g. F-type ATPases).
It goes even further: the protein a gene codes for does not have to have an enzymatic function, e.g. actin, hemoglobin.
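The supplementation logic described above can be written out as a small sketch. It assumes the simple linear pathway from the slide (A to E, enzymes W to Z) and a mutant with exactly one defective enzyme; the function names are made up for illustration.

```python
# Linear pathway from the slide: A -W-> B -X-> C -Y-> D -Z-> E
INTERMEDIATES = ["A", "B", "C", "D", "E"]
# ENZYMES[i] converts INTERMEDIATES[i] into INTERMEDIATES[i + 1]
ENZYMES = ["W", "X", "Y", "Z"]


def grows_with(supplement, defective_enzyme):
    """Predict whether a single-enzyme mutant grows on minimal media plus one supplement.

    The mutant grows only if the supplement enters the pathway after the blocked step.
    """
    blocked_step = ENZYMES.index(defective_enzyme)
    return INTERMEDIATES.index(supplement) > blocked_step


def infer_defect(rescuing_supplements):
    """Return the enzymes whose loss predicts exactly the observed rescuing supplements."""
    observed = set(rescuing_supplements)
    return [
        enzyme
        for enzyme in ENZYMES
        if {s for s in INTERMEDIATES if grows_with(s, enzyme)} == observed
    ]


# The slide's example: growth with C, D, or E, but not with A or B.
print(infer_defect(["C", "D", "E"]))  # ['X']
```

A strain rescued only by E would be assigned to enzyme Z by the same reasoning. A pattern that no single defect explains returns an empty list.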
None of these experiments, though, addressed the question of HOW a gene codes for a protein (note that at this point in the 1940s, DNA had only just been determined to be the genetic material).

4 nucleotides in DNA have to somehow code for 20 amino acids.
1 nucleotide per amino acid is clearly not sufficient: that gives only 4 amino acids.
2 nucleotides is better, but not enough: 4^2 gives 16 amino acids.
3 nucleotides is the minimum: 4^3 gives 64 possible codes, enough for 20 amino acids.

In the early 1960s, Crick, Brenner and students used acridine dyes to generate mutants defective for various enzymes.
Acridine dyes are mutagens (chemicals that cause mutations) that cause additions or deletions of single base pairs of DNA.
Additional acridine dye treatments could sometimes restore enzyme function: the changes were additive, and multiple changes could give an active enzyme.
Frameshift mutation: a change in DNA sequence that shifts how the nucleotide 'letters' are grouped into the amino acid 'words' of a protein.

Crick and Brenner showed that '+' mutants were cancelled by '-' mutants.
Two '+' or two '-' mutants did not cancel.
Three '+' or three '-' mutants WERE able to cancel each other out, just like a '+' and a '-'.
This suggested a 'triplet' code: 3 nucleotides per amino acid.
A '+' frameshift and a nearby '-' frameshift give mostly normal enzymes.
Two '+' or two '-' mutations could not give a readable message.
Three '+' or three '-' mutations near each other add (+) or remove (-) one amino acid, change a few others, and leave the rest of the protein unchanged.
Crick and Brenner saw these reversions (returns to normal) frequently, and they knew there were 64 possible 3-nucleotide codes to make 20 amino acids.

Example (sequences written as mRNA):
AUG GUC AAU AAA CCG...
met val asn lys pro          normal protein sequence
AUG UGU CAA UAA ACC G...
met cys gln stop (ochre)     one + mutation
AUG UUG UCA AUA AAC CG...
met leu ser ile asn          two (++) mutations
AUG UUU GUC AAU AAA CCG...
met phe val asn lys pro      three (+++) mutations
Note the sequence similarity to the normal protein.

Degenerate code: one amino acid can be coded for by more than one triplet, like synonyms (two 'words' meaning the same thing).
Note that these arguments mean that the code is non-overlapping.
An overlapping code would have nucleotides 1-3 coding for the first amino acid, nucleotides 2-4 coding for the second amino acid, etc.
In an overlapping code, the '+' or '-' mutants could only change a few amino acids; all the others would be unaffected.
There are a few cases (usually viruses) that have overlapping genes, i.e. genes that use different reading frames of the same nucleotides; these almost always use opposite strands of DNA.

Nirenberg and Matthaei developed a biochemical system outside of cells (a cell-free system) to study protein synthesis.
In their system, adding RNA led to more protein being made.
They used an enzyme called polynucleotide phosphorylase (PNP) to make RNA composed of only 1 type of base: G, C, A, or U (not T!).
UDP -PNP-> poly(U), ADP -PNP-> poly(A), etc.
With poly(U) added to their cell-free system, they saw more phenylalanine incorporated into proteins, and reasoned that UUU codes for phenylalanine.
They showed that AAA codes for lysine, CCC for proline, and GGG for glycine.

Note that this brings up an issue: DNA is double stranded, e.g.
GACGTCTAG
CTGCAGATC
One strand serves as the template: the strand that is used to direct the synthesis of the RNA.
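Looking back at the Crick and Brenner frameshift table, the triplet argument can be checked with a short sketch. It inserts one, two, or three U's after the start codon of the slide's normal sequence and translates each result. The compact standard codon table and the function names are additions for illustration, and amino acids are printed as one-letter codes (M = met, V = val, N = asn, K = lys, P = pro, C = cys, Q = gln, L = leu, S = ser, I = ile, F = phe).

```python
# Standard genetic code in compact form: bases in the order U, C, A, G,
# with "*" marking the three stop codons.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(mrna):
    """Read an mRNA three bases at a time from its first base, stopping at a stop codon."""
    protein = []
    usable_length = len(mrna) - len(mrna) % 3  # ignore an incomplete trailing codon
    for pos in range(0, usable_length, 3):
        amino_acid = CODON_TABLE[mrna[pos:pos + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def insert_bases(mrna, count, position=3, base="U"):
    """Model `count` '+' mutations as extra bases inserted at one position."""
    return mrna[:position] + base * count + mrna[position:]


NORMAL = "AUGGUCAAUAAACCG"  # the slide's normal sequence, without spaces
for n in range(4):
    print(n, translate(insert_bases(NORMAL, n)))
# 0 MVNKP
# 1 MCQ
# 2 MLSIN
# 3 MFVNKP
```

One or two insertions scramble everything downstream (and one of them runs into a stop codon), while three insertions add a single amino acid and restore the original reading frame.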
If GACGTCTAG is the template DNA, it directs the synthesis of CUGCAGAUC using the complementary base pairs A:U and G:C, the same rules as base pairing within DNA (with U in place of T).
Coding strand: the DNA strand that is most similar to the synthesized RNA (the same sequence, with T in place of U).

Codon: a 3-letter mRNA triplet 'read' by the protein synthesis machinery.
Bases are always read starting at the 5' phosphate end and moving toward the 3' end (the same order in which nucleotide chains are made).
5'-AUGUUUCGCAGA-3'   mRNA (like the coding strand)
3'-TACAAAGCGTCT-5'   DNA template strand
H. Gobind Khorana, instead of using polynucleotide phosphorylase, synthesized RNAs with precise sequences.
By arranging the bases in various possible orders, researchers could identify all the codons.

Several codons in the genetic code are special:
AUG (represented in DNA by ATG), coding for methionine, is the initiator codon: it starts the process of protein synthesis.
The 3 termination codons (UAA, UAG, UGA), or stop codons, are 3-base codes that end protein synthesis.
The genetic code is unambiguous: a 3-base codon always specifies the same amino acid.
Wobble base: the third base in a codon can often be changed without changing the protein sequence.
Almost all organisms use the exact same code.
There are a few exceptions that incorporate a non-standard amino acid or use a special transfer RNA (tRNA) that changes the meaning of a codon.

Transcription in Prokaryotes

RNA polymerase: the enzyme that synthesizes mRNA from the DNA template strand using G, C, A, and U (uracil) as the bases.
The core enzyme of RNA polymerase is a tetramer with 2 α subunits and 2 β subunits (β and β').
Holoenzyme: core RNA polymerase plus the sigma factor (σ).
The sigma factor recognizes sequences of DNA that precede coding DNA.
Promoter: a regulatory sequence of DNA before the coding region of a gene.
Promoters are extremely important for regulating which genes are turned on; they are relatively simple in prokaryotes (discussed more in Chapter 23).
Different sigma factors recognize different
promoters, which allows bacteria to turn on particular genes only when they are needed!

RNA polymerase (T7 virus)
[Figure: structure of T7 RNA polymerase, with the single stranded and double stranded DNA labeled]
Note the nice little hole for the single stranded DNA to slide through.

4 steps of transcription: Binding, Initiation, Elongation, and Termination.
Transcription unit: the segment of DNA that gives rise to an RNA molecule.
1) Binding: the RNA polymerase core enzyme, together with the σ factor that recognizes the promoter, binds to the DNA at that site.
Binding initiates unwinding of the DNA double helix.
Upstream: DNA 5' of the start of RNA transcription (i.e. it does NOT get included in the RNA chain; it usually contains the promoter region).
Downstream: DNA 3' of the start of RNA transcription (included in the RNA).
Promoter binding unwinds 15-18 bp of the DNA near where transcription begins.

DNA footprinting

Promoters were originally identified by DNA footprinting.
DNA footprinting: a general technique for identifying sites on the DNA that are bound by proteins.
If a protein is bound to the DNA, a chemical or enzyme that breaks phosphodiester bonds cannot reach the portion of DNA bound to the protein: that region is protected.
You then randomly fragment the DNA region (having isolated it earlier along with the protein of interest) and separate the pieces by electrophoresis (remember: phosphate is negatively charged, so DNA will move in an electric field!).
Regions of DNA with less fragmentation have proteins bound to them.
[Figure: DNA footprinting]

Transcription in Prokaryotes

2) Initiation: once the σ factor is bound at the promoter and the DNA is unwound, RNA polymerase initiates transcription.
NTPs (i.e.
ATP, CTP, GTP, or UTP) hydrogen bond to the template strand of the DNA in the first 2 positions.
RNA polymerase catalyzes formation of a phosphodiester bond between the first 2 nucleotides, joining the 3' hydroxyl of the first nucleotide to the 5' phosphate of the second.
This generates a phosphodiester bond and inorganic pyrophosphate (PPi).

RNA polymerase always builds the RNA starting at the 5' end and moving toward the 3' end, i.e. new nucleotides are added to the free 3' hydroxyl group of ribose.
PPi is lost from the newly added NTP.
The polymerase moves along, forming phosphodiester bonds as NTPs bind.
After about 9 nucleotides, the σ factor detaches from the RNA polymerase: initiation is complete.
3) Elongation: RNA polymerase moves happily along the DNA.
The RNA grows 5' to 3': NTPs are added at the 3' OH, giving off PPi.
DNA is unwound as the polymerase moves forward and winds back up after it passes.
RNA doesn't form double helices with the DNA as well (about 12 bp only), so the RNA strand grows and exists on its own.

4) Termination: RNA polymerase stops adding bases.
Termination signal: a sequence of DNA that makes RNA polymerase halt.
There are 2 types of termination signals.
The first is a GC-rich region followed by several U's: the GC-rich region is complementary to itself and forms a hairpin.
Hairpin: a nucleic acid structure that can base pair with itself.
The second depends on the rho (ρ) factor: a protein that binds to a specific 50-90 nucleotide sequence of RNA.
Rho binding unwinds the RNA from the DNA template, essentially pulling it away from the DNA, and causes the RNA and polymerase to 'fall off' the DNA.
Once the RNA polymerase core enzyme falls off the DNA, it can bind to a new sigma factor and start the process again (and again, and again!)
at the same or different promoters.
Note that RNA polymerase is an ENZYME: it isn't changed by making the phosphodiester bonds.

Transcription in Eukaryotes

Transcription follows the same 4 stages: binding, initiation, elongation, termination.
Because the organisms are more complex, so is transcription.
Instead of 1 RNA polymerase, there are now 3, each with different characteristics:
RNA polymerase I (RNApol I): makes ribosomal RNAs (rRNA).
RNApol II: synthesizes messenger RNA for protein coding; also makes small nuclear RNAs for mRNA processing; synthesizes the broadest variety of RNAs.
RNApol III: makes transfer RNA (tRNA) and other short RNAs.
All 3 are large multisubunit enzymes (8-10 subunits) homologous to the prokaryotic ones.

3 different classes of polymerase, therefore 3 classes of promoters.
The RNApol I promoter has 2 parts:
Core promoter: the minimal set of DNA bases needed to start rRNA synthesis; it works, but is not very efficient.
Upstream control element (or upstream activator): lies upstream of the core promoter, binds different proteins, and increases transcription.
[Figure: RNA pol I on the DNA, producing rRNA]

The RNApol II promoter is the most complicated (because of the diversity of RNA it needs to make):
1) a short Initiator region (Inr) at the transcription start point
2) a TATA box (A-T rich region) about 25 bp upstream from the Inr
3) a TFIIB recognition element (BRE) immediately upstream from the TATA box
4) a downstream promoter element (DPE) about 30 bp downstream from the Inr
Not every promoter has to have all 4 elements: it must have either a TATA box or a DPE, but can have both.
Like RNApol I promoters, it has upstream control elements as well.
[Figure: RNA pol II core promoter, showing the BRE, TATA box, Inr, and DPE on the DNA, producing mRNA]
The core promoter (the four elements listed above) gives low levels of transcription.
Upstream elements regulate the level even further.
Nearby upstream elements are called proximal promoter elements.
More distant upstream elements are called enhancers or silencers.

The RNApol III promoter is entirely downstream of the transcription start.
It contains two 10 bp sequences: box A and either box B (for tRNA) or box C (for rRNA).
[Figure: RNA pol III on the DNA, with box A and box B or C downstream of the start, producing tRNA]

Transcription factor: a protein that regulates the transcription of genes.
General (basal) transcription factor: a protein REQUIRED for transcription; their names often start with TF, like TFIIB.
Just like with the σ factor in prokaryotes, proteins must bind promoters.
Next, other TF proteins recognize the proteins bound to promoters.
RNA polymerase recognizes the cluster of TFs and DNA binding proteins.
Notice the building up of a machine by protein-protein interactions!
This is called the pre-initiation complex: RNApol is bound, but not yet making RNA.

Once the pre-initiation complex is formed, 2 more transcription factors are needed:
TFIIE binds and causes RNA polymerase to be phosphorylated.
TFIIH binds the polymerase and acts as a helicase: it unwinds the DNA so that the phosphorylated RNA polymerase can make RNA.
Elongation is very similar: RNA polymerase uses A:U, T:A, G:C, C:G base pairing to make the RNA chain from NTPs, giving off PPi.
Each new bond joins the 3' OH of the growing message to the 5' phosphate of the incoming NTP.
One additional complication: eukaryotic RNA polymerase has to work with proteins that unwind nucleosomes (bacteria don't have nucleosomes).
Eukaryotic polymerases have 8-10 subunits, bacterial ones only 4; some of these subunits recruit proteins to unwind nucleosomes.

Termination is usually caused by recognition of one of several sequences in the DNA; different polymerases recognize different termination sequences.
For example, RNApol I stops when a protein binds a particular 18 bp sequence in the DNA, RNApol III stops when it encounters a run of 6-8 uracils, etc.
Unlike prokaryotic polymerases, eukaryotic polymerases don't seem to stop at hairpins.

RNA Processing

A newly made RNA molecule is called a primary transcript: it is copied directly from the DNA.
Before it can serve its eventual function, RNA must be processed.
RNA processing includes cleavage at specific locations, chemical modification of some nucleotides, addition of nucleotides, etc.
Modifications usually depend on the RNA's eventual function, e.g. transfer RNAs have different modifications than mRNAs.
Just like in transcription, eukaryotes have more complex RNA processing.

Ribosomal RNA Processing

70-80% of the total RNA in a cell is ribosomal RNA (rRNA).
There are 4 different rRNAs in eukaryotes, distinguished by their sedimentation coefficients (only 3 in prokaryotes).
3 of the rRNAs are made by RNApol I as a single primary transcript.
RNApol I is active in the nucleolus, the large dense spot in the nucleus.
Transcribed spacers are the parts of the primary transcript that separate the rRNAs.
The genome contains multiple copies of the rRNA primary transcription unit: the cell needs to make a lot of rRNA!

Transcribed spacers are cut out and then degraded.
Methyl groups are also added to ribose hydroxyls and to some bases.
snoRNAs (small nucleolar RNAs): RNAs that bind to particular complementary regions of rRNAs and also bind to the proteins that methylate the rRNAs (note the use of complementary base pairing to direct these modifications!).
Methylation of the rRNA reduces its degradation: enzyme active sites don't recognize it because it no longer has the free hydroxyl groups.
Just as ATP provides the phosphate group for phosphorylation reactions, S-adenosyl methionine provides the methyl group.
Note the fusion of adenosine (a nucleoside) and methionine (an amino acid).

Nearly HALF of the rRNA primary transcript is transcribed spacer that gets deleted.
As the rRNA gets processed, it associates with various proteins and eventually becomes the large and small ribosomal subunits.
Ribosomes are therefore made in the nucleolus.
Ribosomes also include one rRNA transcribed by RNApol III.
The gene for this RNA, like that of the RNApol I transcript, is present in multiple copies arrayed in tandem: many copies in the same direction, one right after the other.
The RNApol III transcript has few if any modifications.
In genetics, we talk about the mechanisms by which these tandem arrays formed.

Transfer RNA Processing

Like the rRNA, tRNA requires extensive removal, addition and modification of nucleotides.
tRNA: RNA molecules that bind a particular amino acid at one end and recognize one of the 61 coding codons at the other.
These are the ESSENTIAL bridge between nucleic acids and proteins.
tRNAs are only about 70-90 nucleotides long and have several hairpin loops (held together by complementary base pairing) that form a cloverleaf structure; in 3D it is really more L shaped.
Like the rRNAs, tRNA is synthesized as a precursor, or pre-tRNA, and processed extensively.
All tRNAs have the sequence CCA at the 3' end: in some it is there naturally, in others it is added later.

Messenger RNA Processing

Prokaryotic mRNA needs little or no processing: it's ready to go.
Ribosomes can associate with prokaryotic mRNA even as it is being transcribed: there is no barrier between mRNA synthesis and translation.

Eukaryotes require
extensive processing of their mRNAs.
At the 5' end (i.e. the first synthesized part, the start of the transcript), they have a 5' cap.
The cap, 7-methylguanosine, is attached 'backwards': a 5' to 5' linkage to the initial triphosphate nucleotide.
It is added early, soon after transcription begins, and aids in stability and in positioning the transcript for translation.
It is NOT added by RNA polymerase!

At the 3' end, most mRNAs contain a 'polyA tail': 50-250 adenosines added by a specific enzyme, polyA polymerase.
A signal sequence in the mRNA directs first cleavage of the transcript, 10-35 nucleotides downstream of the signal, and then addition of the polyA tail.
The polyA tail helps to protect the mRNA from exonucleases and therefore increases its useful lifespan.
PolyA is recognized by transport proteins that send the mRNA out of the nucleus.
It can also be used by researchers in the lab to purify mRNA specifically, using a polyT oligonucleotide.

Introns: sequences in the primary mRNA transcript that do not appear in the mature mRNA.
Introns get cut out of the pre-mRNA and the mRNA gets ligated back together.
Exons: regions of the pre-mRNA or DNA sequence that remain in the mature mRNA and so can appear in proteins.
Introns are present in most protein coding genes; they are usually found nowadays by comparing the mRNA sequence to genomic DNA.
RNA splicing: the enzymatic process of removing the introns.
Spliceosome: the RNA-protein complex that carries out RNA splicing.

Most introns start with a 5' GU sequence and end with a 3' AG.
Introns must also contain an internal sequence called the branch point.
snRNPs: small protein + RNA complexes that make up the spliceosome.
snRNPs bind in 3 parts:
U1 binds to the 5' splice site
U2 binds to the branch point
U4/U6/U5 brings the ends of the intron together
A spliceosome contains 5 RNAs and 50+ proteins: as big as a ribosome!
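The basic bookkeeping of splicing (cut out each intron, join the exons) can be illustrated with a toy sketch. The sequences and function name below are invented for illustration, and the only rule checked is the GU...AG boundary mentioned above; a real spliceosome also depends on the branch point and the snRNPs, which this sketch ignores.

```python
def splice(pre_mrna, introns):
    """Remove introns, given as (start, end) index pairs with end exclusive, and join the exons.

    Raises ValueError if an intron does not begin with GU and end with AG.
    """
    exons = []
    cursor = 0
    for start, end in sorted(introns):
        intron = pre_mrna[start:end]
        if not (intron.startswith("GU") and intron.endswith("AG")):
            raise ValueError(f"intron {intron!r} does not follow the GU...AG rule")
        exons.append(pre_mrna[cursor:start])  # keep the exon before this intron
        cursor = end                          # skip over the intron
    exons.append(pre_mrna[cursor:])           # keep the final exon
    return "".join(exons)


# Hypothetical toy transcript: two exons separated by one intron.
EXON1 = "AUGGCC"
INTRON = "GUAAGUCUAACUUUCAG"
EXON2 = "UUUAAAUGA"
pre_mrna = EXON1 + INTRON + EXON2
intron_start = len(EXON1)

print(splice(pre_mrna, [(intron_start, intron_start + len(INTRON))]))
# AUGGCCUUUAAAUGA
```

Passing coordinates that do not line up with a GU...AG intron raises an error instead of silently producing a wrong message.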
[Figure: assembly of the U1, U2, U4, U5, and U6 snRNPs on an intron]

Once the snRNPs form a spliceosome, the 5' end of the intron is covalently joined to an adenine nucleotide in the branch point, in a structure called a lariat.
Once that intermediate is formed, the 3' end of the intron is cleaved, the 2 ends of the exons are joined together, and the lariat RNA is sent for degradation.
Splicing occurs during transcription: it doesn't require pre-processing.
We had mentioned ribozymes as RNA catalysts: the first ribozymes discovered were self-splicing intron sequences.

Some introns are NOT degraded after excision: some are involved in rRNA methylation, and others can regulate mRNA translation by complementary binding to similar sequences (in mRNAs).
For other proteins, introns may be left in or taken out.
Alternative splicing: the decision to leave in or take out an intron; it gives one gene the ability to make a number of related proteins using different combinations of potential introns.
Starting with one pre-mRNA, alternative splicing could yield several different mature mRNAs.
[Figure: one pre-mRNA and several alternatively spliced products]
Each alternatively spliced transcript would code for proteins that have overlapping regions but may have different functions or locations.

mRNA Metabolism

Most mRNAs aren't around for long: they are degraded very rapidly.
This is hated by molecular biologists: mRNA is hard to work with.
Half-life: the length of time it takes for 1/2 of the mRNAs to be degraded.
mRNA instability allows the cell to regulate gene expression.
mRNA also amplifies a DNA sequence: one gene can make many mRNAs, each making many proteins.
This allows the cell to control protein levels by controlling how many mRNAs are made from that gene.
Some promoters are strong, others weak, others only active sometimes.

RNA Viruses

RNA viruses use RNA as their primary genetic material.
HIV is one example. Retroviruses such as HIV have a very special enzyme called reverse transcriptase, which can make a DNA copy from the RNA.
The DNA copy can then integrate into the host genome, where it is transcribed into RNA that codes for the viral proteins and serves as the genetic material for new virus particles.
Molecular biologists use reverse transcriptase to make 'copy' DNA, or cDNA (DNA made from messenger RNA).
This allows scientists to study exactly which mRNA transcripts get made without having to understand what's happening with all the splicing, regulatory DNA, repetitive DNA, etc. in the genome.

Reverse transcriptase is different from cellular enzymes: it uses RNA to make DNA.
Because it has a different active site, some nucleotide analogs can be used to inhibit reverse transcriptase at its active site (competitive substrate analogs), and these are among the most effective anti-AIDS drugs.
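As a rough illustration of what making cDNA means at the sequence level, the sketch below builds the DNA strand complementary to an mRNA. It only models base pairing (A:T, U:A, G:C, C:G) and strand direction, not the enzyme itself; the function name is made up for illustration.

```python
# RNA base -> complementary DNA base
RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}


def reverse_transcribe(mrna):
    """Return the first-strand cDNA, written 5'->3', for an mRNA written 5'->3'.

    The two strands are antiparallel, so the mRNA is read backwards
    while each base is replaced by its DNA complement.
    """
    return "".join(RNA_TO_DNA[base] for base in reversed(mrna))


mrna = "AUGUUUCGCAGA"  # the mRNA example from the codon slide
print(reverse_transcribe(mrna))
# TCTGCGAAACAT
```

Read 3' to 5', this cDNA is TACAAAGCGTCT, the same sequence as the DNA template strand shown with that mRNA on the codon slide.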