11/4/05 Protein Structure & Function
11/04/05 D Dobbs ISU - BCB 444/544X: Protein Structure & Function

Announcements
Exam 2
- Has been graded
- Will be returned at end of class today
Grade statistics:
  444 Average = 81/100
  544 Average = 100/118
Questions?

Announcements
BCB 544 Projects - Important Dates:
  Nov 2 Wed noon - Project proposals due to David/Drena
  Nov 4 Fri PM - Approvals/responses & tentative presentation schedule to students
  Dec 2 Fri noon - Written project reports due
  Dec 5,7,8,9 class/lab - Oral Presentations (20')
  (Dec 15 Thurs = Final Exam)

Bioinformatics Seminars
Nov 4 Fri 12:10 PM BCB Faculty Seminar in E164 Lago
  How to do sequence alignments on parallel computers
  Srinivas Aluru, ECpE & Chair, BCB Program
  http://www.bcb.iastate.edu/courses/BCB691F2005.html
Next week: Nov 10 Thurs 3:40 PM ComS Seminar in 223 Atanasoff
  Computational Epidemiology
  Armin R. Mikler, Univ. North Texas
  http://www.cs.iastate.edu/~colloq/#t3

Bioinformatics Seminars
CORRECTION: Week after next - Baker Center/BCB Seminars (seminar abstracts available at above link):
Nov 14 Mon 1:10 PM Doug Brutlag, Stanford
  Discovering transcription factor binding sites
Nov 15 Tues 1:10 PM Ilya Vakser, Univ Kansas
  Modeling protein-protein interactions
Both seminars will be in Howe Hall Auditorium.

RNA Structure & Function/Prediction; Protein Structure & Function
Mon: Review - promoter prediction
Wed: RNA structure & function; RNA structure prediction (2° & 3° structure prediction); miRNA & target prediction - Lab 10
Fri: a few more words re: Algorithms; Protein structure & function

Reading Assignment (for Fri/Mon)
Mount Bioinformatics
• Chp 10 Protein classification & structure prediction
  http://www.bioinformaticsonline.org/ch/ch10/index.html
• pp. 409-491
• Check errata: http://www.bioinformaticsonline.org/help/errata2.html
Other? That should be plenty…

Review last lecture: RNA Structure Prediction

miRNA and RNAi pathways
[Figure: pathway diagram, C Burge 2005]
microRNA pathway: MicroRNA primary transcript → Drosha → precursor → Dicer → miRNA → RISC → target mRNA: "translational repression" and/or mRNA degradation
RNAi pathway: Exogenous dsRNA, transposon, etc. → Dicer → siRNAs → RISC → mRNA cleavage, degradation

miRNA Challenges for Computational Biology
• Find the genes encoding microRNAs
• Predict their regulatory targets
  (together: Computational Prediction of MicroRNA Genes & Targets)
• Integrate miRNAs into gene regulatory pathways & networks
Need to modify the traditional paradigm of "transcriptional control" primarily by protein-DNA interactions to include miRNA regulatory mechanisms!
C Burge 2005

RNA structure prediction strategies
Secondary structure prediction
1) Energy minimization (thermodynamics)
2) Comparative sequence analysis (co-variation)
3) Combined experimental & computational

Secondary structure prediction strategies
1) Energy minimization (thermodynamics)
• Algorithm: Dynamic programming to find high probability pairs (also, some genetic algorithms)
• Software:
  Mfold - Zuker
  Vienna RNA Package - Hofacker
  RNAstructure - Mathews
  Sfold - Ding & Lawrence
R Knight 2005

Secondary structure prediction strategies
2) Comparative sequence analysis (co-variation)
• Algorithms:
  Mutual information
  Stochastic context-free grammars
• Software:
  ConStruct
  Alifold
  Pfold
  FOLDALIGN
  Dynalign
R Knight 2005

Secondary structure prediction strategies
3) Combined experimental & computational
• Experiment: Map single-stranded vs double-stranded regions in folded RNA
• How?
  Enzymes: S1 nuclease, T1 RNase
  Chemicals: kethoxal, DMS
R Knight 2005

Experimental RNA structure determination?
• X-ray crystallography
• NMR spectroscopy
• Enzymatic/chemical mapping
• Molecular genetic analyses

1) Energy minimization method
What are the assumptions?
Native tertiary structure or "fold" of an RNA molecule is (one of) its "lowest" free energy configuration(s)
Gibbs free energy = ΔG in kcal/mol at 37 °C
  = equilibrium stability of structure
  lower values (negative) are more favorable
Is this assumption valid?
  in vivo? - this may not hold, but we don't really know

Free energy minimization
What are the rules?
  A=U stacked on A=U (strands AA / UU): ΔG = -1.2 kcal/mol
  A=U stacked on U=A (strands AU / UA): ΔG = -1.6 kcal/mol
What gives here? The same two A=U base pairs differ in stability depending on how they are stacked.
C Staben 2005

Energy minimization calculations: Base-stacking is critical
Stacked pairs (one strand / other strand) and ΔG in kcal/mol, as listed on the slide:
  AA / UU                              -1.2
  AU / UA or UA / AU                   -1.6
  AG, AC, CA, GA / UC, UG, GU, CU      -2.1
  CG / GC                              -3.0
  GC / CG                              -4.3
  CC / GG                              -4.8
  GU / UG                              -0.3
  XG, GX / YU, UY                       0
- Tinoco et al.
C Staben 2005

Nearest-neighbor parameters
Most methods for free energy minimization use nearest-neighbor parameters (derived from experiment) for predicting the stability of an RNA secondary structure (in terms of ΔG at 37 °C),
& most available software packages use the same set of parameters: Mathews, Sabina, Zuker & Turner, 1999

Energy minimization - calculations:
Total free energy of a specific conformation for a specific RNA molecule = sum of incremental energy terms for:
• helical stacking (sequence dependent)
• loop initiation
• unpaired stacking
(favorable "increments" are < 0)
Fig 6.3 Baxevanis & Ouellette 2005

But how many possible conformations for a single RNA molecule?
Huge number: Zuker estimates 1.8^N possible secondary structures for a sequence of N nucleotides
  for 100 nts (small RNA…) = 3 × 10^25 structures!
Solution? Not exhaustive enumeration…
Dynamic programming
  O(N^3) in time, O(N^2) in space/storage, iff pseudoknots excluded;
  otherwise: O(N^6) time, O(N^4) space

Algorithms based on energy minimization
For an outline of the algorithm used in Mfold, including a description of the dynamic programming recursion, please visit Michael Zuker's lecture:
http://www.bioinfo.rpi.edu/~zukerm/lectures/RNAfold-html
From this site, you may also download his lecture as either a PDF or PS file.
Hmmm, something based on this might make an interesting "Final Exam" question: how could one apply the dynamic programming approaches learned in the first half of the course to the RNA structure prediction problem?

2) Comparative sequence analysis (co-variation)
Two basic approaches:
• Algorithms constrained by initial alignment
  Much faster, but not as robust as unconstrained
  Base-pairing probabilities determined by a partition function
• Algorithms not constrained by initial alignment
  Genetic algorithms often used for finding an alignment & set of structures

RNA Secondary structure prediction: Performance?
How to evaluate?
• Not many experimentally determined structures; currently, ~50% are rRNA structures
so the "Gold Standard" (in absence of tertiary structure): compare the predicted RNA secondary structure with that determined by comparative sequence analysis (!!??), using Benchmark Datasets
NOTE: Base-pairs predicted by comparative sequence analysis for large & small subunit rRNAs are 97% accurate when compared with high resolution crystal structures!
- Gutell, Pace

RNA Secondary structure prediction: Performance?
1) Energy minimization (via dynamic programming)
  73% avg. prediction accuracy - single sequence
2) Comparative sequence analysis
  97% avg. prediction accuracy - multiple sequences (e.g., highly conserved rRNAs)
  much lower if sequence conservation is lower &/or fewer sequences are available for alignment
3) Combined - recent developments:
  combine thermodynamics & co-variation & experimental constraints? IMPROVED RESULTS

RNA structure prediction strategies
Tertiary structure prediction
Requires "craft" & significant user input & insight
1) Extensive comparative sequence analysis to predict tertiary contacts (co-variation)
  e.g., MANIP - Westhof
2) Use experimental data to constrain model building
  e.g., MC-Sym - Major
3) Homology modeling using sequence alignment & reference tertiary structure (not many of these!)
4) Low resolution molecular mechanics
  e.g., yammp - Harvey

New Today: Protein Structure & Function

Protein Structure & Function
Protein structure - primarily determined by sequence
Protein function - primarily determined by structure
• Globular proteins: compact hydrophobic core & hydrophilic surface
• Membrane proteins: special hydrophobic surfaces
• Folded proteins are only marginally stable
• Some proteins do not assume a stable "fold" until they bind to something = Intrinsically disordered
Predicting protein structure and function can be very hard -- & fun!
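
An aside before moving on, looking back at the "Final Exam" question in the RNA review: one way to see how dynamic programming applies to RNA folding is a Nussinov-style recursion. The sketch below is a simplified stand-in, not the Mfold algorithm: it maximizes the number of base pairs instead of minimizing ΔG with nearest-neighbor parameters, and it excludes pseudoknots. The function names (`estimate_structure_count`, `nussinov`, `traceback`) and the `min_loop` parameter are illustrative assumptions, not taken from any package named in the slides.

```python
# Minimal sketch: Nussinov-style base-pair maximization (NOT Mfold's energy model).
# Assumptions: Watson-Crick and G-U wobble pairs only, no pseudoknots,
# and at least `min_loop` unpaired bases inside every hairpin loop.

PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def estimate_structure_count(n: int) -> float:
    """Zuker's rough estimate of secondary structures for n nucleotides: 1.8^n."""
    return 1.8 ** n


def nussinov(seq: str, min_loop: int = 3) -> list:
    """Fill best[i][j] = max number of base pairs in seq[i..j].

    O(N^3) time and O(N^2) space, matching the complexity quoted in the review.
    """
    n = len(seq)
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]  # case 1: base j is unpaired
            for k in range(i, j - min_loop):  # case 2: base j pairs with base k
                if (seq[k], seq[j]) in PAIRS:
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    return best


def traceback(seq: str, best: list, min_loop: int = 3) -> str:
    """Recover one optimal structure in dot-bracket notation."""
    n = len(seq)
    structure = ["."] * n
    stack = [(0, n - 1)] if n else []
    while stack:
        i, j = stack.pop()
        if j - i <= min_loop:
            continue  # too short to contain a pair
        if best[i][j] == best[i][j - 1]:
            stack.append((i, j - 1))  # j left unpaired
            continue
        for k in range(i, j - min_loop):
            if (seq[k], seq[j]) in PAIRS:
                left = best[i][k - 1] if k > i else 0
                if best[i][j] == left + 1 + best[k + 1][j - 1]:
                    structure[k], structure[j] = "(", ")"
                    if k > i:
                        stack.append((i, k - 1))
                    stack.append((k + 1, j - 1))
                    break
    return "".join(structure)


if __name__ == "__main__":
    rna = "GGGAAAUCCC"  # toy hairpin sequence, chosen only for illustration
    table = nussinov(rna)
    print("max base pairs:", table[0][len(rna) - 1])
    print(rna)
    print(traceback(rna, table))
    print("rough structure count for 100 nt: %.1e" % estimate_structure_count(100))
```

Replacing the "+1 per pair" score with stacking and loop energy increments, and maximizing with minimizing, is roughly the step from this toy recursion toward an energy minimization method; the table-filling order stays the same.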

4 Basic Levels of Protein Structure

Primary & Secondary Structure
Primary
• Linear sequence of amino acids
• Description of covalent bonds linking aa's
Secondary
• Local spatial arrangement of amino acids
• Description of short-range non-covalent interactions
• Periodic structural patterns: α-helix, β-sheet

Tertiary & Quaternary Structure
Tertiary
• Overall 3-D "fold" of a single polypeptide chain
• Spatial arrangement of 2° structural elements; packing of these into compact "domains"
• Description of long-range non-covalent interactions (plus disulfide bonds)
Quaternary
• In proteins with > 1 polypeptide chain, spatial arrangement of subunits

"Additional" Structural Levels
• Super-secondary elements
• Motifs
• Domains
• Foldons