Find the optimal alignment?

Optimal Alignment
• Find the highest number of atoms aligned with the lowest RMSD (Root Mean Square Deviation)
• Find a balance between local regions with very good alignments and overall alignment

Geometric Matching task = Geometric Pattern Discovery

Structure Comparison Requirements
1. Which atom in structure A corresponds to which atom in structure B?
Answer: Sequence alignments

THESESENTENCESALIGN--NICELY
||| ||| || || ||||| ||||||
THE--SEQ-EN-CE-ALIGNEDNICELY

Structure Comparison Requirements
2. What are the locations of atoms in the structures?
Answer: PDB files (dihedral angles, bond lengths …)

Chain 1AI9:A
bond           total #  average  stddev  min at         max at
C-N            180      1.32     0.019   1.27 VAL 6     1.38 ASN 123
C-N (PRO)      11       1.33     0.019   1.29 PRO 68    1.36 PRO 160
C-O            192      1.25     0.022   1.19 ASN 124   1.33 GLN 165
CA-C           184      1.52     0.022   1.47 LEU 121   1.58 ILE 8
CA-C (GLY)     8        1.54     0.016   1.52 GLY 20    1.57 GLY 55
CA-CB          133      1.53     0.032   1.4 GLU 174    1.62 ASP 105
CA-CB (ALA)    7        1.53     0.019   1.5 ALA 93     1.56 ALA 16
CA-CB (I,T,V)  44       1.56     0.026   1.5 VAL 6      1.61 THR 147
N-CA           173      1.47     0.023   1.42 ASP 71    1.54 TRP 189
N-CA (GLY)     8        1.47     0.013   1.45 GLY 20    1.49 GLY 180
N-CA (PRO)     11       1.47     0.02    1.44 PRO 15    1.5 PRO 152
Source: http://www.rcsb.org/pdb

Structure Comparison Requirements
3. Methods to superimpose structures
Answer: Translation and Rotation

Translation by d along x:
(x1, y1, z1) -> (x1 + d, y1, z1)
(x2, y2, z2) -> (x2 + d, y2, z2)
(x3, y3, z3) -> (x3 + d, y3, z3)
[Figure panels: Translation; Rotation]

Transformations
• Translation: $x' = x + t$
• Translation and Rotation, Rigid Motion (Euclidean Trans.): $x' = Rx + t$
• Translation, Rotation + Scaling: $x' = s(Rx + t)$

Inexact Alignment
Simple case: two closely related proteins with the same number of amino acids.
[Figure: transformation T]
Question: how to measure an alignment error?
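The transformation classes listed above can be tried out on a few coordinates. The following is a minimal sketch, assuming NumPy and three made-up points in place of real atom positions; the function names (`rotation_z`, `rigid_motion`, `similarity`, `pairwise_distances`) are illustrative and do not come from the slides.

```python
import numpy as np

def rotation_z(theta):
    """Rotation matrix for an angle theta (radians) about the z axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0],
                     [s,  c, 0.0],
                     [0.0, 0.0, 1.0]])

def rigid_motion(points, R, t):
    """Apply x' = R x + t to every row of an (n, 3) coordinate array."""
    return points @ R.T + t

def similarity(points, R, t, s):
    """Apply x' = s (R x + t): a rigid motion followed by uniform scaling."""
    return s * rigid_motion(points, R, t)

def pairwise_distances(points):
    """Matrix of Euclidean distances between all pairs of points."""
    diff = points[:, None, :] - points[None, :, :]
    return np.linalg.norm(diff, axis=-1)

# Hypothetical coordinates (not taken from any PDB entry).
atoms = np.array([[0.0, 0.0, 0.0],
                  [1.5, 0.0, 0.0],
                  [1.5, 1.3, 0.0]])

d = 2.0
shift = np.array([d, 0.0, 0.0])

# Pure translation: every x coordinate grows by d.
translated = rigid_motion(atoms, np.eye(3), shift)

# Rotation by 90 degrees about z, then the same translation.
moved = rigid_motion(atoms, rotation_z(np.pi / 2), shift)

# Same motion followed by scaling with s = 2.
scaled = similarity(atoms, rotation_z(np.pi / 2), shift, 2.0)
```

A rigid motion leaves all interatomic distances unchanged, which is why it is the natural transformation class for superimposing two structures; adding the scale factor `s` multiplies every distance by `s`.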
Distance Functions
Two point sets: $A = \{a_i\},\ i = 1, \dots, n$ and $B = \{b_j\},\ j = 1, \dots, m$
• Pairwise correspondence: $(a_{k_1}, b_{t_1}), (a_{k_2}, b_{t_2}), \dots, (a_{k_N}, b_{t_N})$
(1) Exact matching: $\|a_{k_i} - b_{t_i}\| = 0$
(2) Bottleneck: $\max_i \|a_{k_i} - b_{t_i}\|$
(3) RMSD (Root Mean Square Distance): $\sqrt{\sum_i \|a_{k_i} - b_{t_i}\|^2 / N}$

Superposition: best least squares (RMSD, Root Mean Square Deviation)
Given two sets of 3-D points $P = \{p_i\}$, $Q = \{q_i\}$, $i = 1, \dots, n$:
$\mathrm{rmsd}(P, Q) = \sqrt{\sum_i |p_i - q_i|^2 / n}$
Find a 3-D rigid transformation $T^*$ such that:
$\mathrm{rmsd}(T^*(P), Q) = \min_T \sqrt{\sum_i |T(p_i) - q_i|^2 / n}$
A closed-form solution exists for this task. It can be computed in O(n) time.

RMSD
Unit of RMSD: e.g. Ångstroms
- identical structures => RMSD = 0
- similar structures => RMSD is small (1 to 3 Å)
- distant structures => RMSD > 3 Å

Pitfalls of RMSD
• all atoms are treated equally (e.g. residues on the surface have a higher degree of freedom than those in the core)
• best alignment does not always mean minimal RMSD
• significance of RMSD is size dependent

Correspondence is Unknown
Given two configurations of points in three-dimensional space, find those rotations and translations of one of the point sets which produce "large" superimpositions of corresponding 3-D points.
[Figure: transformation T]

Structure Alignment (Straightforward Algorithm)
• For each pair of triplets, one from each molecule, which define 'almost' congruent triangles, compute the rigid transformation that superimposes them.
• Count the number of point pairs which are 'almost' superimposed and sort the hypotheses by this number.

A 3-D reference frame can be uniquely defined by the ordered vertices of a nondegenerate triangle $p_1, p_2, p_3$.

Improvement: BLAST idea
Detect short similar fragments, then extend as much as possible.
[Figure: fragment extension around the pair $a_i$, $b_j$, with neighbors $a_{i-1}$, $a_{i+1}$ and $b_{j-1}$, $b_{j+1}$; fragment index range $k \dots k+l-1$]
Extend while: $\mathrm{rmsd}(F_{ij}(k)) < \varepsilon$.
Complexity: $O(n^2)$

Protein zinc finger (4znf)

Superimposed 3znf and 4znf
30 CA atoms: RMS = 0.70 Å
248 atoms: RMS = 1.42 Å
[Figure label: Lys30]

Superimposed 3znf and 4znf backbones
30 CA atoms: RMS = 0.70 Å
248 atoms: RMS = 1.42 Å
[Figure label: Lys30]
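RMS values like those quoted for the superimposed structures come from a least-squares superposition with known atom correspondence. The slides state that a closed-form solution exists but do not name one; a common choice is the SVD-based Kabsch algorithm, sketched below under that assumption. The coordinates are random made-up points (30 of them, echoing the 30 CA atoms), not the actual 3znf or 4znf data, and the names `rmsd` and `kabsch` are illustrative.

```python
import numpy as np

def rmsd(P, Q):
    """Root mean square deviation between paired (n, 3) coordinate arrays."""
    return float(np.sqrt(np.mean(np.sum((P - Q) ** 2, axis=1))))

def kabsch(P, Q):
    """Return (R, t) minimizing rmsd(P @ R.T + t, Q) over proper rotations R
    and translations t, assuming row i of P corresponds to row i of Q."""
    p_mean, q_mean = P.mean(axis=0), Q.mean(axis=0)
    P0, Q0 = P - p_mean, Q - q_mean          # center both point sets
    H = P0.T @ Q0                            # 3x3 cross-covariance matrix
    U, _, Vt = np.linalg.svd(H)
    # Guard against an improper rotation (reflection): force det(R) = +1.
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    D = np.diag([1.0, 1.0, d])
    R = Vt.T @ D @ U.T
    t = q_mean - R @ p_mean
    return R, t

# Synthetic test case: Q is a rotated and translated copy of P.
rng = np.random.default_rng(0)
P = rng.normal(size=(30, 3))

theta = 0.7
R_true = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                   [np.sin(theta),  np.cos(theta), 0.0],
                   [0.0, 0.0, 1.0]])
t_true = np.array([5.0, -2.0, 1.0])
Q = P @ R_true.T + t_true

R_est, t_est = kabsch(P, Q)
aligned = P @ R_est.T + t_est

rmsd_before = rmsd(P, Q)        # large: structures not yet superimposed
rmsd_after = rmsd(aligned, Q)   # essentially zero for this noise-free case
```

Building the covariance matrix takes one pass over the n point pairs and the SVD is on a fixed 3x3 matrix, which is consistent with the O(n) cost stated for the closed-form solution. With real structures the residual RMSD after superposition is not zero, and it is that residual which is reported.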