Survey
Proteins
  Proteins control the biological functions of cellular organisms, e.g. metabolism, blood clotting, immune system
  Building blocks: amino acids
    amino group (NH2), carboxyl group (COOH), side chain R

The Protein Data Bank
  [Figure: yearly total of Protein Data Bank entries, 1972 to 2009; vertical axis from 0 to 60000]

Protein sequence and structure
  Protein alphabet consists of 20 amino acids
  Sequence view: ADKELKFLVVDDFSTMRRIV.....
  Structure view

Protein structure and function
  Function is determined by 3D shape/structure
  Thrombin: facilitates blood clotting
  Hirudin: anticoagulant (blocks active site)

Protein structure and function
  Structure conserves evolutionary information better than sequence
  Myoglobin family
    1MBC: VLSEGEWQLVLHVWAKVE.....
    2FAL: XSLSAAEADLAGKSWAPV.....

Structural Bioinformatics
  Pairwise alignment algorithms
    DALI (Holm and Sander, Journal of Molecular Biology, 1993)
    LOCK (Singh and Brutlag, ISMB, 1997)
    CE (Shindyalov and Bourne, Protein Engineering, 1998)
    SSM (Krissinel and Henrick, Acta Cryst., 2004)
    Ye et al., JBCB, 2004
  Multiple alignment algorithms
    Gerstein and Levitt, ISMB, 1996: iterative dynamic programming
    SSAP (Orengo and Taylor, Methods Enzymol., 1996): two-level DP
    Leibowitz et al., ISMB, 1999: geometric hashing
    CE-MC (Guda et al., PSB, 2001)
    MAMMOTH (Lupyan et al., Bioinformatics, 2005)
    MAPSCI (Ye et al., WABI, 2006)

Structural Bioinformatics
  Homology detection
    Hidden Markov models (Jaakkola et al., JCB, 2000)
    Spectrum, Mismatch kernel (Leslie et al., Bioinformatics, 2002)
    Structure kernel (Qiu et al., Bioinformatics, 2007)
  Protein structure prediction
    Jones and Hadley, Bioinformatics: Sequence, structure and databanks, 2000
    FUGUE (Shi et al., J. Mol. Biol., 2001)
    SCOP (Andreeva et al., Nucleic Acids Res., 2004)
  Protein docking
    Shoichet et al., J. Comput. Chem., 1992
    Choi et al., WABI, 2004
    Wang et al., PSB, 2005
    Sousa et al., Proteins, 2006

Pairwise Structure Alignment
  Given two proteins represented by their Cα atoms (backbone)
    find a 3D transformation that superimposes a large number of the Cα atoms
    ensure that the overall distance between matched pairs is as small as possible
  Trade-off between the number of matches and the total distance between matched pairs

Pairwise Structure Alignment (Ye et al., JBCB, 2004)
  Uses an orientation-independent representation of proteins, based on the fact that consecutive Cα atoms are ~4 Å apart

Pairwise Structure Alignment (Ye et al., JBCB, 2004)
  The protein is represented as a sequence of angle triplets
    {(α1, β1, γ1), (α2, β2, γ2), …, (αn, βn, γn)}

Pairwise Structure Alignment (Ye et al., JBCB, 2004)
  Compute a local alignment based on the angle representation
  Find a maximal subset of runs with similar transformation matrices

Pairwise Structure Alignment (Ye et al., JBCB, 2004)
  The main algorithm
    1. Compute the angle-based representation
    2. Align the angle-based representations
    3. Identify runs with similar transformation matrices
    4. Compute an initial structural alignment
    5. Refine the alignment iteratively
  Running time is ~(m+n)^2, where m and n are the protein lengths

Multiple Structure Alignment
  Given a set of proteins represented by their Cα atoms (backbone)
    find a simultaneous alignment of all structures
    find a consensus structure that represents all of them

Multiple Structure Alignment
  The main algorithm
    1. Find an initial consensus structure (one of the given proteins)
    2. Pairwise align the consensus with each of the proteins
    3. Merge the pairwise alignments from the previous step
    4. Recompute the consensus protein; repeat from step 2

Merging the pairwise alignments
  Similar to the sequence case
  P1 = BBCA, P2 = CBBA, P3 = BCCA
  Pairwise alignments against P1
    P1: -BBCA      P1: BBCA
    P2: CBB-A      P3: BCCA
  Merged alignment
    P1: -BBCA
    P2: CBB-A
    P3: -BCCA

Multiple Structure Alignment
  Computation of the consensus structure (after merging alignments)

Multiple Structure Alignment
  Algorithm flowchart
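
The merging step can be illustrated on the letter example from the "Merging the pairwise alignments" slide. The sketch below is not taken from the cited papers: it assumes the usual "once a gap, always a gap" rule from sequence alignment, and the function name `merge_pairwise` and its input format (a list of `(aligned_consensus, aligned_protein)` string pairs) are made up for illustration. Every gap that any pairwise alignment opens in the consensus is kept in the merged consensus, and the other rows are padded to match.

```python
def merge_pairwise(pairs):
    """Merge pairwise alignments that share the same consensus.

    pairs: list of (aligned_consensus, aligned_protein) strings of equal
    length, with '-' marking a gap. Hypothetical format, for illustration.
    Returns (merged_consensus, list_of_merged_protein_rows).
    """
    consensus = pairs[0][0].replace("-", "")
    n = len(consensus)

    # gaps_before[i]: widest gap run that any pairwise alignment places
    # before consensus residue i (index n holds trailing gaps).
    gaps_before = [0] * (n + 1)
    for cons_row, _ in pairs:
        if cons_row.replace("-", "") != consensus:
            raise ValueError("all alignments must share one consensus")
        i = run = 0
        for ch in cons_row:
            if ch == "-":
                run += 1
            else:
                gaps_before[i] = max(gaps_before[i], run)
                i += 1
                run = 0
        gaps_before[n] = max(gaps_before[n], run)

    merged_consensus = "".join(
        "-" * gaps_before[i] + consensus[i] for i in range(n)
    ) + "-" * gaps_before[n]

    rows = []
    for cons_row, other_row in pairs:
        out, pending, i = [], [], 0
        for c, o in zip(cons_row, other_row):
            if c == "-":
                # Residue of the other protein inserted relative to the consensus.
                pending.append(o)
            else:
                # Pad the insertion to the merged gap width, then emit the column.
                out.append("-" * (gaps_before[i] - len(pending)) + "".join(pending))
                out.append(o)
                pending = []
                i += 1
        out.append("".join(pending) + "-" * (gaps_before[n] - len(pending)))
        rows.append("".join(out))
    return merged_consensus, rows


# The slide's example: P1 is the consensus, aligned to P2 and to P3.
merged, rows = merge_pairwise([("-BBCA", "CBB-A"), ("BBCA", "BCCA")])
print("P1:", merged)   # P1: -BBCA
print("P2:", rows[0])  # P2: CBB-A
print("P3:", rows[1])  # P3: -BCCA
```

In the structural setting the same bookkeeping would apply to columns of matched Cα atoms rather than letters; how residues inside a shared gap run are ordered (here: right-justified for internal gaps) is an arbitrary choice of this sketch.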