Multiple Structure Alignment

Proteins
 Proteins control the biological functions of cellular organisms
 e.g. metabolism, blood clotting, immune system
 Building blocks – amino acids
 amino group (NH2), carboxyl group (COOH), side chain R
The Protein Data Bank
[Figure: number of structures in the Protein Data Bank by year, 1972 to 2009; series: Yearly, Total; vertical axis from 0 to 60000]
Protein sequence and structure
 Protein alphabet consists of 20 amino acids
Sequence view
ADKELKFLVVDDFSTMRRIV.....
Structure view
Protein structure and function
 Function is determined by 3D shape/structure
Thrombin
Facilitates blood clotting
Hirudin
Anticoagulant
(blocks active site)
Protein structure and function
 Structure conserves evolutionary information better than sequence does
Myoglobin family
1MBC: VLSEGEWQLVLHVWAKVE.....
2FAL: XSLSAAEADLAGKSWAPV.....
Structural Bioinformatics
 Pairwise alignment algorithms
    DALI (Holm and Sander, Journal of Molecular Biology, 1993)
    LOCK (Singh and Brutlag, ISMB, 1997)
    CE (Shindyalov and Bourne, Protein Engineering, 1998)
    SSM (Krissinel and Henrick, Acta Cryst., 2004)
    Ye et al., JBCB, 2004
 Multiple alignment algorithms
    Gerstein and Levitt, ISMB, 1996: Iterative dynamic programming
    SSAP (Orengo and Taylor, Methods Enzymol., 1996): Two-level DP
    Leibowitz et al., ISMB, 1999: Geometric hashing
    CE-MC (Guda et al., PSB, 2001)
    MAMMOTH (Lupyan et al., Bioinformatics, 2005)
    MAPSCI (Ye et al., WABI, 2006)
Structural Bioinformatics
 Homology detection
    Hidden Markov models (Jaakkola et al., JCB, 2000)
    Spectrum, Mismatch kernel (Leslie et al., Bioinformatics, 2002)
    Structure kernel (Qiu et al., Bioinformatics, 2007)
 Protein structure prediction
    Jones and Hadley, Bioinformatics: Sequence, structure and databanks. 2000.
    FUGUE (Shi et al., J. Mol. Biol., 2001)
    SCOP (Andreeva et al., Nucleic Acids Res., 2004)
 Protein docking
    Shoichet et al., J. Comput. Chem., 1992.
    Choi et al., WABI, 2004.
    Wang et al., PSB, 2005.
    Sousa et al., Proteins, 2006.
Pairwise Structure Alignment
 Given two proteins represented by the Cα atoms (backbone)
 find 3D transformation that superimposes a large number of the Cα atoms
 ensure that overall distance between matched pairs is as small as possible
 Trade-off between the number of matches and the total distance between matched pairs
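The trade-off can be made concrete with a small sketch. It assumes the two Cα traces have already been superimposed and that a candidate list of matched index pairs is given; the helper name `alignment_quality` and the toy coordinates are illustrative only and are not taken from any of the cited tools.

```python
import math

def alignment_quality(coords_a, coords_b, pairs):
    """Return (number of matched pairs, RMSD over those pairs).

    coords_a, coords_b: lists of (x, y, z) C-alpha positions, already superimposed.
    pairs: list of (i, j) index pairs matching coords_a[i] to coords_b[j].
    """
    if not pairs:
        return 0, 0.0
    sq_sum = 0.0
    for i, j in pairs:
        sq_sum += sum((p - q) ** 2 for p, q in zip(coords_a[i], coords_b[j]))
    return len(pairs), math.sqrt(sq_sum / len(pairs))

# Toy traces: the third residue of b is displaced by 3 Angstrom.
a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
b = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 3.0, 0.0)]

few = alignment_quality(a, b, [(0, 0), (1, 1)])           # fewer matches, tighter fit
many = alignment_quality(a, b, [(0, 0), (1, 1), (2, 2)])  # more matches, looser fit
```

Adding the third pair raises the match count from 2 to 3 but also raises the RMSD from 0 to about 1.73 Å, which is the tension an alignment score has to balance.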
Pairwise Structure Alignment
Ye et al. JBCB 2004
 Uses an orientation-independent representation of proteins, based on the fact that consecutive Cα atoms are ~4 Å apart
Pairwise Structure Alignment
Ye et al. JBCB 2004
 The protein is represented as a sequence of angle triplets
{(α1, β1, γ1), (α2, β2, γ2), …, (αn, βn, γn) }
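The slides do not say how the three angles of each triplet are defined, so the following sketch does not reproduce them. It only illustrates the idea of an orientation-independent description, using two internal angles that are commonly computed on a Cα trace (the virtual bond angle and the virtual torsion angle). All function names and coordinates are hypothetical.

```python
import math

def sub(u, v):
    return (u[0] - v[0], u[1] - v[1], u[2] - v[2])

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def norm(u):
    return math.sqrt(dot(u, u))

def bond_angle(a, b, c):
    """Angle at b (radians) between the directions b->a and b->c."""
    u, v = sub(a, b), sub(c, b)
    cos_angle = dot(u, v) / (norm(u) * norm(v))
    return math.acos(max(-1.0, min(1.0, cos_angle)))

def dihedral(a, b, c, d):
    """Signed torsion angle (radians) around the b->c axis."""
    b1, b2, b3 = sub(b, a), sub(c, b), sub(d, c)
    y = norm(b2) * dot(b1, cross(b2, b3))
    x = dot(cross(b1, b2), cross(b2, b3))
    return math.atan2(y, x)

def internal_angles(coords):
    """One (bond angle, torsion angle) pair per window of four consecutive atoms."""
    return [(bond_angle(coords[i], coords[i + 1], coords[i + 2]),
             dihedral(coords[i], coords[i + 1], coords[i + 2], coords[i + 3]))
            for i in range(len(coords) - 3)]

trace = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (5.0, 3.6, 0.0),
         (5.0, 5.0, 3.5), (8.0, 6.0, 4.0)]
# Rotate by 90 degrees about the z axis, then translate.
moved = [(-y + 10.0, x - 5.0, z + 2.0) for x, y, z in trace]

angles_original = internal_angles(trace)
angles_moved = internal_angles(moved)
```

The two angle lists agree up to rounding error, so a comparison based on such angles does not require the proteins to be superimposed first.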
Pairwise Structure Alignment
Ye et al. JBCB 2004
 Compute a local alignment based on angle representation
 Find maximal subset of runs with similar transformation matrices
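As an illustration of the first step, here is a minimal Smith-Waterman style local alignment over two sequences of angle triplets. The scoring scheme (a fixed reward when all three angles agree within a tolerance, fixed penalties otherwise) is an assumption made for this sketch; the slides do not specify the scoring used by Ye et al. Angle periodicity is ignored for brevity.

```python
def similar(t1, t2, tol=0.3):
    """True if every angle of the two triplets differs by at most tol radians."""
    return all(abs(x - y) <= tol for x, y in zip(t1, t2))

def local_align(seq_a, seq_b, match=2, mismatch=-1, gap=-1, tol=0.3):
    """Return (best score, list of aligned index pairs) for a local alignment."""
    n, m = len(seq_a), len(seq_b)
    H = [[0] * (m + 1) for _ in range(n + 1)]
    best, best_pos = 0, (0, 0)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if similar(seq_a[i - 1], seq_b[j - 1], tol) else mismatch
            H[i][j] = max(0, H[i - 1][j - 1] + s, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)
    # Trace back from the best cell until the score drops to zero.
    pairs = []
    i, j = best_pos
    while i > 0 and j > 0 and H[i][j] > 0:
        s = match if similar(seq_a[i - 1], seq_b[j - 1], tol) else mismatch
        if H[i][j] == H[i - 1][j - 1] + s:
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return best, pairs

# Toy angle triplets in radians; only the two middle positions are similar.
a = [(1.0, 0.5, 0.2), (1.6, 1.0, 0.1), (1.5, -1.0, 0.3), (2.0, 0.0, 0.0)]
b = [(0.1, 0.1, 0.1), (1.6, 1.1, 0.1), (1.5, -0.9, 0.2), (2.9, 2.0, 1.0)]
score, pairs = local_align(a, b)
```

Each contiguous stretch of aligned pairs is a "run"; the second step would then compare the rigid transformations implied by the runs and keep the largest mutually consistent subset.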
Pairwise Structure Alignment
Ye et al. JBCB 2004
 The main algorithm
   1. Compute the angle-based representation
   2. Align the angle-based representations
   3. Identify runs with similar transformation matrices
   4. Compute the initial structural alignment
   5. Refine the alignment iteratively
 Running time is ~(m+n)^2, where m and n are the protein lengths
Multiple Structure Alignment
 Given a set of proteins represented by the Cα atoms (backbone)
 find a simultaneous alignment of all structures
 find a consensus structure that represents all of them
Multiple Structure Alignment
 The main algorithm
   1. Find an initial consensus structure (one of the given proteins)
   2. Pairwise align the consensus and each of the proteins
   3. Merge the pairwise alignments from the previous step
   4. Recompute the consensus protein; repeat from step 2
 Merging the pairwise alignments is similar to the sequence case
P1 = BBCA, P2 = CBBA, P3 = BCCA
Pairwise alignments against P1:
P1: -BBCA    P1: BBCA
P2: CBB-A    P3: BCCA
Merged alignment:
P1: -BBCA
P2: CBB-A
P3: -BCCA
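The merge in this example can be sketched as follows, treating the proteins as strings the way the slide does. The sketch assumes every pairwise alignment has the same consensus (here P1) as its first row. Residues inserted before the same consensus position are placed in shared columns without being aligned to each other, which is a simplification. The function names are hypothetical.

```python
def split_by_consensus(cons_row, other_row):
    """Split a pairwise alignment relative to the consensus.

    Returns (inserts, aligned): inserts[k] holds the residues of the other
    protein that fall before consensus position k (inserts[n] holds trailing
    ones); aligned[k] is the residue or '-' aligned to consensus position k.
    """
    inserts, aligned = [""], []
    for c, o in zip(cons_row, other_row):
        if c == "-":
            inserts[-1] += o
        else:
            aligned.append(o)
            inserts.append("")
    return inserts, aligned

def merge_alignments(consensus, pairwise):
    """Merge pairwise alignments that share the same ungapped consensus."""
    n = len(consensus)
    parts = [split_by_consensus(c, o) for c, o in pairwise]
    # Widest insertion seen before each consensus position.
    width = [max((len(ins[k]) for ins, _ in parts), default=0) for k in range(n + 1)]
    rows = ["".join("-" * width[k] + consensus[k] for k in range(n)) + "-" * width[n]]
    for ins, ali in parts:
        row = "".join(ins[k].ljust(width[k], "-") + ali[k] for k in range(n))
        rows.append(row + ins[n].ljust(width[n], "-"))
    return rows

rows = merge_alignments("BBCA", [("-BBCA", "CBB-A"), ("BBCA", "BCCA")])
for name, row in zip(["P1", "P2", "P3"], rows):
    print(name + ": " + row)
```

For the inputs above the three rows are `-BBCA`, `CBB-A` and `-BCCA`, matching the merged alignment on the slide.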
Multiple Structure Alignment
 Computation of consensus structure (after merging alignments)
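The transcript does not preserve how the consensus is computed. One simple possibility, shown here purely as an assumption, is to average the superimposed Cα positions in each column of the merged alignment and skip columns that contain only gaps. The function name and data layout are hypothetical.

```python
def consensus_structure(aligned_coords):
    """Column-wise centroid of aligned, already superimposed C-alpha positions.

    aligned_coords: one row per protein; each row has one entry per alignment
    column, either an (x, y, z) tuple or None for a gap.
    """
    n_cols = len(aligned_coords[0])
    consensus = []
    for col in range(n_cols):
        pts = [row[col] for row in aligned_coords if row[col] is not None]
        if not pts:
            continue  # a column made only of gaps contributes no consensus atom
        consensus.append(tuple(sum(p[d] for p in pts) / len(pts) for d in range(3)))
    return consensus

# Three toy proteins over three alignment columns; None marks a gap.
rows = [
    [None,       (0, 0, 0), (4, 0, 0)],
    [(-4, 0, 0), (0, 2, 0), (4, 0, 0)],
    [None,       (0, 1, 0), (4, 0, 3)],
]
consensus = consensus_structure(rows)
```

A practical method would likely also require a minimum number of residues per column before emitting a consensus atom; that threshold is left out of this sketch.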
Multiple Structure Alignment
 Algorithm flowchart