Biological Sequence Analysis Project 2
Mid-term Presentation
2001-20546 한기혁

Exercise 1
Multiple sequence alignment
• Target domain: protein C2 domain 1 from the Pfam database
  1) seed: 437 sequences
  2) full: 1087 sequences

Exercise 2
Make HMM

Exercise 3
Search SWISS-PROT with the HMM
  1) seed
  2) full

Sequence    Description                                 Score  E-value  N
SY63_DISOM  Synaptotagmin C (Synaptic vesicle protein   277.9  2.2e-79  2
SYT3_HUMAN  Synaptotagmin III (SytIII).                 233.6    5e-66  2

Exercise 4
Calibrate HMM
• HMMER cyg/win problem: stack error

Exercise 5
Make HMM database
• Myhmms: C2 domain + pkinase (protein kinase catalytic domain)

Exercise 6
Parse the domain structure of a sequence with the HMM database
• Target sequence: Schizosaccharomyces pombe Protein Kinase C-like 1 (PCK1_SCHPO (P36582))

  1) seed
Model           Domain seq-f seq-t    hmm-f hmm-t      score  E-value
pc2dseed        1/1      212   292 ..     1    88 []     8.6  8.2e-06
pkinase         1/1      664   923 ..     1   278 []   280.8  6.1e-85

  2) full
Model           Domain seq-f seq-t    hmm-f hmm-t      score  E-value
pc2dfull        1/1      212   292 ..     1   200 []    19.9  1.2e-06
pkinase         1/1      664   923 ..     1   278 []   280.8  6.1e-85

Exercise 7
Search for domains of the target sequence in the Pfam database

Model           Domain seq-f seq-t    hmm-f hmm-t      score  E-value
HR1             1/2        6    78 ..     1    87 []    17.6   0.0077
3-alpha         1/1       20    64 ..     1    49 []    -4.2      8.3
Upf2            1/1       64   215 ..     1   202 []   -78.9      4.3
Histone_HNS     1/1      117   222 ..     1   130 []   -75.1      3.6
HR1             2/2      119   192 ..     1    87 []    98.1  1.4e-26
IL8             1/1      126   176 ..     1    70 []   -27.6      9.3
PAH             1/1      132   174 ..     1    47 []   -10.4        7
C2              1/1      212   292 ..     1    88 []     8.6   0.0028
Atrophin-1      1/1      213   943 ..     1  1046 []  -749.7      8.4
UPF0154         1/1      374   432 ..     1    64 []   -23.7      3.1
Phosphoprotein  1/1      402   694 ..     1   316 []  -145.6      9.1
DUF379          1/1      406   476 ..     1    92 []   -39.3      4.8
DAG_PE-bind     1/2      414   461 ..     1    51 []    68.5  1.2e-17
PHD             1/1      426   464 ..     1    51 []     6.8   0.0013
zf-C3HC4        1/1      427   461 ..     1    50 []   -18.7      5.7
TNFR_c6         1/1      430   469 ..     1    42 []    -8.3      5.1
fer4            1/1      434   455 ..     1    24 []    -8.3      4.3
DUF139          1/1      440   455 ..     1    17 []    -3.1      6.7
zf-NF-X1        1/1      442   463 ..     1    23 []    -4.0      6.5
DAG_PE-bind     2/2      481   530 ..     1    51 []    65.0  1.3e-16
zf-A20          1/1      491   513 ..     1    26 []    -4.2      6.7
DC1             1/1      492   523 ..     1    44 []     0.2        1
DNA_RNApol_7kD  1/1      494   521 ..     1    32 []     1.5     0.38
DUF359          1/1      555   701 ..     1   176 []   -89.9      6.5
MAT_Alpha1      1/1      558   733 ..     1   214 []   -69.9      5.4
pkinase         1/1      664   923 ..     1   294 []   276.3  3.3e-80
ATP-sulfurylase 1/1      668   903 ..     1   349 []  -197.9      1.9
DUF402          1/1      716   821 ..     1   105 []   -25.9      2.2
L27             1/1      891   942 ..     1    56 []    -6.6      8.1
pkinase_C       1/1      924   988 .]     1    70 []   125.0  1.1e-34

Discussion
• Target domain and sequence: clock-related genes (BMAL, Clock, Per, DBP, etc.)
• Project 1: HMM model
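
The Pfam search in Exercise 7 reports 30 domain hits for PCK1_SCHPO, most of them with negative scores and E-values above 1. As a rough aid for reading that table, the Python sketch below parses the rows and keeps only hits under an E-value cutoff. The names `DomainHit`, `parse_rows`, and `significant_hits`, and the cutoff of 0.01, are assumptions made for illustration; they are not part of the presentation or of HMMER. The rows themselves are copied from the table.

```python
from typing import List, NamedTuple


class DomainHit(NamedTuple):
    model: str
    domain: str      # e.g. "1/2" = first of two hits for this model
    seq_from: int    # seq-f column
    seq_to: int      # seq-t column
    score: float
    evalue: float


# Rows from the Exercise 7 table, in the column order
# Model, Domain, seq-f, seq-t, .., hmm-f, hmm-t, [], score, E-value.
PFAM_ROWS = """
HR1 1/2 6 78 .. 1 87 [] 17.6 0.0077
3-alpha 1/1 20 64 .. 1 49 [] -4.2 8.3
Upf2 1/1 64 215 .. 1 202 [] -78.9 4.3
Histone_HNS 1/1 117 222 .. 1 130 [] -75.1 3.6
HR1 2/2 119 192 .. 1 87 [] 98.1 1.4e-26
IL8 1/1 126 176 .. 1 70 [] -27.6 9.3
PAH 1/1 132 174 .. 1 47 [] -10.4 7
C2 1/1 212 292 .. 1 88 [] 8.6 0.0028
Atrophin-1 1/1 213 943 .. 1 1046 [] -749.7 8.4
UPF0154 1/1 374 432 .. 1 64 [] -23.7 3.1
Phosphoprotein 1/1 402 694 .. 1 316 [] -145.6 9.1
DUF379 1/1 406 476 .. 1 92 [] -39.3 4.8
DAG_PE-bind 1/2 414 461 .. 1 51 [] 68.5 1.2e-17
PHD 1/1 426 464 .. 1 51 [] 6.8 0.0013
zf-C3HC4 1/1 427 461 .. 1 50 [] -18.7 5.7
TNFR_c6 1/1 430 469 .. 1 42 [] -8.3 5.1
fer4 1/1 434 455 .. 1 24 [] -8.3 4.3
DUF139 1/1 440 455 .. 1 17 [] -3.1 6.7
zf-NF-X1 1/1 442 463 .. 1 23 [] -4.0 6.5
DAG_PE-bind 2/2 481 530 .. 1 51 [] 65.0 1.3e-16
zf-A20 1/1 491 513 .. 1 26 [] -4.2 6.7
DC1 1/1 492 523 .. 1 44 [] 0.2 1
DNA_RNApol_7kD 1/1 494 521 .. 1 32 [] 1.5 0.38
DUF359 1/1 555 701 .. 1 176 [] -89.9 6.5
MAT_Alpha1 1/1 558 733 .. 1 214 [] -69.9 5.4
pkinase 1/1 664 923 .. 1 294 [] 276.3 3.3e-80
ATP-sulfurylase 1/1 668 903 .. 1 349 [] -197.9 1.9
DUF402 1/1 716 821 .. 1 105 [] -25.9 2.2
L27 1/1 891 942 .. 1 56 [] -6.6 8.1
pkinase_C 1/1 924 988 .] 1 70 [] 125.0 1.1e-34
"""


def parse_rows(text: str) -> List[DomainHit]:
    """Split each whitespace-delimited row into a DomainHit."""
    hits = []
    for line in text.strip().splitlines():
        f = line.split()
        # Score and E-value are always the last two fields.
        hits.append(DomainHit(f[0], f[1], int(f[2]), int(f[3]),
                              float(f[-2]), float(f[-1])))
    return hits


def significant_hits(hits: List[DomainHit],
                     max_evalue: float = 0.01) -> List[DomainHit]:
    """Keep hits at or below the (hypothetical) E-value cutoff,
    ordered by start position on the target sequence."""
    kept = [h for h in hits if h.evalue <= max_evalue]
    return sorted(kept, key=lambda h: h.seq_from)


if __name__ == "__main__":
    for h in significant_hits(parse_rows(PFAM_ROWS)):
        print(f"{h.model:<12} {h.domain}  {h.seq_from:>4}-{h.seq_to:<4} "
              f"score={h.score:<6} E={h.evalue:g}")
```

Under this assumed cutoff the surviving hits, in sequence order, are HR1 (both copies), C2, DAG_PE-bind (first copy), PHD, DAG_PE-bind (second copy), pkinase, and pkinase_C. A different cutoff would change the list; for example, the C2 hit at 212-292 has an E-value of 0.0028 and would drop out at a cutoff of 0.001.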