Biological Sequence Analysis
Project 2 Mid-term Presentation
Exercise 1
2001-20546 한기혁
Multiple sequence alignment
• Target domain: protein C2 domain 1 from the Pfam database
1) seed: 437 sequences
2) full: 1087 sequences
Exercise 2
Make HMM
Exercise 3
Search the SWISS-PROT with HMM
1) seed
2) full
Exercise 4
Sequence    Description                                Score  E-value  N
SY63_DISOM  Synaptotagmin C (Synaptic vesicle protein  277.9  2.2e-79  2
SYT3_HUMAN  Synaptotagmin III (SytIII).                233.6    5e-66  2
Calibrate HMM
• HMMER Cygwin/Windows problem: stack error
Exercise 5
Make HMM database
• Myhmms: C2 domain + pkinase (protein kinase catalytic domain)
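The database step can be sketched as follows, assuming the HMMER 2 convention that an HMM database is a flat file made by concatenating individual HMM files. The function name `build_hmm_database` and the file names in the usage comment are hypothetical; only the database name `myhmms` and the two models come from the slide.

```python
from pathlib import Path


def build_hmm_database(hmm_paths, database_path):
    """Concatenate individual HMM files into one flat database file.

    Assumption: the database is a plain concatenation of HMM files,
    as in the HMMER 2 tutorial workflow.
    """
    database_path = Path(database_path)
    with database_path.open("w") as out:
        for hmm_path in hmm_paths:
            text = Path(hmm_path).read_text()
            # Make sure each model ends with a newline so that the next
            # model starts on its own line.
            out.write(text if text.endswith("\n") else text + "\n")
    return database_path


# Hypothetical usage (file names are illustrative, not from the slides):
# build_hmm_database(["pc2dseed.hmm", "pkinase.hmm"], "myhmms")
```

Keeping the seed-based and full-based C2 models in separate databases, each alongside pkinase, would make the later seed/full comparison a matter of running the same search twice.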
Exercise 6
Parse the domain structure of a sequence with HMM database
• Target sequence: Schizosaccharomyces pombe protein kinase C-like 1 (PCK1_SCHPO, P36582)
1) seed
2) full
Exercise 7
Model     Domain seq-f seq-t    hmm-f hmm-t      score  E-value
pc2dseed  1/1      212   292 ..     1    88 []     8.6  8.2e-06
pkinase   1/1      664   923 ..     1   278 []   280.8  6.1e-85
pc2dfull  1/1      212   292 ..     1   200 []    19.9  1.2e-06
pkinase   1/1      664   923 ..     1   278 []   280.8  6.1e-85
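The columns of the domain table can be read programmatically. The sketch below parses the four rows reported above and spells out the end-flag shorthand, assuming HMMER 2 conventions: `..` means the alignment reaches neither end, and `[]` means it covers the full length. The names `DomainHit`, `END_FLAGS`, and `parse_domain_row` are hypothetical helpers, not part of HMMER.

```python
from dataclasses import dataclass

# Assumed meaning of the two-character end flags (HMMER 2 convention).
END_FLAGS = {
    "..": "reaches neither end",
    "[.": "starts at the beginning",
    ".]": "ends at the end",
    "[]": "covers the full length",
}


@dataclass
class DomainHit:
    model: str
    domain: str
    seq_from: int
    seq_to: int
    seq_ends: str
    hmm_from: int
    hmm_to: int
    hmm_ends: str
    score: float
    evalue: float


def parse_domain_row(row):
    """Split one whitespace-separated domain row into typed fields."""
    f = row.split()
    if len(f) != 10:
        raise ValueError(f"expected 10 fields, got {len(f)}: {row!r}")
    return DomainHit(f[0], f[1], int(f[2]), int(f[3]), f[4],
                     int(f[5]), int(f[6]), f[7], float(f[8]), float(f[9]))


# Rows copied from the seed and full searches of PCK1_SCHPO.
ROWS = [
    "pc2dseed 1/1 212 292 .. 1 88 [] 8.6 8.2e-06",
    "pkinase 1/1 664 923 .. 1 278 [] 280.8 6.1e-85",
    "pc2dfull 1/1 212 292 .. 1 200 [] 19.9 1.2e-06",
    "pkinase 1/1 664 923 .. 1 278 [] 280.8 6.1e-85",
]

hits = [parse_domain_row(r) for r in ROWS]
for hit in hits:
    print(f"{hit.model}: residues {hit.seq_from}-{hit.seq_to}, "
          f"model {END_FLAGS[hit.hmm_ends]}, score {hit.score}")
```

Both C2 models place the domain at residues 212-292; they differ in model length (hmm-t 88 versus 200) and in score.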
Search for domains of target sequence in PFAM database
Model           Domain seq-f seq-t    hmm-f hmm-t      score  E-value
HR1             1/2        6    78 ..     1    87 []    17.6   0.0077
3-alpha         1/1       20    64 ..     1    49 []    -4.2      8.3
Upf2            1/1       64   215 ..     1   202 []   -78.9      4.3
Histone_HNS     1/1      117   222 ..     1   130 []   -75.1      3.6
HR1             2/2      119   192 ..     1    87 []    98.1  1.4e-26
IL8             1/1      126   176 ..     1    70 []   -27.6      9.3
PAH             1/1      132   174 ..     1    47 []   -10.4        7
C2              1/1      212   292 ..     1    88 []     8.6   0.0028
Atrophin-1      1/1      213   943 ..     1  1046 []  -749.7      8.4
UPF0154         1/1      374   432 ..     1    64 []   -23.7      3.1
Phosphoprotein  1/1      402   694 ..     1   316 []  -145.6      9.1
DUF379          1/1      406   476 ..     1    92 []   -39.3      4.8
DAG_PE-bind     1/2      414   461 ..     1    51 []    68.5  1.2e-17
PHD             1/1      426   464 ..     1    51 []     6.8   0.0013
zf-C3HC4        1/1      427   461 ..     1    50 []   -18.7      5.7
TNFR_c6         1/1      430   469 ..     1    42 []    -8.3      5.1
fer4            1/1      434   455 ..     1    24 []    -8.3      4.3
DUF139          1/1      440   455 ..     1    17 []    -3.1      6.7
zf-NF-X1        1/1      442   463 ..     1    23 []    -4.0      6.5
DAG_PE-bind     2/2      481   530 ..     1    51 []    65.0  1.3e-16
zf-A20          1/1      491   513 ..     1    26 []    -4.2      6.7
DC1             1/1      492   523 ..     1    44 []     0.2        1
DNA_RNApol_7kD  1/1      494   521 ..     1    32 []     1.5     0.38
DUF359          1/1      555   701 ..     1   176 []   -89.9      6.5
MAT_Alpha1      1/1      558   733 ..     1   214 []   -69.9      5.4
pkinase         1/1      664   923 ..     1   294 []   276.3  3.3e-80
ATP-sulfurylase 1/1      668   903 ..     1   349 []  -197.9      1.9
DUF402          1/1      716   821 ..     1   105 []   -25.9      2.2
L27             1/1      891   942 ..     1    56 []    -6.6      8.1
pkinase_C       1/1      924   988 .]     1    70 []   125.0  1.1e-34
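Most rows in the Pfam search have negative scores and E-values above 1, so a simple E-value filter helps separate likely domains from noise. The sketch below applies one to the values reported above. The cutoff of 0.01 and the names `PFAM_HITS`, `E_VALUE_CUTOFF`, and `significant_hits` are assumptions made for illustration; the slides do not state a threshold.

```python
# Rows copied from the Pfam search: model, seq-f, seq-t, score, E-value.
PFAM_HITS = """\
HR1 6 78 17.6 0.0077
3-alpha 20 64 -4.2 8.3
Upf2 64 215 -78.9 4.3
Histone_HNS 117 222 -75.1 3.6
HR1 119 192 98.1 1.4e-26
IL8 126 176 -27.6 9.3
PAH 132 174 -10.4 7
C2 212 292 8.6 0.0028
Atrophin-1 213 943 -749.7 8.4
UPF0154 374 432 -23.7 3.1
Phosphoprotein 402 694 -145.6 9.1
DUF379 406 476 -39.3 4.8
DAG_PE-bind 414 461 68.5 1.2e-17
PHD 426 464 6.8 0.0013
zf-C3HC4 427 461 -18.7 5.7
TNFR_c6 430 469 -8.3 5.1
fer4 434 455 -8.3 4.3
DUF139 440 455 -3.1 6.7
zf-NF-X1 442 463 -4.0 6.5
DAG_PE-bind 481 530 65.0 1.3e-16
zf-A20 491 513 -4.2 6.7
DC1 492 523 0.2 1
DNA_RNApol_7kD 494 521 1.5 0.38
DUF359 555 701 -89.9 6.5
MAT_Alpha1 558 733 -69.9 5.4
pkinase 664 923 276.3 3.3e-80
ATP-sulfurylase 668 903 -197.9 1.9
DUF402 716 821 -25.9 2.2
L27 891 942 -6.6 8.1
pkinase_C 924 988 125.0 1.1e-34
"""

E_VALUE_CUTOFF = 0.01  # hypothetical threshold, not taken from the slides


def significant_hits(table, cutoff):
    """Return (model, seq_from, seq_to, score, evalue) for rows below cutoff,
    ordered by start position on the target sequence."""
    hits = []
    for line in table.strip().splitlines():
        model, seq_from, seq_to, score, evalue = line.split()
        if float(evalue) < cutoff:
            hits.append((model, int(seq_from), int(seq_to),
                         float(score), float(evalue)))
    return sorted(hits, key=lambda h: h[1])


for model, seq_from, seq_to, score, evalue in significant_hits(
        PFAM_HITS, E_VALUE_CUTOFF):
    print(f"{model:12s} {seq_from:4d}-{seq_to:<4d} "
          f"score={score:6.1f} E={evalue:g}")
```

With this cutoff the surviving hits are the two HR1 repeats, C2, the two DAG_PE-bind repeats, PHD, pkinase, and pkinase_C. The PHD hit (426-464) overlaps the first DAG_PE-bind hit (414-461) and scores far lower, so it may deserve a closer look before being counted as a separate domain.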
Discussion
• Target domain and sequence: clock-related genes (BMAL, Clock, Per, DBP, etc.)
• Project 1 – HMM model