WSSP Chapter 8
BLASTX Translated DNA vs Protein searches
atttaccgtg
tgatgagtat
ccggaaatag
acttcaatga
ttgattaata
ttggattgaa
gatacagttt
gatcccgatc
ttggttctaa
tttccatttc
attatcttgc
tccgtattaa
atgattgctt
gcattcgaat
tgtcccagtt
atgagccagc
taacgaacgg
caatattttc
gcgtacccgt
tttaattttc
© 2014 WSSP
8-3
Why do a BLASTX if we have done a BLASTN?
BLASTN Match
Query  AGG TCG TTA CTA TCG AGG AGT AGA
Sbjct  CGT AGC CTT TTG AGT CGA TCG CGG
16% Identity

        R   S   L   L   S   R   S   R
Query  AGG TCG TTA CTA TCG AGG AGT AGA
Sbjct  CGT AGC CTT TTG AGT CGA TCG CGG
        R   S   L   L   S   R   S   R

Query  R S L L S R S R
       | | | | | | | |
Sbjct  R S L L S R S R
100% Identity
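The comparison on this slide can be reproduced with a short Python sketch. It is illustrative only: the codon table is the standard genetic code written out by hand, and the helper names `translate` and `count_matches` are made up for this example; they are not part of BLAST or DSAP.

```python
# Standard genetic code, codons ordered T, C, A, G at each position
# (assumption: the standard table applies to these sequences).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a DNA string codon by codon, starting at the first base."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def count_matches(a: str, b: str) -> int:
    """Count positions where two equal-length sequences carry the same letter."""
    return sum(x == y for x, y in zip(a, b))


# The two DNA sequences from the slide, with the codon spacing removed.
query = "AGGTCGTTACTATCGAGGAGTAGA"
sbjct = "CGTAGCCTTTTGAGTCGATCGCGG"

nt_matches = count_matches(query, sbjct)
protein_matches = count_matches(translate(query), translate(sbjct))

print(f"nucleotide: {nt_matches}/{len(query)} identical")
print(f"protein:    {protein_matches}/{len(translate(query))} identical")
```

Compared base by base, the two DNA strings share only a few positions, yet both translate to the same eight-residue peptide, R S L L S R S R. That is why a protein-level search can find a match that a nucleotide search misses.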
8-1
Clicker Question #1: How many different DNA sequences
can code for the peptide sequence Met-Leu-Cys-Ala?
A) 1
B) 12
C) 36
D) 48
E) 54
NAME           3 Letter       1 Letter       Codons for each Amino Acid
               Abbreviation   Abbreviation   (RNA codons)
Alanine        Ala            A              GCA,GCC,GCG,GCU
Cysteine       Cys            C              UGC,UGU
Histidine      His            H              CAC,CAU
Isoleucine     Ile            I              AUA,AUC,AUU
Lysine         Lys            K              AAA,AAG
Leucine        Leu            L              UUA,UUG,CUA,CUC,CUG,CUU
Methionine     Met            M              AUG
Asparagine     Asn            N              AAC,AAU
Proline        Pro            P              CCA,CCC,CCG,CCU
Glutamine      Gln            Q              CAA,CAG
Number of codons for the conserved region of the protein
3422242264422412263622666262232446622244246216
ITKNYPYYRTADKGWQNSIRHNLSLNRYFIKVPRSQEEPGKGSFWR
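The per-residue codon counts, and the number of DNA sequences that can encode a short peptide such as Met-Leu-Cys-Ala, follow directly from the codon table. The sketch below is a minimal illustration, assuming the standard genetic code; the function names are invented for this example.

```python
from collections import Counter
from math import prod

# One amino-acid letter per codon for the 64 codons of the standard
# genetic code ('*' marks a stop codon). Assumed table, not taken from DSAP.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Residue -> number of codons that encode it, e.g. 'L' -> 6, 'M' -> 1.
DEGENERACY = Counter(AMINO)


def codon_counts(peptide: str) -> list[int]:
    """Number of possible codons at each position of a peptide."""
    return [DEGENERACY[aa] for aa in peptide]


def coding_sequences(peptide: str) -> int:
    """Number of distinct DNA sequences encoding the peptide.

    Each position is chosen independently, so the counts multiply.
    """
    return prod(codon_counts(peptide))


conserved = "ITKNYPYYRTADKGWQNSIRHNLSLNRYFIKVPRSQEEPGKGSFWR"
print("".join(str(n) for n in codon_counts(conserved)))

# Clicker question peptide, in one-letter code: Met-Leu-Cys-Ala.
print(coding_sequences("MLCA"))
```

Because the choices at each position are independent, the total is the product of the per-residue counts, which grows very quickly with peptide length.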
NAME            3 Letter       1 Letter       Codons for each Amino Acid
                Abbreviation   Abbreviation   (RNA codons)
Alanine         Ala            A              GCA,GCC,GCG,GCU
Cysteine        Cys            C              UGC,UGU
Aspartic Acid   Asp            D              GAC,GAU
Glutamic Acid   Glu            E              GAA,GAG
Phenylalanine   Phe            F              UUC,UUU
Glycine         Gly            G              GGA,GGC,GGG,GGU
Histidine       His            H              CAC,CAU
Isoleucine      Ile            I              AUA,AUC,AUU
Lysine          Lys            K              AAA,AAG
Leucine         Leu            L              UUA,UUG,CUA,CUC,CUG,CUU
Methionine      Met            M              AUG
Asparagine      Asn            N              AAC,AAU
Proline         Pro            P              CCA,CCC,CCG,CCU
Glutamine       Gln            Q              CAA,CAG
Arginine        Arg            R              CGA,CGC,CGG,CGU,AGA,AGG
Serine          Ser            S              UCA,UCC,UCG,UCU,AGC,AGU
Threonine       Thr            T              ACA,ACC,ACG,ACU
Valine          Val            V              GUA,GUC,GUG,GUU
Tryptophan      Trp            W              UGG
Tyrosine        Tyr            Y              UAC,UAU
Stop Codons                    .              UAA,UAG,UGA
7.5 x 10^19
8-2
p. 8-2
DSAP BLASTx Page
NCBI BLASTx page
Cropped DNA sequence
p. 7-2
BLASTX Dialog Box
p 8-3
BLASTX of EX1.14
p 8-3
BLASTn and BLASTx of another Landoltia sequence
BLASTn
BLASTx
p 8-4
List of EX1.14 BLASTx matches
p 8-4
Best BLASTx alignment for EX1.14
p 8-5
Low Sequence Complexity Filter
With Filter
>gi|223542822|gb|EEF44358.1| conserved hypothetical protein [Ricinus communis]
Score = 69.7 bits (169), Expect = 3e-10
Identities = 54/174 (31%), Positives = 85/174 (48%), Gaps = 4/174 (2%)

Query  40   LTCLLILQAPSSHAFYLWppfffpspvpDVITVLNQANQFTTLVQLLTETGVATAVNAIS  219
            LT L++L +  + A     P   PS   +V  +L++  QFTT ++LLT T VAT +
Sbjct  9    LTALILLLSLQAQAQNPAAPAPAPSGPLNVTGILDKNGQFTTFIRLLTSTQVATQLEN-Q  67

Query  220  TNGAGPGITLFAPTDAAFAKIPAANLSALNVTQRTSILTLHALTRFYTFAELFVANAALP  399
             N    G T+FAPTD AF  + A  L+ L+  Q+  ++  H   +FYT + L +    +
Sbjct  68   LNSTTEGFTVFAPTDNAFNNLKAGTLNDLSTQQQVQLVLAHITPKFYTLSNLLLVPNPVR  127

Query  400  TLNT---GrsltfstsvtrvttitsPGGRVTTLNFLLYRRFPLTIFPIADVLLP  552
            T  T   G     + +        S G   T +N  + ++FPL ++ +  VLLP
Sbjct  128  TQATGQDGGVFGLNFTGQANQVNVSTGIVETQINNAIRQQFPLALYQVDKVLLP  181

Without Filter
>gi|223542822|gb|EEF44358.1| conserved hypothetical protein [Ricinus communis]
Score = 82.4 bits (202), Expect = 4e-14
Identities = 57/176 (32%), Positives = 89/176 (50%), Gaps = 8/176 (4%)
Frame = +1

Query  40   LTCLLILQAPSSHAFYLWPPFFFPSPVPDVITVLNQANQFTTLVQLLTETGVATAVNAIS  219
            LT L++L +  + A     P   PS   +V  +L++  QFTT ++LLT T VAT +
Sbjct  9    LTALILLLSLQAQAQNPAAPAPAPSGPLNVTGILDKNGQFTTFIRLLTSTQVATQLEN-Q  67

Query  220  TNGAGPGITLFAPTDAAFAKIPAANLSALNVTQRTSILTLHALTRFYTFAELFVANAALP  399
             N    G T+FAPTD AF  + A  L+ L+  Q+  ++  H   +FYT + L +    +
Sbjct  68   LNSTTEGFTVFAPTDNAFNNLKAGTLNDLSTQQQVQLVLAHITPKFYTLSNLLLVPNPVR  127

Query  400  TLNTGR-----SLTFSTSVTRVTTITSPGGRVTTLNFLLYRRFPLTIFPIADVLLP  552
            T  TG+      L F+    +V    S G   T +N  + ++FPL ++ +  VLLP
Sbjct  128  TQATGQDGGVFGLNFTGQANQVN--VSTGIVETQINNAIRQQFPLALYQVDKVLLP  181
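The Identities and Gaps figures in the two report headers can be recounted from the aligned Query and Sbjct rows. The sketch below is a minimal illustration; `alignment_stats` is a made-up helper, not a BLAST function. The rows are typed in from the alignments on this slide, and the comparison ignores case so that the lowercase (filtered) residues are still counted. Positives are left out because they need a substitution matrix.

```python
def alignment_stats(query: str, sbjct: str) -> tuple[int, int, int]:
    """Return (identities, gaps, alignment length) for two aligned rows."""
    if len(query) != len(sbjct):
        raise ValueError("aligned rows must have the same length")
    pairs = list(zip(query.upper(), sbjct.upper()))
    gaps = sum("-" in pair for pair in pairs)
    identities = sum(q == s and q != "-" for q, s in pairs)
    return identities, gaps, len(pairs)


# Query 40-399 / Sbjct 9-127: same residues in both reports.
HEAD_Q = (
    "LTCLLILQAPSSHAFYLWppfffpspvpDVITVLNQANQFTTLVQLLTETGVATAVNAIS"
    "TNGAGPGITLFAPTDAAFAKIPAANLSALNVTQRTSILTLHALTRFYTFAELFVANAALP"
)
HEAD_S = (
    "LTALILLLSLQAQAQNPAAPAPAPSGPLNVTGILDKNGQFTTFIRLLTSTQVATQLEN-Q"
    "LNSTTEGFTVFAPTDNAFNNLKAGTLNDLSTQQQVQLVLAHITPKFYTLSNLLLVPNPVR"
)

# Query 400-552 / Sbjct 128-181: the only block that differs between reports.
with_filter = alignment_stats(
    HEAD_Q + "TLNT---GrsltfstsvtrvttitsPGGRVTTLNFLLYRRFPLTIFPIADVLLP",
    HEAD_S + "TQATGQDGGVFGLNFTGQANQVNVSTGIVETQINNAIRQQFPLALYQVDKVLLP",
)
without_filter = alignment_stats(
    HEAD_Q + "TLNTGR-----SLTFSTSVTRVTTITSPGGRVTTLNFLLYRRFPLTIFPIADVLLP",
    HEAD_S + "TQATGQDGGVFGLNFTGQANQVN--VSTGIVETQINNAIRQQFPLALYQVDKVLLP",
)

print("with filter:   ", with_filter)
print("without filter:", without_filter)
```

Only the last block changes when the filter is switched off: the residues that were masked in lowercase are aligned differently, which changes the gap placement and adds a few identities.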
Answer questions in DSAP
p 8-6
Question: Which of these alignments has a greater biological significance?
A)
B)
What can you conclude about this BLASTX result?
A) It is too short to be significant
B) It does not match anything
C) There is a frame shift in the DNA sequence
D) Your DNA has an exact match
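To see why a frame shift makes a BLASTX match jump from one reading frame to another, it helps to translate a sequence before and after a single extra base. The sketch below is a hypothetical illustration: it reuses the 24-base query from the BLASTN/BLASTX comparison earlier in this chapter, inserts an extra `A` after the third codon (an invented error, not one from the EX1.14 data), and assumes the standard genetic code.

```python
# Standard genetic code, codons ordered T, C, A, G at each position (assumed).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str, offset: int = 0) -> str:
    """Translate DNA starting `offset` bases in (0, 1, 2 = frames +1, +2, +3)."""
    return "".join(
        CODON_TABLE[dna[i:i + 3]] for i in range(offset, len(dna) - 2, 3)
    )


original = "AGGTCGTTACTATCGAGGAGTAGA"
# Hypothetical error: one extra base inserted after the third codon.
mutant = original[:9] + "A" + original[9:]

print("original +1:", translate(original))
for frame in (1, 2, 3):
    print(f"mutant   +{frame}:", translate(mutant, frame - 1))
```

Upstream of the extra base the mutant still reads correctly in frame +1; downstream, the original residues reappear only in frame +2. A BLASTX report for such a sequence would show the same protein matched in two different frames on either side of the error.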
Where is the frameshift most likely to be found?
A) bp 181
B) bp 75
C) bp 227
D) bp 381
E) Cannot tell from the data
Points at which an error can be introduced into the DNA sequence of the clone
DNA
RNA
cDNA: AAAAAAAA / TTTTTTTTT
DS-cDNA: AAAAAAAA / TTTTTTTTT
Cloning
Replication & Purification
Sequencing
Is the frame shift at bp 227 caused by a DNA sequencing error?
A) Yes
B) No
C) Cannot tell from the data
Does this have a frame shift?
Where?
What does this BLASTX report indicate?
A) There are matches to different proteins at the end of the sequence
B) There are matches in one frame to the entire sequence
C) There is a frame shift in the DNA sequence
D) The protein has two different domains
E) Cannot conclude anything
Where is the frame shift?
A) bp 149
B) bp 160
C) bp 458
D) bp 469
E) bp 493
Does this indicate that there is a frame shift in the sequence?
A) Yes
B) No
C) Cannot tell from the data
Frame labels on the figure: +1, +3, +1, +1, Intron, +3
What is the most likely explanation for this result?
A) There is nothing wrong with the alignment.
B) There is an extra or missing base causing a frame shift.
C) There is an unspliced intron in the cDNA.
D) The query has an extra protein region.
E) Answers C or D
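The reasoning behind these frame shift and intron questions can be sketched as a rough rule of thumb. The code below is a hypothetical heuristic, not a DSAP or BLAST feature: the `Hsp` class, the `classify_junction` function, and the 30 bp threshold are all assumptions made for illustration. It handles forward-strand matches only, where the reading frame follows from the 1-based query start (a match starting at query base 40 is in frame +1).

```python
from dataclasses import dataclass


@dataclass
class Hsp:
    """One BLASTX match: query positions in bases, subject positions in residues."""
    q_start: int
    q_end: int
    s_start: int
    s_end: int


def frame_of(q_start: int) -> int:
    """Forward reading frame (+1, +2 or +3) implied by a 1-based query start."""
    return (q_start - 1) % 3 + 1


def classify_junction(a: Hsp, b: Hsp, max_shift_gap: int = 30) -> str:
    """Guess what separates two consecutive matches to the same protein.

    max_shift_gap is an arbitrary cutoff (assumption) for how many unmatched
    query bases may sit between two matches that are otherwise adjacent.
    """
    query_gap = b.q_start - a.q_end - 1      # unmatched query bases between matches
    subject_gap = b.s_start - a.s_end - 1    # unmatched subject residues between matches
    if frame_of(a.q_start) != frame_of(b.q_start) and query_gap <= max_shift_gap:
        return "possible frame shift"
    if query_gap > max_shift_gap and subject_gap <= 0:
        return "possible unspliced intron or extra query region"
    return "no obvious problem"


# First match uses coordinates from the EX1.14 alignment; the other two are invented.
first = Hsp(q_start=40, q_end=219, s_start=9, s_end=67)
shifted = Hsp(q_start=221, q_end=400, s_start=68, s_end=127)    # hypothetical
far_away = Hsp(q_start=520, q_end=699, s_start=68, s_end=127)   # hypothetical

print(classify_junction(first, shifted))
print(classify_junction(first, far_away))
```

When the protein match continues without a break but the frame changes at nearly the same query position, an extra or missing base is the simplest explanation. When the protein match continues but a long stretch of query has no match at all, an unspliced intron or an extra region in the query is more likely.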