WSSP Chapter 7
BLASTN: DNA vs DNA searches
atttaccgtg
tgatgagtat
ccggaaatag
acttcaatga
ttgattaata
ttggattgaa
gatacagttt
gatcccgatc
ttggttctaa
tttccatttc
attatcttgc
tccgtattaa
atgattgctt
gcattcgaat
tgtcccagtt
atgagccagc
taacgaacgg
caatattttc
gcgtacccgt
tttaattttc
© 2014 WSSP
DSAP: BLASTn Page
p. 7-1
NCBI BLAST Home Page
p. 7-1
NCBI BLASTN search page
p. 7-2
Copy sequence from DSAP or wave form program
p. 7-2
Choose a database (nr/nt or est)
p. 7-3
Search options (Use defaults)
p. 7-4
BLASTN progress report (search may take a few minutes)
p. 7-5
Format options (use defaults)
p. 7-5
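The web-form steps above (paste the sequence, choose a database, keep the default options) can also be expressed as request parameters. The sketch below only builds such a request and never contacts NCBI. The parameter names CMD, PROGRAM, DATABASE and QUERY are taken from the public NCBI BLAST URL API as understood here and should be checked against the current NCBI documentation, as should the database names on offer. The helper name build_blastn_request is hypothetical, and the example bases are simply the first three 10-base groups from the title slide.

```python
from urllib.parse import urlencode

# Endpoint of the NCBI BLAST URL API (assumption: unchanged since writing).
BLAST_URL = "https://blast.ncbi.nlm.nih.gov/Blast.cgi"


def build_blastn_request(sequence, database="nt"):
    """Return (url, encoded_body) for submitting a BLASTN search.

    Hypothetical helper: it prepares the request but does not send it.
    """
    # Remove whitespace and line breaks left over from copy and paste.
    cleaned = "".join(sequence.split()).upper()
    if not cleaned or set(cleaned) - set("ACGTN"):
        raise ValueError("expected a DNA sequence made of A, C, G, T or N")
    params = {
        "CMD": "Put",          # submit a new search
        "PROGRAM": "blastn",   # DNA query against a DNA database
        "DATABASE": database,  # "nt" here; other names are not verified
        "QUERY": cleaned,
    }
    return BLAST_URL, urlencode(params)


title_slide_bases = "atttaccgtg tgatgagtat ccggaaatag"
url, body = build_blastn_request(title_slide_bases)
print(url)
print(body)
```

Leaving every other option unset mirrors the "use defaults" advice on the slides.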
EX1.14 BLASTN nr/nt database
p. 7-6
Graphic report of EX2.09
p. 7-7
BLASTN list of matches for EX1.14
p. 7-7
EX2.09 BLASTN
p. 7-9
Clicker Question: Which match is the most meaningful?
A)
B)
C)
D)
E) None
Clicker Question: Which part of the gene appears to be the most conserved?
A) Bp 1-100
B) Bp 100-300
C) Bp 300-500
D) All
E) None
Clicker Question: The entire insert of a clone was sequenced and a BLASTN search was performed. Are these matches likely to be significant?
A) Yes
B) No
C) Cannot tell from the data
Question: Which of the following E values indicates the best match?
A) 1e-10
B) 5e-91
C) 5.3
D) 0.0
E) Cannot tell from these data
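One way to reason about the question above is to treat each E value as a number and sort: the E value estimates how many matches this good would turn up by chance, so smaller is better. The sketch below ranks the four numeric choices. The function name best_match is hypothetical, and the reading of 0.0 as "too small to display" is an assumption about how BLAST reports very small values.

```python
def best_match(e_values):
    """Return the E value (as written in the report) that is numerically smallest."""
    return min(e_values, key=float)


# The numeric answer choices from the slide, keyed by their letters.
choices = {"A": "1e-10", "B": "5e-91", "C": "5.3", "D": "0.0"}

# Sort from best (smallest E value) to worst (largest E value).
ranked = sorted(choices.items(), key=lambda item: float(item[1]))
for letter, value in ranked:
    print(letter, value)

print("best:", best_match(choices.values()))
```

An E value near or above 1, such as 5.3, means several equally good matches are expected by chance alone.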
Best match to EX1.14
[Figure: annotated alignment; labels mark the length of the sequence, our sequence, the database sequence, a mismatch, and a match]
p. 7-9
Perfect, but short, matches are not usually meaningful
>gi|14250883|emb|AL583809.3|CNS07EFY Human chromosome 14 DNA
sequence BAC R-736L22 of library RPCI-11 from chromosome 14 of
Homo sapiens (Human), complete sequence
Score = 40.1 bits (20), Expect = 4.6 Identities = 20/20 (100%)
Query: 189   ttttctgaatattcataata 208
             ||||||||||||||||||||
Sbjct: 60645 ttttctgaatattcataata 60626
p. 7-11
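Why a perfect 20-base match can still carry an Expect of 4.6 becomes clearer from the usual relation between bit score and E value, E = (search space) x 2^(-bit score). The sketch below assumes that relation and back-calculates the search space implied by the hit above. The function names are hypothetical, and the implied search space is only an estimate for that one search; hits shown on other slides came from other searches, so their reported E values will not match these numbers exactly.

```python
def expect_from_bits(bit_score, search_space):
    """E value under the assumed relation E = search_space * 2**(-bit_score)."""
    return search_space * 2.0 ** (-bit_score)


def search_space_from_hit(bit_score, expect):
    """Invert the same relation to estimate the effective search space."""
    return expect * 2.0 ** bit_score


# Values reported for the 20/20 match on the slide.
space = search_space_from_hit(40.1, 4.6)
print(f"implied search space: {space:.2e}")

# In the same search space, higher bit scores give far smaller E values.
for bits in (40.1, 71.6, 219.0):
    print(f"{bits:6.1f} bits -> E = {expect_from_bits(bits, space):.1e}")
```

Each extra bit of score halves the E value, which is why long alignments with a few mismatches can be far more significant than short perfect ones.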
Examine the best alignments:
Are they significant?
p. 7-9
Mismatches
i) Bad sequence on our part
ii) Bad sequence on their part
iii) Differences in the sequence of the two organisms
        C   R   E   L   L   I   L   D   A
Query   TGT CGT GAA CTC CTA ATT CTC GAC GCC
        ||| ||| ||| ||  ||  ||  ||  ||  ||
Sbjct   TGT CGT GAA CTT CTG ATC CTT GAT GCA
        C   R   E   L   L   I   L   D   A
Wobble position: same amino acid, but different codon (degenerate code)
Query: 383  AGCGTTGCCGTTCGTCAGCTTGATGTTAAGCTGGGCAGCGCGCTCGACGATTCCTTTGCG 324
            |||||| |||||||||||||||||||| | ||| || ||||||||||||||||| |||||
Sbjct: 6152 AGCGTTTCCGTTCGTCAGCTTGATGTTCAACTGAGCGGCGCGCTCGACGATTCCCTTGCG 6211
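The wobble pattern in the codon-by-codon comparison can be checked mechanically: translate each codon pair and record which codon positions differ. The sketch below is a minimal illustration, assuming the standard genetic code (built from the usual TCAG ordering). The function name compare_codons is hypothetical, and the codons are the Query and Sbjct codons from the slide.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def compare_codons(query_codons, sbjct_codons):
    """For each codon pair, return both amino acids and the differing positions (1-3)."""
    rows = []
    for q, s in zip(query_codons, sbjct_codons):
        diff_positions = [i + 1 for i in range(3) if q[i] != s[i]]
        rows.append((q, s, CODON_TABLE[q], CODON_TABLE[s], diff_positions))
    return rows


query = "TGT CGT GAA CTC CTA ATT CTC GAC GCC".split()
sbjct = "TGT CGT GAA CTT CTG ATC CTT GAT GCA".split()

for q, s, aa_q, aa_s, diffs in compare_codons(query, sbjct):
    if not diffs:
        kind = "identical"
    elif aa_q == aa_s:
        kind = "synonymous"
    else:
        kind = "amino acid change"
    print(q, s, aa_q, aa_s, diffs, kind)
```

Mismatches that fall on every third base of an alignment, as here, are a hint that the region codes for protein.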
Small gaps alter the reading frame of the protein
Query translation: C R R T P D P *
Query   TGTCGT-CGAACTCCTGATCCTTGA
        |||||| ||||||||||||||||||
Sbjct   TGTCGTCCGAACTCCTGATCCTTGA
Sbjct translation: C R E L L I L D
p. 7-13
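The effect of a one-base gap on the reading frame can be reproduced with a small translation routine. The sketch below assumes the standard genetic code and uses the hypothetical function name translate. It translates the gapped Query row from the slide, then takes the in-frame Query codons from the wobble slide, deletes a single base, and translates again.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate from the first base, ignoring gap characters and any trailing partial codon."""
    dna = dna.upper().replace("-", "")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


# Query row from the slide; the gap character is dropped before translating.
gapped_query = "TGTCGT-CGAACTCCTGATCCTTGA"
print(translate(gapped_query))

# In-frame codons from the wobble slide, then the same sequence missing one base.
in_frame = "TGTCGTGAACTCCTAATTCTCGACGCC"
one_base_deleted = in_frame[:6] + in_frame[7:]
print(translate(in_frame))
print(translate(one_base_deleted))
```

Every amino acid downstream of the deletion changes, and a premature stop codon appears, which is why a small gap inside a coding region usually signals a sequencing error rather than a real difference.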
An example of a match with and without gaps.
Query:  179 TTCGAGCTACCAGATGATC-GATTGGAACAT-T-C--TGTCATTG-AC-CTTC-AGGTAA 230
            |||||||  || | | ||  ||||  || || | |  | | |||  |  |||| |||| |
Sbjct: 4684 TTCGAGCG-CC-GTTAATATGATTACAATATCTACAATATTATTATATGCTTCCAGGTGA 4741

Query:  231 TCAACCATGACCGTGTCAACCGAAACGACGTTATCGGCCGTGCACTATTGAACATGGAGG 290
            |||| ||||||||||| ||||| || || || || |||||||| ||  | || ||||| |
Sbjct: 4742 TCAATCATGACCGTGTTAACCGTAATGATGTAATTGGCCGTGCCCTTCTTAATATGGAAG 4801
p. 7-13
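The contrast between the gapped and ungapped stretches can be put into numbers by counting identities and gap columns in each block. The sketch below is a minimal illustration using the first (gapped) block above; the function names alignment_stats and match_line are hypothetical.

```python
def alignment_stats(query_row, sbjct_row):
    """Return (identities, gap_columns, alignment_length) for two aligned rows."""
    if len(query_row) != len(sbjct_row):
        raise ValueError("aligned rows must have the same length")
    pairs = list(zip(query_row, sbjct_row))
    identities = sum(q == s and q != "-" for q, s in pairs)
    gap_columns = sum(q == "-" or s == "-" for q, s in pairs)
    return identities, gap_columns, len(pairs)


def match_line(query_row, sbjct_row):
    """Rebuild the row of '|' marks that BLAST prints between the two sequences."""
    return "".join("|" if q == s and q != "-" else " " for q, s in zip(query_row, sbjct_row))


# First block of the alignment (Query 179-230, Sbjct 4684-4741).
query_row = "TTCGAGCTACCAGATGATC-GATTGGAACAT-T-C--TGTCATTG-AC-CTTC-AGGTAA"
sbjct_row = "TTCGAGCG-CC-GTTAATATGATTACAATATCTACAATATTATTATATGCTTCCAGGTGA"

identities, gap_columns, length = alignment_stats(query_row, sbjct_row)
print(f"Identities = {identities}/{length}, Gaps = {gap_columns}/{length}")
print(query_row)
print(match_line(query_row, sbjct_row))
print(sbjct_row)
```

A block that needs this many gaps to line up is much weaker evidence of similarity than the gap-free block that follows it.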
Alignment of the third best match to EX1.14
>gi|241990611|dbj|AK330768.1| Triticum aestivum cDNA, clone: SET5_E05, cultivar:
Chinese Spring Length=650
Score = 219 bits (242), Expect = 2e-53
Identities = 211/271 (77%), Gaps = 0/271 (0%)
Query  10   GATGTTGGAAGGGAGGGCGAGAGTAGAAGACACCGACATGCCGAGGAAGATGCAGGCGGA  69
            |||| ||||||||| |||||  || || |||||||||||||||   |||||||||  | |
Sbjct  78   GATGCTGGAAGGGAAGGCGACGGTGGAGGACACCGACATGCCGGCCAAGATGCAGCTGCA  137

Query  70   GGCCATGAACGCCGCCTCTCACGCGCTCGATCTGTTCGACGTCGCGGACTGCAAGAGCCT  129
            |||||     || || ||    |||||||| |  |||||||||   ||||||  |||| |
Sbjct  138  GGCCACCTCGGCGGCGTCCAGGGCGCTCGAACGCTTCGACGTCCTCGACTGCCGGAGCAT  197

Query  130  CGCCGCGCATATCAAGAAGGAATTTGATAAGATCTACGGTCCGGGATGGCAGTGCGTCGT  189
            ||| ||||| ||||||||||| || || | |||| |||| ||||| ||||||||||| ||
Sbjct  198  CGCGGCGCACATCAAGAAGGAGTTCGACACGATCCACGGCCCGGGGTGGCAGTGCGTGGT  257

Query  190  CGGCTCCAGCTTCGGCTGTTTCTTCACTCACAAGAAAGGCAGCTTCATCTACTTCCGCCT  249
             |||| |||||||||||| | |||||| ||||  || || |||||||| ||||||   ||
Sbjct  258  GGGCTGCAGCTTCGGCTGCTACTTCACGCACAGCAAGGGGAGCTTCATATACTTCAAGCT  317

Query  250  GGAGACGCTCCACTTCCTCATCTTCAAAGGC  280
             ||| ||||||  |||||| |||||||||||
Sbjct  318  CGAGTCGCTCCGGTTCCTCGTCTTCAAAGGC  348
p. 7-14
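When filling in tables from several hits, it can help to pull the numbers out of the summary lines rather than retyping them. The sketch below parses the Score, Expect, Identities and Gaps fields as they appear on these slides. The regular expression and the function name parse_hit_summary are hypothetical and were written only against the layout shown here; other BLAST output formats may differ.

```python
import re

# Pattern for the summary text shown above each alignment on these slides.
HIT_SUMMARY = re.compile(
    r"Score\s*=\s*(?P<bits>[\d.]+)\s*bits\s*\((?P<raw>\d+)\),\s*"
    r"Expect\s*=\s*(?P<expect>[\d.eE+-]+),?\s*"
    r"Identities\s*=\s*(?P<ident>\d+)/(?P<length>\d+)\s*\((?P<pct>\d+)%\)"
    r"(?:,\s*Gaps\s*=\s*(?P<gaps>\d+)/\d+)?"
)


def parse_hit_summary(text):
    """Return the numeric fields of one hit summary as a dictionary."""
    match = HIT_SUMMARY.search(text)
    if match is None:
        raise ValueError("no hit summary found")
    return {
        "bits": float(match["bits"]),
        "expect": float(match["expect"]),
        "identities": int(match["ident"]),
        "length": int(match["length"]),
        "percent": int(match["pct"]),
        "gaps": int(match["gaps"]) if match["gaps"] is not None else None,
    }


summary = """Score = 219 bits (242), Expect = 2e-53
Identities = 211/271 (77%), Gaps = 0/271 (0%)"""
hit = parse_hit_summary(summary)
print(hit)
# The percentage on the slide agrees with truncating, not rounding, 211/271.
print(int(100 * hit["identities"] / hit["length"]))
```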
Alignments near the end of the EX1.13
>gi|254826767|ref|NG_012498.1| Homo sapiens glypican 4 (GPC4),
RefSeqGene on chromosome X Length=121142
Score = 71.6 bits (78), Expect = 6e-09
Identities = 42/44 (95%), Gaps = 0/44 (0%)
Query  665    CTAGCTTTTCTTAACaaaaaaaaaaaaaaaaaaaaaaaaaaaaa  708
              || ||||||||||| |||||||||||||||||||||||||||||
Sbjct  72886  CTTGCTTTTCTTAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA  72929
p. 7-14
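A quick check for matches like the one above is to ask how much of the aligned query is a single repeated base. The sketch below finds the longest homopolymer run in the Query row. The function name longest_homopolymer is hypothetical, and the Query row is rebuilt from the slide as 15 bases followed by 29 a's (44 columns, matching Identities = 42/44).

```python
import re


def longest_homopolymer(seq):
    """Return (base, run_length) for the longest run of one repeated base."""
    if not seq:
        raise ValueError("empty sequence")
    runs = re.finditer(r"(.)\1*", seq.upper())
    best = max(runs, key=lambda m: len(m.group(0)))
    return best.group(1), len(best.group(0))


# Query row of the alignment above: 15 bases, then the run of 29 a's.
query_row = "CTAGCTTTTCTTAAC" + "a" * 29

base, run = longest_homopolymer(query_row)
fraction = run / len(query_row)
print(f"longest run: {run} x {base} ({fraction:.0%} of the aligned query)")
```

When most of a match is a poly(A) run, as at the end of many cDNA clones, a small E value says little about whether the two genes are related.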
Question: Is this match biologically significant?
A) Yes
B) No
C) Cannot tell from the data
Question: Is this match biologically significant?
A) Yes
B) No
C) Cannot tell from the data
Clicker Question: Is this match likely in a protein coding region?
A) Yes
B) No
C) Cannot tell from the data
Clicker Question: What is the likely explanation for the gap?
A) Sequence error in cDNA
B) Error in making the cDNA
C) Start of an intron region
D) Cannot tell from the data
E) A, B or C
Clicker Question: Is this match likely in a protein coding region?
A) Yes
B) No
C) Cannot tell from the data
Fill in the table listing the best matches from three different organisms.
List Landoltia if there is a match
p. 7-15
Use the clone report to obtain more information about the gene
p. 7-15
Is this a significant match?
a) Yes
b) No
p. 7-16
3) Perform a BLASTn of the est database
Change the database
p. 7-17
BLASTn report of the EX1.14 search of the est database
p. 7-17
Alignment of the best match to EX1.13 from the est search
>gi|198335694|gb|GD004539.1| CCHY28888.g1 CCHY Panicum virgatum callus (N) Panicum virgatum
cDNA clone CCHY28888 3', mRNA sequence. Length=624
Score = 246 bits (272), Expect = 1e-61
Identities = 226/286 (79%), Gaps = 0/286 (0%)
Strand=Plus/Minus
Query  3    GAGAGAAGATGTTGGAAGGGAGGGCGAGAGTAGAAGACACCGACATGCCGAGGAAGATGC  62
            |||| |  ||| ||||||||| |||||  || || ||||| |||||||||  ||||||||
Sbjct  527  GAGACACCATGCTGGAAGGGAAGGCGATGGTGGAGGACACGGACATGCCGGCGAAGATGC  468

Query  63   AGGCGGAGGCCATGAACGCCGCCTCTCACGCGCTCGATCTGTTCGACGTCGCGGACTGCA  122
            ||||| |||| |||   || || ||    || ||||| |  |||||||||   ||||||
Sbjct  467  AGGCGCAGGCGATGGCGGCGGCGTCCAGGGCCCTCGACCGCTTCGACGTCCTCGACTGCC  408

Query  123  AGAGCCTCGCCGCGCATATCAAGAAGGAATTTGATAAGATCTACGGTCCGGGATGGCAGT  182
             |||| |||| ||||| ||||||||||| ||||| | |||| |||| || || ||||| |
Sbjct  407  GGAGCATCGCGGCGCACATCAAGAAGGAGTTTGACACGATCCACGGCCCCGGGTGGCAAT  348

Query  183  GCGTCGTCGGCTCCAGCTTCGGCTGTTTCTTCACTCACAAGAAAGGCAGCTTCATCTACT  242
            |||| || ||||||||||||||||| | |||||| ||||  || || |||||||||||||
Sbjct  347  GCGTGGTGGGCTCCAGCTTCGGCTGCTACTTCACGCACAGCAAGGGGAGCTTCATCTACT  288

Query  243  TCCGCCTGGAGACGCTCCACTTCCTCATCTTCAAAGGCGCGGCCGC  288
            |||| || ||| |||||   ||||||||||||||||| ||||| ||
Sbjct  287  TCCGGCTCGAGTCGCTCAGGTTCCTCATCTTCAAAGGGGCGGCAGC  242
p. 7-17
Fill out the DSAP table of the BLASTn search of the est database
p. 7-18
Open Question: Why are there differences in the sequences?
Query  61     CAAGGTCTAAGTACTGAAAAGGAAAGTCTACTAATTACAAAGAAGTTATTGTTTGTACCT  120
              |||||||||||||||||||||||||||| |||||||||||||||||||||||||||||||
Sbjct  13166  CAAGGTCTAAGTACTGAAAAGGAAAGTCCACTAATTACAAAGAAGTTATTGTTTGTACCT  13107

Query  121    TTTGTATCAGGGTTTATTAAATTTCAATCTTTATTGCTGAATCCCGAAACAAGGTGATCT  180
              |||||||||||||||||||||||| |||||| ||||||||||||||||||||||||||||
Sbjct  13106  TTTGTATCAGGGTTTATTAAATTTTAATCTTCATTGCTGAATCCCGAAACAAGGTGATCT  13047
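To follow up on individual differences, for example by looking at the waveform at that base, it helps to know the exact coordinate of each mismatch in both sequences. The sketch below walks along the first block above and reports each mismatch with its Query and Sbjct positions. The Sbjct numbering decreases because the database sequence is shown on its opposite strand. The function name mismatch_coordinates is hypothetical.

```python
def mismatch_coordinates(query_row, sbjct_row, query_start, sbjct_start, sbjct_step):
    """List (query_pos, query_base, sbjct_pos, sbjct_base) for each mismatched column.

    sbjct_step is +1 when Sbjct coordinates increase along the row and -1 when
    they decrease, as in the block shown here.
    """
    found = []
    q_pos, s_pos = query_start, sbjct_start
    for q, s in zip(query_row, sbjct_row):
        if q != "-" and s != "-" and q != s:
            found.append((q_pos, q, s_pos, s))
        # Gap characters do not consume a coordinate in their own sequence.
        if q != "-":
            q_pos += 1
        if s != "-":
            s_pos += sbjct_step
    return found


# First block of the alignment (Query 61-120, Sbjct 13166-13107).
query_row = "CAAGGTCTAAGTACTGAAAAGGAAAGTCTACTAATTACAAAGAAGTTATTGTTTGTACCT"
sbjct_row = "CAAGGTCTAAGTACTGAAAAGGAAAGTCCACTAATTACAAAGAAGTTATTGTTTGTACCT"

differences = mismatch_coordinates(query_row, sbjct_row, 61, 13166, -1)
for q_pos, q_base, s_pos, s_base in differences:
    print(f"Query {q_pos} {q_base}  vs  Sbjct {s_pos} {s_base}")
```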
Q5. BLASTn Analysis: Is your cDNA similar to genes in other organisms?
p. 7-16
Q6. BLASTn Analysis: Is your cDNA similar to genes in different kingdoms?
i.e., are there any matches to organisms from the eubacteria, archaebacteria, protist, fungi, or animal kingdoms, or are they all matches to other plants?
p. 7-16
Is the sequence found in many other organisms?