WSSP Chapter 7
BLASTN: DNA vs DNA searches

atttaccgtg tgatgagtat ccggaaatag acttcaatga ttgattaata ttggattgaa gatacagttt gatcccgatc ttggttctaa tttccatttc attatcttgc tccgtattaa atgattgctt gcattcgaat tgtcccagtt atgagccagc taacgaacgg caatattttc gcgtacccgt tttaattttc

© 2014 WSSP

DSAP: BLASTn Page (p. 7-1)
NCBI BLAST Home Page (p. 7-1)
NCBI BLASTN search page (p. 7-2)
Copy sequence from DSAP or wave form program (p. 7-2)
Choose a database (nr/nt or est) (p. 7-3)
Search options (use defaults) (p. 7-4)
BLASTN progress report (search may take a few minutes) (p. 7-5)
Format options (use defaults) (p. 7-5)
EX1.14 BLASTN nr/nt database (p. 7-6)
Graphic report of EX2.09 (p. 7-7)
BLASTN list of matches for EX1.14 (p. 7-7)
EX2.09 BLASTN (p. 7-9)

Clicker Question: Which match is the most meaningful?
A)  B)  C)  D)  E) None

Clicker Question: Which part of the gene appears to be the most conserved?
A) Bp 1-100  B) Bp 100-300  C) Bp 300-500  D) All  E) None

Clicker Question: The entire insert of a clone was sequenced and a BLASTN search was performed. Are these matches likely to be significant?
A) Yes  B) No  C) Cannot tell from data

Question: Which of the following E values indicates the best match?
A) 1e-10  B) 5e-91  C) 5.3  D) 0.0  E) Cannot tell from this data

Best match to EX1.14 (p. 7-9)
Figure labels: Length of sequence; Our Seq.; Database Seq.; Mismatch; Match

Perfect, but short, matches are not usually meaningful (p. 7-11)

>gi|14250883|emb|AL583809.3|CNS07EFY Human chromosome 14 DNA sequence BAC R-736L22 of library RPCI-11 from chromosome 14 of Homo sapiens (Human), complete sequence
Score = 40.1 bits (20), Expect = 4.6
Identities = 20/20 (100%)

Query: 189   ttttctgaatattcataata 208
             ||||||||||||||||||||
Sbjct: 60645 ttttctgaatattcataata 60626

Examine the best alignments: Are they significant? (p. 7-9)
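As a rough illustration of the E-value question and the short perfect match above, the following Python sketch ranks the four answer choices by E-value and applies a simple cutoff to the 20/20 hit. The function names and the `1e-5` cutoff are assumptions made for this example; they are not values given in the slides or rules used by BLAST itself.

```python
# E-value answer choices copied from the question slide.
choices = {"A": 1e-10, "B": 5e-91, "C": 5.3, "D": 0.0}


def best_by_evalue(evalues):
    """Return the label with the smallest E-value.

    A smaller E-value means fewer matches of that quality are
    expected by chance, so the smallest value is the best match.
    """
    return min(evalues, key=evalues.get)


def looks_significant(evalue, cutoff=1e-5):
    """Rule-of-thumb check; the cutoff is an assumed value."""
    return evalue <= cutoff


def percent_identity(identical, aligned):
    """Whole-number percent identity over the aligned length."""
    return 100 * identical // aligned


print(best_by_evalue(choices))      # D
print(percent_identity(20, 20))     # 100
print(looks_significant(4.6))       # False
```

The 20-base hit to human chromosome 14 is 100% identical yet has Expect = 4.6, so under this assumed cutoff it would not be treated as meaningful, which matches the point of the slide.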

Mismatches
i) Bad sequence on our part
ii) Bad sequence on their part
iii) Differences in the sequence of the two organisms

Query              Sbjct
C   TGT   |||   TGT   C
R   CGT   |||   CGT   R
E   GAA   |||   GAA   E
L   CTC   ||    CTT   L
L   CTA   ||    CTG   L
I   ATT   ||    ATC   I
L   CTC   ||    CTT   L
D   GAC   ||    GAT   D
A   GCC   ||    GCA   A

Wobble position: same amino acid, but different codon (degenerate code)

Query: 383  AGCGTTGCCGTTCGTCAGCTTGATGTTAAGCTGGGCAGCGCGCTCGACGATTCCTTTGCG 324
            |||||| |||||||||||||||||||| | ||| || ||||||||||||||||| |||||
Sbjct: 6152 AGCGTTTCCGTTCGTCAGCTTGATGTTCAACTGAGCGGCGCGCTCGACGATTCCCTTGCG 6211

Small gaps alter the reading frame of the protein (p. 7-13)

Query  C R R T P D P *
       TGTCGT-CGAACTCCTGATCCTTGA
       |||||| ||||||||||||||||||
Sbjct  TGTCGTCCGAACTCCTGATCCTTGA
       C R E L L I L D

An example of a match with and without gaps (p. 7-13)

Query: 179  TTCGAGCTACCAGATGATC-GATTGGAACAT-T-C--TGTCATTG-AC-CTTC-AGGTAA 230
            |||||||  || | | ||  ||||  || || | |  | | |||  |  |||| |||| |
Sbjct: 4684 TTCGAGCG-CC-GTTAATATGATTACAATATCTACAATATTATTATATGCTTCCAGGTGA 4741

Query: 231  TCAACCATGACCGTGTCAACCGAAACGACGTTATCGGCCGTGCACTATTGAACATGGAGG 290
            |||| ||||||||||| ||||| || || || || |||||||| ||  | || ||||| |
Sbjct: 4742 TCAATCATGACCGTGTTAACCGTAATGATGTAATTGGCCGTGCCCTTCTTAATATGGAAG 4801

Alignment of the third best match to EX1.14 (p. 7-14)

>gi|241990611|dbj|AK330768.1| Triticum aestivum cDNA, clone: SET5_E05, cultivar: Chinese Spring
Length=650
Score = 219 bits (242), Expect = 2e-53
Identities = 211/271 (77%), Gaps = 0/271 (0%)

Query  10   GATGTTGGAAGGGAGGGCGAGAGTAGAAGACACCGACATGCCGAGGAAGATGCAGGCGGA 69
            |||| ||||||||| |||||  || || |||||||||||||||   |||||||||  | |
Sbjct  78   GATGCTGGAAGGGAAGGCGACGGTGGAGGACACCGACATGCCGGCCAAGATGCAGCTGCA 137

Query  70   GGCCATGAACGCCGCCTCTCACGCGCTCGATCTGTTCGACGTCGCGGACTGCAAGAGCCT 129
            |||||     || || ||    |||||||| |  |||||||||   ||||||  |||| |
Sbjct  138  GGCCACCTCGGCGGCGTCCAGGGCGCTCGAACGCTTCGACGTCCTCGACTGCCGGAGCAT 197

Query  130  CGCCGCGCATATCAAGAAGGAATTTGATAAGATCTACGGTCCGGGATGGCAGTGCGTCGT 189
            ||| ||||| ||||||||||| || || | |||| |||| ||||| ||||||||||| ||
Sbjct  198  CGCGGCGCACATCAAGAAGGAGTTCGACACGATCCACGGCCCGGGGTGGCAGTGCGTGGT 257

Query  190  CGGCTCCAGCTTCGGCTGTTTCTTCACTCACAAGAAAGGCAGCTTCATCTACTTCCGCCT 249
             |||| |||||||||||| | |||||| ||||  || || |||||||| ||||||   ||
Sbjct  258  GGGCTGCAGCTTCGGCTGCTACTTCACGCACAGCAAGGGGAGCTTCATATACTTCAAGCT 317

Query  250  GGAGACGCTCCACTTCCTCATCTTCAAAGGC 280
             ||| ||||||  |||||| |||||||||||
Sbjct  318  CGAGTCGCTCCGGTTCCTCGTCTTCAAAGGC 348

Alignments near the end of EX1.13 (p. 7-14)

>gi|254826767|ref|NG_012498.1| Homo sapiens glypican 4 (GPC4), RefSeqGene on chromosome X
Length=121142
Score = 71.6 bits (78), Expect = 6e-09
Identities = 42/44 (95%), Gaps = 0/44 (0%)

Query  665   CTAGCTTTTCTTAACaaaaaaaaaaaaaaaaaaaaaaaaaaaaa 708
             || ||||||||||| |||||||||||||||||||||||||||||
Sbjct  72886 CTTGCTTTTCTTAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA 72929

Question: Is this match biologically significant?
A) Yes  B) No  C) Cannot tell from data

Question: Is this match biologically significant?
A) Yes  B) No  C) Cannot tell from data

Clicker Question: Is this match likely in a protein coding region?
A) Yes  B) No  C) Cannot tell from data

Clicker Question: What is the likely explanation for the gap?
A) Sequence error in cDNA  B) Error in making the cDNA  C) Start of an intron region  D) Cannot tell from data  E) A, B or C

Clicker Question: Is this match likely in a protein coding region?
A) Yes  B) No  C) Cannot tell from data

Fill in the table listing the best matches from three different organisms. List Landoltia if there is a match. (p. 7-15)

Use the clone report to obtain more information about the gene (p. 7-15)

Is this a significant match? (p. 7-16)
a) Yes  b) No

3) Perform a BLASTn of the est database: change the database (p. 7-17)

BLASTn report of the EX1.14 search of the est database (p. 7-17)

Alignment of the best match to EX1.13 from the est search (p. 7-17)

>gi|198335694|gb|GD004539.1| CCHY28888.g1 CCHY Panicum virgatum callus (N) Panicum virgatum cDNA clone CCHY28888 3', mRNA sequence.
Length=624
Score = 246 bits (272), Expect = 1e-61
Identities = 226/286 (79%), Gaps = 0/286 (0%)
Strand=Plus/Minus

Query  3    GAGAGAAGATGTTGGAAGGGAGGGCGAGAGTAGAAGACACCGACATGCCGAGGAAGATGC 62
            |||| |  ||| ||||||||| |||||  || || ||||| |||||||||  ||||||||
Sbjct  527  GAGACACCATGCTGGAAGGGAAGGCGATGGTGGAGGACACGGACATGCCGGCGAAGATGC 468

Query  63   AGGCGGAGGCCATGAACGCCGCCTCTCACGCGCTCGATCTGTTCGACGTCGCGGACTGCA 122
            ||||| |||| |||   || || ||    || ||||| |  |||||||||   ||||||
Sbjct  467  AGGCGCAGGCGATGGCGGCGGCGTCCAGGGCCCTCGACCGCTTCGACGTCCTCGACTGCC 408

Query  123  AGAGCCTCGCCGCGCATATCAAGAAGGAATTTGATAAGATCTACGGTCCGGGATGGCAGT 182
             |||| |||| ||||| ||||||||||| ||||| | |||| |||| || || ||||| |
Sbjct  407  GGAGCATCGCGGCGCACATCAAGAAGGAGTTTGACACGATCCACGGCCCCGGGTGGCAAT 348

Query  183  GCGTCGTCGGCTCCAGCTTCGGCTGTTTCTTCACTCACAAGAAAGGCAGCTTCATCTACT 242
            |||| || ||||||||||||||||| | |||||| ||||  || || |||||||||||||
Sbjct  347  GCGTGGTGGGCTCCAGCTTCGGCTGCTACTTCACGCACAGCAAGGGGAGCTTCATCTACT 288

Query  243  TCCGCCTGGAGACGCTCCACTTCCTCATCTTCAAAGGCGCGGCCGC 288
            |||| || ||| |||||   ||||||||||||||||| ||||| ||
Sbjct  287  TCCGGCTCGAGTCGCTCAGGTTCCTCATCTTCAAAGGGGCGGCAGC 242

Fill out the DSAP table of the BLASTn search of the est database (p. 7-18)

Open Question: Why are there differences in the sequences?

Query  61    CAAGGTCTAAGTACTGAAAAGGAAAGTCTACTAATTACAAAGAAGTTATTGTTTGTACCT 120
             |||||||||||||||||||||||||||| |||||||||||||||||||||||||||||||
Sbjct  13166 CAAGGTCTAAGTACTGAAAAGGAAAGTCCACTAATTACAAAGAAGTTATTGTTTGTACCT 13107

Query  121   TTTGTATCAGGGTTTATTAAATTTCAATCTTTATTGCTGAATCCCGAAACAAGGTGATCT 180
             |||||||||||||||||||||||| |||||| ||||||||||||||||||||||||||||
Sbjct  13106 TTTGTATCAGGGTTTATTAAATTTTAATCTTCATTGCTGAATCCCGAAACAAGGTGATCT 13047

Q5. BLASTn Analysis: Is your cDNA similar to genes in other organisms? (p. 7-16)

Q6. BLASTn Analysis: Is your cDNA similar to genes in different kingdoms? That is, are there any matches to organisms from the eubacteria, archaebacteria, protist, fungi, or animal kingdoms, or are they all matches to other plants? (p. 7-16)

Is the sequence found in many other organisms?
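
The wobble-position slide earlier in this deck lists nine codon pairs that differ in DNA sequence yet encode the same amino acids. A minimal Python sketch of that idea follows, assuming the standard genetic code; the function names are hypothetical and chosen only for this example. It translates both sequences and reports which codon position each mismatch falls on.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(seq):
    """Translate a DNA sequence in frame 1, ignoring a trailing partial codon."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


def mismatch_codon_positions(query, sbjct):
    """Return the codon position (1, 2 or 3) of every mismatched base."""
    return [i % 3 + 1 for i, (q, s) in enumerate(zip(query, sbjct)) if q != s]


# The nine codons from the wobble slide, joined into one sequence each.
query = "TGTCGTGAACTCCTAATTCTCGACGCC"
sbjct = "TGTCGTGAACTTCTGATCCTTGATGCA"

print(translate(query))                        # CRELLILDA
print(translate(sbjct))                        # CRELLILDA
print(mismatch_codon_positions(query, sbjct))  # [3, 3, 3, 3, 3, 3]
```

All six mismatches land on the third base of a codon and the two translations are identical, which is the pattern expected when sequence differences between organisms fall in a protein coding region.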