WSSP Chapter 8
BLASTX: Translated DNA vs Protein Searches

atttaccgtg tgatgagtat ccggaaatag acttcaatga ttgattaata ttggattgaa gatacagttt gatcccgatc ttggttctaa tttccatttc attatcttgc tccgtattaa atgattgctt gcattcgaat tgtcccagtt atgagccagc taacgaacgg caatattttc gcgtacccgt tttaattttc

© 2014 WSSP

8-3

Why do a BLASTX if we have done a BLASTN?

BLASTN match (16% identity):

Query  AGG TCG TTA CTA TCG AGG AGT AGA
Sbjct  CGT AGC CTT TTG AGT CGA TCG CGG

BLASTX match of the same two sequences after translation (100% identity):

Query  AGG TCG TTA CTA TCG AGG AGT AGA
        R   S   L   L   S   R   S   R
        |   |   |   |   |   |   |   |
        R   S   L   L   S   R   S   R
Sbjct  CGT AGC CTT TTG AGT CGA TCG CGG

8-1

Clicker Question #1: How many different DNA sequences can code for the peptide sequence Met-Leu-Cys-Ala?
A) 1
B) 12
C) 36
D) 48
E) 54

Codons for each amino acid (written in RNA letters; U corresponds to T in DNA):

Name        3-letter  1-letter  Codons
Alanine     Ala       A         GCA, GCC, GCG, GCU
Cysteine    Cys       C         UGC, UGU
Histidine   His       H         CAC, CAU
Isoleucine  Ile       I         AUA, AUC, AUU
Lysine      Lys       K         AAA, AAG
Leucine     Leu       L         UUA, UUG, CUA, CUC, CUG, CUU
Methionine  Met       M         AUG
Asparagine  Asn       N         AAC, AAU
Proline     Pro       P         CCA, CCC, CCG, CCU
Glutamine   Gln       Q         CAA, CAG

Number of codons for each residue in the conserved region of the protein:

3422242264422412263622666262232446622244246216
ITKNYPYYRTADKGWQNSIRHNLSLNRYFIKVPRSQEEPGKGSFWR

Name           3-letter  1-letter  Codons
Alanine        Ala       A         GCA, GCC, GCG, GCU
Cysteine       Cys       C         UGC, UGU
Aspartic Acid  Asp       D         GAC, GAU
Glutamic Acid  Glu       E         GAA, GAG
Phenylalanine  Phe       F         UUC, UUU
Glycine        Gly       G         GGA, GGC, GGG, GGU
Histidine      His       H         CAC, CAU
Isoleucine     Ile       I         AUA, AUC, AUU
Lysine         Lys       K         AAA, AAG
Leucine        Leu       L         UUA, UUG, CUA, CUC, CUG, CUU
Methionine     Met       M         AUG
Asparagine     Asn       N         AAC, AAU
Proline        Pro       P         CCA, CCC, CCG, CCU
Glutamine      Gln       Q         CAA, CAG
Arginine       Arg       R         CGA, CGC, CGG, CGU, AGA, AGG
Serine         Ser       S         UCA, UCC, UCG, UCU, AGC, AGU
Threonine      Thr       T         ACA, ACC, ACG, ACU
Valine         Val       V         GUA, GUC, GUG, GUU
Tryptophan     Trp       W         UGG
Tyrosine       Tyr       Y         UAC, UAU
Stop Codons              .         UAA, UAG, UGA

7.5 x 10^19

8-2
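The two ideas on these slides (translated sequences can match perfectly when the DNA barely matches, and many DNA sequences encode the same peptide) can be checked with a few lines of Python. This is a minimal sketch, not part of the WSSP or NCBI tools: the function names `translate`, `identity` and `count_encodings` are made up for illustration, the codon table is the standard genetic code written with T instead of U, and `identity` is a plain ungapped position-by-position comparison, so its nucleotide percentage need not equal the figure BLASTN reports on the slide.

```python
from collections import Counter
from itertools import product
from math import prod

# Standard genetic code: the 64 codons in T, C, A, G order map onto this
# 64-letter string ('*' marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA string in its first reading frame (spaces are ignored)."""
    dna = dna.replace(" ", "").upper()
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def identity(a, b):
    """Fraction of identical positions in two equal-length, ungapped strings."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def count_encodings(peptide):
    """Number of distinct DNA sequences that encode a peptide (1-letter codes)."""
    codons_per_aa = Counter(CODON_TABLE.values())
    return prod(codons_per_aa[aa] for aa in peptide)


# The two sequences from the "Why do a BLASTX" slide.
query = "AGG TCG TTA CTA TCG AGG AGT AGA"
sbjct = "CGT AGC CTT TTG AGT CGA TCG CGG"

print(translate(query))   # RSLLSRSR
print(translate(sbjct))   # RSLLSRSR
print(identity(translate(query), translate(sbjct)))  # 1.0

# Met-Leu-Cys-Ala: 1 x 6 x 2 x 4 codon choices.
print(count_encodings("MLCA"))  # 48
```

Because the count is a product of the per-residue codon numbers, it grows very quickly with peptide length, which is why even a short conserved region can be encoded by an enormous number of different DNA sequences.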
p. 8-2

DSAP BLASTx Page
NCBI BLASTx page
Cropped DNA sequence (p. 7-2)

BLASTX Dialog Box (p. 8-3)

BLASTX of EX1.14 (p. 8-3)

BLASTn and BLASTx of another Landoltia sequence (p. 8-4)
Panels: BLASTn, BLASTx

List of EX1.14 BLASTx matches (p. 8-4)

Best BLASTx alignment for EX1.14 (p. 8-5)

Low Sequence Complexity Filter

With Filter

>gi|223542822|gb|EEF44358.1| conserved hypothetical protein [Ricinus communis]
Score = 69.7 bits (169), Expect = 3e-10
Identities = 54/174 (31%), Positives = 85/174 (48%), Gaps = 4/174 (2%)

Query  40   LTCLLILQAPSSHAFYLWppfffpspvpDVITVLNQANQFTTLVQLLTETGVATAVNAIS  219
            LT L++L +  + A     P   PS   +V  +L++  QFTT ++LLT T VAT +
Sbjct  9    LTALILLLSLQAQAQNPAAPAPAPSGPLNVTGILDKNGQFTTFIRLLTSTQVATQLEN-Q  67

Query  220  TNGAGPGITLFAPTDAAFAKIPAANLSALNVTQRTSILTLHALTRFYTFAELFVANAALP  399
             N    G T+FAPTD AF  + A  L+ L+  Q+  ++  H   +FYT + L +    +
Sbjct  68   LNSTTEGFTVFAPTDNAFNNLKAGTLNDLSTQQQVQLVLAHITPKFYTLSNLLLVPNPVR  127

Query  400  TLNT---GrsltfstsvtrvttitsPGGRVTTLNFLLYRRFPLTIFPIADVLLP  552
            T  T   G     + +        S G   T +N  + ++FPL ++ +  VLLP
Sbjct  128  TQATGQDGGVFGLNFTGQANQVNVSTGIVETQINNAIRQQFPLALYQVDKVLLP  181

Without Filter

>gi|223542822|gb|EEF44358.1| conserved hypothetical protein [Ricinus communis]
Score = 82.4 bits (202), Expect = 4e-14
Identities = 57/176 (32%), Positives = 89/176 (50%), Gaps = 8/176 (4%)
Frame = +1

Query  40   LTCLLILQAPSSHAFYLWPPFFFPSPVPDVITVLNQANQFTTLVQLLTETGVATAVNAIS  219
            LT L++L +  + A     P   PS   +V  +L++  QFTT ++LLT T VAT +
Sbjct  9    LTALILLLSLQAQAQNPAAPAPAPSGPLNVTGILDKNGQFTTFIRLLTSTQVATQLEN-Q  67

Query  220  TNGAGPGITLFAPTDAAFAKIPAANLSALNVTQRTSILTLHALTRFYTFAELFVANAALP  399
             N    G T+FAPTD AF  + A  L+ L+  Q+  ++  H   +FYT + L +    +
Sbjct  68   LNSTTEGFTVFAPTDNAFNNLKAGTLNDLSTQQQVQLVLAHITPKFYTLSNLLLVPNPVR  127

Query  400  TLNTGR-----SLTFSTSVTRVTTITSPGGRVTTLNFLLYRRFPLTIFPIADVLLP  552
            T  TG+      L F+    +V    S G   T +N  + ++FPL ++ +  VLLP
Sbjct  128  TQATGQDGGVFGLNFTGQANQVN--VSTGIVETQINNAIRQQFPLALYQVDKVLLP  181

Answer questions in DSAP (p. 8-6)

Question: Which of these alignments has a greater biological significance?
A) B) (the two alignments are shown on the slide)

What can you conclude about this BLASTX result?
A) It is too short to be significant
B) It does not match anything
C) There is a frame shift in the DNA sequence
D) Your DNA has an exact match

Where is the frame shift most likely to be found?
A) bp 181
B) bp 75
C) bp 227
D) bp 381
E) Cannot tell from the data

Points at which an error can be introduced into the DNA sequence of the clone:
DNA, RNA, cDNA, DS-cDNA, Cloning, Replication & Purification, Sequencing

Is the frame shift at bp 227 caused by a DNA sequencing error?
A) Yes
B) No
C) Cannot tell from the data

Does this have a frame shift? Where?

What does this BLASTX report indicate?
A) There are matches to different proteins at the end of the sequence
B) There are matches in one frame to the entire sequence
C) There is a frame shift in the DNA sequence
D) The protein has two different domains
E) Cannot conclude anything

Where is the frame shift?
A) bp 149
B) bp 160
C) bp 458
D) bp 469
E) bp 493

Does this indicate that there is a frame shift in the sequence? (Figure labels: frames +1, +3, +1)
A) Yes
B) No
C) Cannot tell from the data

What is the most likely explanation for this result? (Figure labels: frame +1, Intron, frame +3)
A) There is nothing wrong with the alignment.
B) There is an extra or missing base causing a frame shift.
C) There is an unspliced intron in the cDNA.
D) The query has an extra protein region.
E) Answers C or D
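The frame shift questions rest on one observation: an extra (or missing) base leaves the protein upstream of the error in one reading frame and moves everything downstream into another, so the hits in a BLASTX report switch frames at that point. The sketch below illustrates this on a toy sequence. It is an illustration under stated assumptions, not the method BLASTX itself uses: the names `translate` and `three_frames` are hypothetical, the inserted base and its position are arbitrary, and only the three forward frames are translated, whereas BLASTX searches all six.

```python
from itertools import product

# Standard genetic code, codons in T, C, A, G order ('*' marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate from the first base; a trailing partial codon is dropped."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def three_frames(dna):
    """Return the translations of forward frames +1, +2 and +3, keyed by frame."""
    return {frame: translate(dna[frame - 1:]) for frame in (1, 2, 3)}


# Toy example: the 24-base query from the "Why do a BLASTX" slide.
ORIGINAL = "AGGTCGTTACTATCGAGGAGTAGA"    # translates to RSLLSRSR in frame +1
# Assumed error for illustration: one extra 'A' inserted after base 12.
MUTATED = ORIGINAL[:12] + "A" + ORIGINAL[12:]

frames = three_frames(MUTATED)
for frame, protein in frames.items():
    print(f"+{frame}: {protein}")
# +1: RSLLIEE*   (correct up to the insertion, then wrong)
# +2: GRY*SRSR   (wrong before the insertion, correct after it)
# +3: VVTNRGV
```

Read against the original protein, frame +1 matches only the first half and frame +2 only the second half, and the base where the match changes frame marks the approximate position of the error. An unspliced intron produces a similar picture, except that the two matching regions are separated by a stretch of query that matches nothing.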