Sense-Antisense Proteins
Vision Lab Presentation
Ruchir Shah
April 16, 2003

Sense-Antisense Proteins
• Peptides generated from the sense and antisense DNA strands have ‘inverted hydropathies’. Although it may seem counterintuitive, it is hypothesized that sense (S) and antisense (AS) peptides could have a high binding affinity for each other.
Picture adapted from: J. R. Heal et al., ChemBioChem 2002, 3, 136-151.

S-AS Codon Table
Inverted Hydropathy
Blue = non-polar, Pink = polar
Picture adapted from: J. R. Heal et al., ChemBioChem 2002, 3, 136-151.

S-AS Codons
• Degeneracy: one sense amino acid (AA) can pair with more than one antisense AA.
• Hydropathy: sense and antisense AAs have inverted hydropathy.
• Codon biases/codon frequencies?
• Sense proteins interact with antisense proteins: numerous experimental results suggest that sense and antisense peptides bind each other with specific affinity.

Experimental evidence
Picture adapted from: J. R. Heal et al., ChemBioChem 2002, 3, 136-151.

How do S-AS Amino Acids interact?
Picture adapted from: J. R. Heal et al., ChemBioChem 2002, 3, 136-151.

Molecular Recognition Theory
Picture adapted from: J. R. Heal et al., ChemBioChem 2002, 3, 136-151.

Tasks
• Literature says:
– S-AS proteins exist
– S-AS proteins interact specifically with each other!
• Tasks:
– Look for S-AS protein pairs (how many such pairs exist?)
– What are the biological implications?
– Do they really interact?

How to find S-AS pairs from Sequence Db?
• Conventional sequence identity tools can be used to find ‘similar’ proteins.
Example: BLAST or Smith-Waterman with a choice of substitution matrix
– Positive score for identity or desirable substitutions.
– Negative score for undesirable substitutions.

BLOSUM 62
Source: http://www.blc.arizona.edu/courses/bioinformatics/blosum.html

Design of a new substitution matrix
• To find S-AS pairs using existing sequence identity tools, I need a special matrix.
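
The pairing that such a matrix has to encode follows from the standard genetic code: each sense codon is reverse-complemented and the result is translated, read 5' to 3'. The sketch below is only an illustration of that derivation, not the scoring scheme used in this work; the function names `antisense_codon` and `sense_antisense_pairs` are hypothetical, and skipping pairs that involve stop codons is an assumption.

```python
from collections import defaultdict
from itertools import product

BASES = "TCAG"
# Standard genetic code with codons enumerated in TCAG order; '*' is a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def antisense_codon(codon):
    """Return the reverse complement of a codon, read 5'->3' on the antisense strand."""
    return codon.translate(COMPLEMENT)[::-1]


def sense_antisense_pairs():
    """Map each sense amino acid to the sorted list of its antisense amino acids.

    Assumption: codon pairs in which either side is a stop codon are skipped.
    """
    pairs = defaultdict(set)
    for codon, aa in CODON_TABLE.items():
        partner = CODON_TABLE[antisense_codon(codon)]
        if aa != "*" and partner != "*":
            pairs[aa].add(partner)
    return {aa: sorted(partners) for aa, partners in pairs.items()}


if __name__ == "__main__":
    for aa, partners in sorted(sense_antisense_pairs().items()):
        print(aa, "->", " ".join(partners))
```

For glycine (GGG, GGA, GGT, GGC) this yields Ala, Pro, Ser and Thr, which matches the Gly rows of the S-AS codon table and shows the degeneracy: one sense AA, several antisense AAs.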
New matrix should:
- positively score S-AS pairs
- negatively score other pairs
- reflect the degeneracy of the genetic code
- have a negative average score, to avoid false positives!!

S-AS Codon Table

Results: What does it look like
It works!!

Results: contd.
Low complexity regions!
Lots of ‘small’ hits (lessons learnt!)
“get rid of noise/background”
“get rid of low complexity regions”
“use a better matrix”

Design of a new substitution matrix
New matrix should:
- positively score S-AS pairs
- negatively score other pairs
- reflect the degeneracy of the genetic code
- take into account the codon biases

S-AS Codon Table
Sense codon (5'-3')  Sense AA  /1000  Freq. sense  |  AS codon (5'-3')  AS AA  /1000  Freq. AS
GGG                  Gly        5.98  0.00598      |  CCC               Pro     6.78  0.00678
GGA                  Gly       10.92  0.01092      |  TCC               Ser    14.22  0.01422
GGT                  Gly       23.9   0.0239       |  ACC               Thr    12.56  0.01256
GGC                  Gly        9.71  0.00971      |  GCC               Ala    12.54  0.01254
  Gly total freq.: 0.05051
GAG                  Glu       19.14  0.01914      |  CTC               Leu     5.38  0.00538
GAA                  Glu       45.92  0.04592      |  TTC               Phe    18.21  0.01821
  Glu total freq.: 0.06506
GAT                  Asp       37.84  0.03784      |  ATC               Ile    17.07  0.01707
GAC                  Asp       20.26  0.02026      |  GTC               Val    11.59  0.01159
  Asp total freq.: 0.0581
GTG                  Val       10.66  0.01066      |  CAC               His     7.77  0.00777
GTA                  Val       11.78  0.01178      |  TAC               Tyr    14.67  0.01467
GTT                  Val       22.01  0.02201      |  AAC               Asn    24.94  0.02494
GTC                  Val       11.59  0.01159      |  GAC               Asp    20.26  0.02026
  Val total freq.: 0.05604
GCG                  Ala        6.15  0.00615      |  CGC               Arg     2.58  0.00258
GCA                  Ala       16.16  0.01616      |  TGC               Cys     4.67  0.00467
GCT                  Ala       21.09  0.02109      |  AGC               Ser     9.68  0.00968
GCC                  Ala       12.54  0.01254      |  GGC               Gly     9.71  0.00971
  Ala total freq.: 0.05594
AGG                  Arg        9.24  0.00924      |  CCT               Pro    13.58  0.01358
AGA                  Arg       21.3   0.0213       |  TCT               Ser    23.55  0.02355
CGG                  Arg        1.73  0.00173      |  CCG               Pro     5.27  0.00527
CGA                  Arg        3.01  0.00301      |  TCG               Ser     8.56  0.00856
CGT                  Arg        6.48  0.00648      |  ACG               Thr     7.95  0.00795
CGC                  Arg        2.58  0.00258      |  GCG               Ala     6.15  0.00615
  Arg total freq.: 0.04434
Source: SGD (Stanford), Saccharomyces Genome Database

1. Low complexity filter: SEG
2. More meaningful matrix: formula for new scoring scheme

Flow Chart
Sequence database (yeast), ~6000 proteins
-> Run Smith-Waterman, all against all, with the new matrix
-> Look for ‘hits’
-> Compare them with interaction data

Tasks
• Look for sense-antisense protein pairs in protein sequence databases.
• List all sense-antisense pairs.
• Compare the list with the list of interacting proteins.
Example:
Sense-antisense pairs: P1-P101, P2-P102, P3-P103, P4-P104
Database of interacting proteins: P5-P99, P2-P102, P104-P4

DIP: Database of Interacting Proteins
http://dip.doe-mbi.ucla.edu/dip/Main.cgi
SS = small-scale experiment
HT = high-throughput experiment
Purple = overlap
Bars = more than 1 experiment
Proteins = 4727
Interactions = 15174

Work in Progress
• Statistics of alignment: distinguish random from meaningful hits!
• Relative entropy of the matrix
• Gap penalties

Acknowledgments
Todd Vision (Biology)
Alex Tropsha (Pharmacy)
Dr. Falk (Nephrology)
All of my lab mates.
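
The comparison step from the Tasks slide (candidate S-AS pairs against a list of known interactions) can be sketched as a set intersection over unordered pairs. This is a minimal illustration using the example identifiers from that slide; the function name `confirmed_pairs` is hypothetical, and treating P104-P4 as the same pair as P4-P104 is an assumption suggested by the example.

```python
def confirmed_pairs(candidate_pairs, known_interactions):
    """Return the candidate pairs that also appear among the known interactions.

    Pairs are compared as unordered sets, so ("P104", "P4") matches ("P4", "P104").
    """
    known = {frozenset(pair) for pair in known_interactions}
    return [pair for pair in candidate_pairs if frozenset(pair) in known]


# Example identifiers taken from the Tasks slide.
s_as_pairs = [("P1", "P101"), ("P2", "P102"), ("P3", "P103"), ("P4", "P104")]
interacting = [("P5", "P99"), ("P2", "P102"), ("P104", "P4")]

if __name__ == "__main__":
    print(confirmed_pairs(s_as_pairs, interacting))
    # prints [('P2', 'P102'), ('P4', 'P104')]
```

Using `frozenset` keeps the lookup independent of the order in which a database lists the two partners; a real run would also need to map both lists onto a common protein identifier scheme first.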