Dynamic Programming
How to match up sequences so that the matches make sense and are quantitative

The question is
• How does a specific sequence compare to one other specific sequence?
  – Is it similar?
  – If so, at what level?
• You can't compare every base to every other base; it is too complex

You are in the driver's seat
• What is most important?
  – Exact nucleotide match?
  – One-for-one (no gaps)?
  – Length?

Mathematical model
• Derive an equation for each position, based on your value system
• Methodically go through each base of each sequence and calculate the value
• At the end, find the optimal path

Starting point: three possible scenarios for each position in sequences X and Y
• At a given position, the bases (Xm and Yn) are identical in X and Y
• At a given position, the base (Xm) in X is aligned with a gap in Y (and Yn appeared earlier)
• At a given position, the base (Yn) in Y is aligned with a gap in X (and Xm appeared earlier)

Assign a value to each situation
• Identical: +5
• Mismatch: -2
• Insertion or deletion: -6
(Could have others; could choose different values)
http://www.acm.org/crossroads/xrds13-1/dna.html

Alpha-glucosidase in plants: enzymes sharing the WIDMNE signature sequence
• alpha-glucosidase (all groups)
• alpha-xylosidase (plant, bacteria, archaea)
• Sucrase/Isomaltase (animal)

Related sequences with broad substrate specificity
[Figure: phylogenetic tree with groups labeled Plantae, Fungi, Protista, Archaea, Animalia, and Bacteria; scale bar 0.1. Tip labels in extraction order: At XYL1, Tm XYL, Mj Aglu, Pt Aglu, Sp Aglu, St MAL2, Anig aglA, Pp BAB3946, An AgdA, So Aglu, Ca GAM1, Bv Aglu, Soc GAM1, At Aglu-1, An agdB, Hv Aglu, Tp GAA, Hs GAA, Cj GAAII, Cj GAAI, Ss xylS, Hs S/I-N, Hs S/I-C, Bt Aglu-III, Lv GAA, Ce AAA8317, Bh BAB0442, Aa GlcA, Sc CAB8890, Tm AAD3539, Lp XylQ]

Plant α-amylases are located in different cellular compartments
• Plastids (chloroplasts, amyloplasts)
• Cytosol
• Apoplast (cell wall space)
What is the function of the non-plastid forms?
[Figure: phylogenetic tree of plant α-amylases in three clades. Clade I: secreted, 421-445 aa. Clade II: cytosolic, 407-414 aa. Clade III: plastidic, 877-906 aa. Tip labels in extraction order: Arabidopsis AMY1, barley A, rice 2A, barley B, morning glory, rice 3B, dodder, maize, adzuki bean, rice 3E, rice XP_472377, Arabidopsis AMY2, apple 10, cassava, apple 9, kiwifruit, apple 8, plantain, Arabidopsis AMY3, rice NP_916641, potato]

Homologous sequences (homologues)
• Share a common ancestor

Paralogs
• Homologues derived by gene duplication
• Functions may vary
• Look for differences

Orthologs
• Homologues derived by speciation
• Common function
• Look for similarities

Use alignments to look for:
• Structures important for common functions (orthologs)
• Structures important for unique functions (paralogs)
• Unusual structures

AMY1 has a three amino acid deletion
[Figure: AtAMY1 compared with barley α-amylase; labels N, 3, and C. Red: NHDTGST. Blue: VAEIW. Active site residues marked.]

Variation in the active site loop among plant and bacterial α-amylases
[Figure: AtAMY1]
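
Worked sketch: scoring an alignment with dynamic programming
The scoring scheme from the dynamic programming slides (identical +5, mismatch -2, insertion or deletion -6) can be tried out in a few lines of Python. The slides do not name a specific algorithm, so this sketch assumes a Needleman-Wunsch-style global alignment with a linear gap penalty. The function name global_align, the constant names, and the test sequences are hypothetical and chosen only for illustration. Each cell takes the best of the three scenarios (base aligned with base, base in X against a gap, base in Y against a gap), and the traceback at the end recovers one optimal path.

```python
# Scores taken from the slides; the names are illustrative assumptions.
MATCH, MISMATCH, GAP = 5, -2, -6


def global_align(x, y):
    """Return (score, aligned_x, aligned_y) for one optimal global alignment."""
    rows, cols = len(x) + 1, len(y) + 1
    score = [[0] * cols for _ in range(rows)]

    # Assumption: leading gaps are penalized like any other gap.
    for i in range(1, rows):
        score[i][0] = i * GAP
    for j in range(1, cols):
        score[0][j] = j * GAP

    # Fill the table: each cell is the best of the three scenarios.
    for i in range(1, rows):
        for j in range(1, cols):
            pair = MATCH if x[i - 1] == y[j - 1] else MISMATCH
            score[i][j] = max(
                score[i - 1][j - 1] + pair,  # Xm aligned with Yn
                score[i - 1][j] + GAP,       # Xm aligned with a gap in Y
                score[i][j - 1] + GAP,       # Yn aligned with a gap in X
            )

    # Trace back from the bottom-right corner to recover one optimal path.
    aligned_x, aligned_y = [], []
    i, j = len(x), len(y)
    while i > 0 or j > 0:
        pair = MATCH if i > 0 and j > 0 and x[i - 1] == y[j - 1] else MISMATCH
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair:
            aligned_x.append(x[i - 1])
            aligned_y.append(y[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + GAP:
            aligned_x.append(x[i - 1])
            aligned_y.append("-")
            i -= 1
        else:
            aligned_x.append("-")
            aligned_y.append(y[j - 1])
            j -= 1

    return score[-1][-1], "".join(reversed(aligned_x)), "".join(reversed(aligned_y))


print(global_align("ACGT", "AGT"))  # (9, 'ACGT', 'A-GT')
```

In the example, three identical bases contribute +15 and the single gap costs -6, giving 9. Choosing different values (for instance, a milder gap penalty) changes which path is optimal, which is the "driver's seat" point made in the slides.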