Part II: Sequence Analysis
Paul Tan Thiam Joo
[email protected]
Department of Biochemistry, Medicine Faculty, NUS
Institute for Infocomm Research

What is sequence analysis?
• Nucleic acids: DNA and RNA
• Proteins: amino acid composition, pI, molecular weight, hydrophobicity

Why do sequence analysis?
• Assessing potential allergenicity (Gendel, 2002)
• Parkinson's disease (neurodegenerative) (Uversky and Fink, 2002)
• Human Genome Project (draft sequence published in 2001)

Sequence analysis of proteins
• Backtranslation
• Amino acid composition
• Molecular weights, pIs
• Hydropathy profile
http://kr.expasy.org

Backtranslation
• Protein -> DNA
• Used for cloning a protein of interest that may be present in low amounts.
• Beware of codon bias and the degeneracy of codons.

UUU-Phe  UCU-Ser  UAU-Tyr   UGU-Cys
UUC-Phe  UCC-Ser  UAC-Tyr   UGC-Cys
UUA-Leu  UCA-Ser  UAA-Stop  UGA-Stop
UUG-Leu  UCG-Ser  UAG-Stop  UGG-Trp

CUU-Leu  CCU-Pro  CAU-His   CGU-Arg
CUC-Leu  CCC-Pro  CAC-His   CGC-Arg
CUA-Leu  CCA-Pro  CAA-Gln   CGA-Arg
CUG-Leu  CCG-Pro  CAG-Gln   CGG-Arg

AUU-Ile  ACU-Thr  AAU-Asn   AGU-Ser
AUC-Ile  ACC-Thr  AAC-Asn   AGC-Ser
AUA-Ile  ACA-Thr  AAA-Lys   AGA-Arg
AUG-Met  ACG-Thr  AAG-Lys   AGG-Arg

GUU-Val  GCU-Ala  GAU-Asp   GGU-Gly
GUC-Val  GCC-Ala  GAC-Asp   GGC-Gly
GUA-Val  GCA-Ala  GAA-Glu   GGA-Gly
GUG-Val  GCG-Ala  GAG-Glu   GGG-Gly

Biased codon usage
Amino acid   Codons
Leu          UUA UUG CUU CUC CUA CUG
Val          GUU GUC GUA GUG
Organisms compared: Bacteria, Yeast, Fruit Fly, Human
[Table: eight codon/organism cells are marked "Preferred"; their positions did not survive extraction.]

Amino Acid Composition
• Determine the percentages of amino acid residues present in a protein molecule.
• Uses:
  – Infer the lifestyles of organisms: hyperthermophiles show high percentages of glutamate (negative charge) and of both lysine and arginine (positive charge); this enrichment is absent in mesophiles (Tekaia et al., 2002).
  – Predict structural class (Luo et al., 2002).
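The percentage calculation on the Amino Acid Composition slide, and the degeneracy warning on the Backtranslation slide, can be illustrated with a short Python sketch. The function names `composition` and `count_backtranslations` are hypothetical and not part of the slides. The degeneracy counts are read from the standard codon table shown earlier, and sequences are assumed to be given in one-letter code.

```python
from collections import Counter

# Number of codons per amino acid in the standard genetic code,
# counted from the codon table in the slides (one-letter residue codes).
CODON_DEGENERACY = {
    "F": 2, "L": 6, "S": 6, "Y": 2, "C": 2, "W": 1, "P": 4, "H": 2,
    "Q": 2, "R": 6, "I": 3, "M": 1, "T": 4, "N": 2, "K": 2, "V": 4,
    "A": 4, "D": 2, "E": 2, "G": 4,
}


def composition(seq):
    """Return {residue: percentage} for a one-letter protein sequence."""
    seq = seq.upper()
    counts = Counter(seq)
    return {aa: 100.0 * n / len(seq) for aa, n in counts.items()}


def count_backtranslations(seq):
    """Count the distinct nucleotide sequences that encode `seq`.

    Each residue multiplies the total by its number of codons, which is
    why degenerate residues (Leu, Ser, Arg) make backtranslation ambiguous.
    """
    total = 1
    for aa in seq.upper():
        total *= CODON_DEGENERACY[aa]
    return total
```

For example, `count_backtranslations("MW")` is 1 because Met and Trp each have a single codon, whereas `count_backtranslations("MKFF")` is 8. In practice the codon bias of the host organism would narrow the choice among these candidates.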
Nonpolar amino acids (FILMWAV)
Polar uncharged (S, Q, T, N, Y)
Polar charged (KHERD)

Unique Properties
Protein functions from specific residues
• C: Disulphide-rich, zinc fingers
• G: Collagens
• H: Histidine-rich glycoprotein
• KR: Nuclear proteins, nuclear localisation
• P: Collagen, filaments

Molecular weights, pIs
• Aid in designing purification experiments, e.g. SDS-PAGE, IEF, 2-dimensional gel electrophoresis, column chromatography.

Hydropathy Profiles
• Hydropathy describes the hydrophobicity and hydrophilicity of a protein sequence.
• A hydropathy profile is a graph in which hydropathy values are calculated within a sliding window and plotted for each residue in a protein sequence.

[Figure: a sliding window over the sequence M K F F L M C L I I F P I M G V L G; labels: signal region, alpha-helix]
[Figure: a schematic representation of a 3-D structure of a scorpion toxin; labels: alpha-helix, beta-sheet, alpha-helix]

Hydropathy Profiles
• Hydropathy scale: each amino acid is assigned a value reflecting its relative hydrophobicity and hydrophilicity.
• 2 broad classes of scales:
  – Environmental characteristics of protein residues.
  – Experimental measurements of amino acid physicochemical properties.

[Figure: Venn diagram of the physicochemical properties of the 20 amino acids]

Hydropathy Profiles
• Basic ranking: internal {FILMV}, external {DEHKNQR}, ambivalent {ACGPSTWY}

Hydropathy Profiles
• Detect possible transmembrane domains (runs of 20-25 consecutive hydrophobic amino acids).
• Identify hydrophobic protein cores.
• Predict neurotoxicity in snake phospholipases A2 (Kini and Iwanaga, 1986).

References
• Kini RM, Iwanaga S. (1986) Toxicon 24(6):527-541.
• Rehm BH. (2001) Appl Microbiol Biotechnol 57(5-6):579-92.
• Weir M, Swindells M, Overington J. (2001) Trends Biotechnol 19(10 Suppl):S61-6.
• Gendel SM. (2002) Ann NY Acad Sci 964:87-98.
• Luo RY, Feng ZP, Liu JK. (2002) Eur J Biochem 269(17):4219-4225.
• Tekaia F, Yeramian E, Dujon B. (2002) Gene 297:51-60.
• Uversky VN, Fink AL. (2002) FEBS Lett 522(1-3):9-13.
• EXPASY http://cn.expasy.org/
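The sliding-window calculation described under Hydropathy Profiles can be sketched in Python. The slides do not name a particular hydropathy scale, so the widely used Kyte-Doolittle values are assumed here. The function name `hydropathy_profile` and the default window of 9 residues are also illustrative choices, not taken from the slides.

```python
# Kyte-Doolittle hydropathy values (assumed scale; the slides name none).
# Positive values are hydrophobic, negative values hydrophilic.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=9):
    """Return [(centre_position, mean_hydropathy), ...] for `seq`.

    The window slides one residue at a time; each mean is reported at the
    1-based position of the window's central residue.
    """
    seq = seq.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and len(seq)")
    values = [KYTE_DOOLITTLE[aa] for aa in seq]
    profile = []
    for start in range(len(seq) - window + 1):
        mean = sum(values[start:start + window]) / window
        profile.append((start + window // 2 + 1, mean))
    return profile


if __name__ == "__main__":
    # The 18-residue example sequence from the sliding-window slide.
    for position, score in hydropathy_profile("MKFFLMCLIIFPIMGVLG"):
        print(position, round(score, 2))
```

A sequence of length n and a window of w give n - w + 1 points, so the 18-residue example with a window of 9 yields 10 values. When screening for transmembrane domains, a longer window (closer to the 20-25 residue runs mentioned above) is usually a more suitable choice.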