Computational Methods for Biomarker Discovery in Proteomics and Glycomics
Vijetha Vemulapalli
School of Informatics, Indiana University
Capstone Advisor: Dr. Haixu Tang

Outline: • Problem Definition • Background • LC-MS • Method • Results • CE • Method • Results • Acknowledgements • References

What are Biomarkers?
Substances present in increased or decreased amounts in body fluids or tissues that indicate exposure, disease or susceptibility to disease.

Some Uses of Biomarkers
Biomarkers are increasingly being used for the following purposes:
• Prognosis / Diagnosis of disease
• Monitoring response to medication
With high sensitivity and throughput, proteomics and glycomics are capable of identifying many potential biomarkers simultaneously.

More on Biomarkers
[Figure: bar chart of Quantity (0 to 10) for Substances A to J in Normal and Diseased samples]
Often, individual biomarkers have not been clearly identified. But based on the signature pattern of glycans and proteins, samples can still be classified as healthy or diseased.

What is Proteomics?
Proteins: Chains of amino acids; they include hormones, enzymes and antibodies.
Proteome: All the proteins in a cell or bodily fluid at a given point in time under certain conditions.
Proteomics: The study of proteins and proteomes using high-throughput technology.
Image sources: http://parasol.tamu.edu/groups/amatogroup/foldingserver/images/proteinL.gif and http://biology.clc.uc.edu/graphics/bio104/cell.jpg

What is Glycomics?
Glycoproteins: Proteins with attached polysaccharides.
Glycans: Polysaccharide chains attached to a protein.
Glycome: The entire set of glycans that are present in a cell or a bodily fluid at a certain point in time under certain conditions.
Glycomics: The study of the structure and function of oligosaccharides in a cell or organism.
Image source: http://www.glyfdis.org/images/bg_image.jpg

High Throughput Technologies to Identify Biomarkers
• Genome-scale scanning: Genome level
• Microarrays: Transcriptome level
• Proteomics: Proteome level
• Glycomics: Glycome level
Image source: http://phy.asu.edu/phy598-bio/D4%20Notes%2006_files/image002.jpg

Why the Focus on Proteomics and Glycomics?
[Figure: Information content of the Genome, Transcriptome, Proteome and Glycome; axis labels: Static, Dynamic]

Biomarker Discovery using Proteomics

Liquid Chromatography / Mass Spectrometry (LC/MS)
Protein sample → Liquid Chromatography → Mass Spectrometry → Data
Why LC/MS for analysis of proteomes?
• LC spreads the complexity of the sample over time.
• MS identifies ions based on their mass/charge value.
Software currently exists to identify proteins in a sample using data from an LC-MS experiment.

Liquid Chromatography (LC)
Liquid chromatography is a technique that separates ions or molecules dissolved in a solvent based on the size of the ion/molecule, adsorption, ion exchange or other similar characteristics.
Image source: http://wwwlb.aub.edu.lb/~webcrsl/high_p3.jpg

What is a Mass Spectrometer?
A mass spectrometer (MS) is an instrument that identifies ions based on their mass-to-charge ratio.
Source: http://www.chemguide.co.uk/analysis/masspec/howitworks.html & http://www.bmms.uu.se/ltq-ft.htm

Visualization of LC/MS Data: 2D Map
[Figure: 2D map of LC/MS data]

How Do We Find Biomarkers From LC-MS Data?
Protein sample → Liquid Chromatography → Mass Spectrometry → Data → Identification software → Identified Proteins and Peptides → MSView → Quantities of peptides identified from the sample

How Do We Find Biomarkers From LC-MS Data? Continued…
• Sample 1 → MSView → Quantification 1
• Sample 2 → MSView → Quantification 2
• Sample 3 → MSView → Quantification 3
• …
• Sample N → MSView → Quantification N
The quantifications from all samples are then analyzed to find biomarkers.

MSView
Components and their purpose:
• Visualization: Visual comparison / analysis
• Relative Quantification: Further analysis for biomarker discovery

Extracted Ion Chromatogram (XIC)
A chromatogram created by plotting the intensity of the signal observed at a chosen m/z value in a series of mass spectra recorded as a function of retention time.
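The XIC definition above can be illustrated with a minimal sketch. This is not MSView's actual implementation: the scan layout (a retention time plus a list of (m/z, intensity) pairs), the function name `extract_xic`, the tolerance parameter `tol` and the toy data are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical scan layout: (retention_time, [(mz, intensity), ...]).
Scan = Tuple[float, List[Tuple[float, float]]]


def extract_xic(scans: List[Scan], target_mz: float,
                tol: float = 0.5) -> List[Tuple[float, float]]:
    """Return (retention_time, summed_intensity) pairs for one m/z window."""
    xic = []
    # Walk the spectra in order of retention time.
    for rt, peaks in sorted(scans, key=lambda s: s[0]):
        # Sum every signal whose m/z lies within +/- tol of the target.
        intensity = sum(i for mz, i in peaks if abs(mz - target_mz) <= tol)
        xic.append((rt, intensity))
    return xic


# Toy spectra, invented purely for illustration.
scans = [
    (10.0, [(500.2, 5.0), (650.3, 40.0)]),
    (10.5, [(500.3, 80.0), (650.3, 35.0)]),
    (11.0, [(500.1, 20.0)]),
]
xic = extract_xic(scans, target_mz=500.2, tol=0.5)
print(xic)
```

Summing within a tolerance window, rather than matching the m/z value exactly, is one common choice because measured m/z values vary slightly from scan to scan.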
Source: http://www.lcpackings.com/applications/Probot/images/dual_fract04B.png

Visualization: XIC
[Figure: XIC visualization]

Relative Quantification using Peptide Identification Results
Data from LC-MS experiment → Identification of peptides → MSView: Extracted Ion Chromatogram of peptide → Peak selection → Area calculation
[Figure labels: Max, Min, Max]

Quantification: Peak Selection Algorithm
[Figure panels: Actual data; After smoothing; Selecting local maxima and minima; Selecting peaks. Labels: Maxima, Minima, Max, Min]

Quantification: Sample Results
[Figure: sample quantification results]

Biomarker Discovery using Glycomics

How does Capillary Electrophoresis (CE) work?
Image source: http://faculty.washington.edu/dovichi/UBUBTpage/research/Methods/CEintro/ceintro.GIF (from http://faculty.washington.edu/dovichi/UBUBTpage/research/Methods/CEintro/CE_LIF.html)

What does the data look like?
Samples from different CE experiments:
[Figure: CE data from different samples]

Biomarker Discovery using Glycomics – Overview
Data from different samples → CE Analyze: Mapping areas corresponding to the same glycan from different samples → Quantification of mapped peaks → Analysis of quantification for identifying biomarkers

Direct Comparison: Dynamic Time Warping (DTW)
The DTW algorithm aligns two time series that have similar curves but are skewed differently over time.
[Figure axis: Time]
Source: http://db-www.aist-nara.ac.jp/theme/bioinfo_kenji-h_dtw.png

Direct Comparison: DTW continued…
The Sakoe-Chiba band is used to reduce time and space complexity.
Parameters used in DTW:
- Band width
- Peak extension penalty
- Difference in peak intensities
- Difference in peak direction
Stan Salvador and Philip Chan.
FastDTW: Toward Accurate Dynamic Time Warping in Linear Time and Space, KDD Workshop on Mining Temporal and Sequential Data, 2004

Method: Dynamic Time Warping
• Align to consensus sample
• Align next sample to consensus sample
[Figure label: Consensus Sample]

Method continued…
[Figure: Unaligned sample and Aligned sample, with Corresponding peaks marked]
Calculate area [Figure label: Peak 1]

Results
[Figure label: Corresponding peaks]

Summary
Proteomics - MSView: LC-MS data and Identified Peptides → Quantification results for Biomarker Discovery
Glycomics - CE Analyze: CE Data → Quantification results for Biomarker Discovery

Acknowledgements
• Dr. Haixu Tang - My advisor
• Dr. Randy J. Arnold
• Dr. Milos Novotny
• Dr. Sun Kim
• Dr. Stephen J. Valentine
• Dr. Yehia Mechref
• Dr. David E. Clemmer
• Dr. Jeong-Hyeon Choi
• Yin Wu
• Manolo D. Plasencia
• School of Informatics
Funding: NIH/NCRR, MetaCyt Initiative @ Indiana University

References
[1] Higgs, R.E., Knierman, M.D., Gelfanova, V., Butler, J.P. and Hale, J.E. (2005) Comprehensive label-free method for the relative quantification of proteins from biological samples. J. Proteome Res., 4, 1442-1450.
[2] Linsen, L., Locherbach, J., Berth, M., Becher, D. and Bernhardy, J. (2006) Visual Analysis of Gel-Free Proteome Data. IEEE Transactions on Visualization and Computer Graphics, 12, 497-508.
[3] Prakash, A., Mallick, P., Whiteaker, J., Zhang, H., Paulovich, A., Flory, M., Lee, H., Aebersold, R. and Schwikowski, B. (2006) Signal maps for mass spectrometry-based comparative proteomics. Mol. Cell. Proteomics, 5, 423-432.
[4] Leptos, K.C., Sarracino, D.A., Jaffe, J.D., Krastins, B. and Church, G.M. (2006) MapQuant: open-source software for large-scale protein quantification. Proteomics, 6, 1770-1782.
[5] Aebersold, R. and Mann, M. (2003) Mass spectrometry-based proteomics. Nature, 422, 198-207.