MSView: A Bioinformatics toolkit for visualization

Computational Methods for
Biomarker Discovery in Proteomics
and Glycomics
Vijetha Vemulapalli
School of Informatics
Indiana University
Capstone Advisor: Dr. Haixu Tang
What are Biomarkers?
• Problem Definition
• Background
• LC-MS
• Method
• Results
• CE
• Method
• Results
• Acknowledgements
• References
• Substances present in increased or decreased amounts in body fluids or tissues that indicate exposure, disease or susceptibility to disease.
Some Uses of Biomarkers
• Biomarkers are increasingly being used for the following purposes:
- Prognosis / diagnosis of disease
- Monitoring response to medication
• With high sensitivity and throughput, proteomics and glycomics are capable of identifying many potential biomarkers simultaneously.
More on Biomarkers
[Figure: bar chart of quantity (0 to 10) for substances A through J in normal versus diseased samples]
• Often, biomarkers have not been clearly identified. Even so, samples can be classified as healthy or diseased based on the signature pattern of glycans and proteins.
What is Proteomics?
• Proteins: Chains of amino acids; they include hormones, enzymes and antibodies.
• Proteome: All the proteins in a cell or bodily fluid at a given point in time under certain conditions.
• Proteomics: The study of proteins and proteomes using high-throughput technology.
Image sources: http://parasol.tamu.edu/groups/amatogroup/foldingserver/images/proteinL.gif & http://biology.clc.uc.edu/graphics/bio104/cell.jpg
What is Glycomics?
• Glycoproteins: Proteins with attached polysaccharides.
• Glycans: Polysaccharide chains attached to a protein.
• Glycome: The entire set of glycans present in a cell or a bodily fluid at a certain point in time under certain conditions.
• Glycomics: The study of the structure and function of oligosaccharides in a cell or organism.
Image source: http://www.glyfdis.org/images/bg_image.jpg
High Throughput Technologies to Identify Biomarkers
• Genome-scale scanning: genome level
• Microarrays: transcriptome level
• Proteomics: proteome level
• Glycomics: glycome level
Image source: http://phy.asu.edu/phy598-bio/D4%20Notes%2006_files/image002.jpg
Why the Focus on Proteomics and Glycomics?
• Information content
[Figure: Genome, Transcriptome, Proteome and Glycome arranged along a scale from static to dynamic]
Biomarker Discovery using Proteomics
Liquid Chromatography / Mass Spectrometry (LC/MS)
[Diagram: Protein sample → Liquid Chromatography → Mass Spectrometry → Data]
• Why LC/MS for analysis of proteomes?
- LC spreads the complexity of the sample over time.
- MS identifies ions based on their mass/charge (m/z) value.
• Software currently exists to identify proteins in a sample using data from an LC-MS experiment.
Liquid Chromatography (LC)
• Liquid Chromatography is a technique that separates ions or molecules dissolved in a solvent based on the size of the ion/molecule, adsorption, ion exchange or other similar characteristics.
Image source: http://wwwlb.aub.edu.lb/~webcrsl/high_p3.jpg
What is a Mass Spectrometer?
• A mass spectrometer (MS) is an instrument that identifies ions based on their mass-to-charge ratio.
Source: http://www.chemguide.co.uk/analysis/masspec/howitworks.html & http://www.bmms.uu.se/ltq-ft.htm
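As a small illustration of the mass-to-charge ratio, the sketch below computes the m/z at which an ion would be observed. It assumes positive-mode ionization by protonation, which is common for peptides but is not stated on the slide; the function name and the rounded proton mass constant are choices made here for illustration.

```python
# Mass of a proton in daltons, rounded to six decimals (assumed constant).
PROTON_MASS = 1.007276


def mz_of_protonated_ion(neutral_mass: float, charge: int) -> float:
    """Return m/z for a molecule of the given neutral mass carrying
    `charge` extra protons (assumption: positive-mode protonation)."""
    if charge <= 0:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass + charge * PROTON_MASS) / charge


if __name__ == "__main__":
    # The same molecule appears at different m/z values for different charges.
    for z in (1, 2, 3):
        print(z, round(mz_of_protonated_ion(1000.0, z), 4))
```

The same molecule therefore produces one signal per charge state, which is why the instrument reports m/z rather than mass directly.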
Visualization of LC/MS Data: 2D Map
How Do We Find Biomarkers From LC-MS Data?
• Protein sample → Liquid Chromatography → Mass Spectrometry → Data
• Data → Identification software → Identified proteins and peptides
• Data and identified peptides → MSView → Quantities of peptides identified from the sample
How Do We Find Biomarkers From LC-MS Data? Continued…
• Sample 1 → MSView → Quantification 1
• Sample 2 → MSView → Quantification 2
• Sample 3 → MSView → Quantification 3
• …
• Sample N → MSView → Quantification N
• Quantifications 1 to N → Analyze to find biomarkers
MSView
• Component: Visualization. Purpose: visual comparison / analysis.
• Component: Relative quantification. Purpose: further analysis for biomarker discovery.
Extracted Ion Chromatogram (XIC)
• Chromatogram created by plotting the intensity of the signal observed at a chosen m/z value in a series of mass spectra recorded as a function of retention time.
Source: http://www.lcpackings.com/applications/Probot/images/dual_fract04B.png
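The definition above can be sketched in a few lines of code. This is a minimal illustration, not MSView's implementation: the in-memory data layout (a list of retention-time and peak-list pairs), the function name and the m/z tolerance parameter are all assumptions made here.

```python
from typing import List, Tuple

# Assumed layout: each spectrum is (retention_time, [(mz, intensity), ...]).
Spectrum = Tuple[float, List[Tuple[float, float]]]


def extract_ion_chromatogram(
    spectra: List[Spectrum], target_mz: float, tol: float = 0.01
) -> List[Tuple[float, float]]:
    """Return (retention_time, intensity) points for one chosen m/z value.

    For every spectrum, intensities of all peaks within `tol` of
    `target_mz` are summed; spectra with no such peak contribute 0.
    """
    xic = []
    for rt, peaks in sorted(spectra, key=lambda s: s[0]):
        intensity = sum(
            (i for mz, i in peaks if abs(mz - target_mz) <= tol), 0.0
        )
        xic.append((rt, intensity))
    return xic


if __name__ == "__main__":
    demo = [
        (1.0, [(500.0, 10.0), (600.0, 5.0)]),
        (0.5, [(500.005, 3.0)]),
        (2.0, [(700.0, 1.0)]),
    ]
    print(extract_ion_chromatogram(demo, 500.0))
```

Plotting the returned points with retention time on the horizontal axis gives the chromatogram for that m/z value.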
Visualization: XIC
Relative Quantification using Peptide Identification Results
• Inputs: data from LC-MS experiment; identification of peptides
• MSView steps:
- Extracted Ion Chromatogram of peptide
- Peak selection
- Area calculation
Quantification: Peak Selection Algorithm
• Actual data
• After smoothing
• Selecting local maxima and minima
• Selecting peaks
[Figure: chromatogram panels for each step, with local maxima and minima labeled (Max, Min, Max)]
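The steps named on this slide (smoothing, finding local maxima and minima, selecting a peak, calculating its area) can be sketched as follows. The slides do not say which smoothing filter, peak-selection rule or integration rule is used, so the moving average, the "peak runs between the nearest minima on either side of a maximum" rule and the trapezoidal rule below are assumptions, as are all function names.

```python
def moving_average(y, window=3):
    """Smooth y with a centered moving average (window shrinks at the edges)."""
    half = window // 2
    out = []
    for i in range(len(y)):
        lo, hi = max(0, i - half), min(len(y), i + half + 1)
        out.append(sum(y[lo:hi]) / (hi - lo))
    return out


def local_extrema(y):
    """Return (maxima, minima) as lists of interior indices."""
    maxima, minima = [], []
    for i in range(1, len(y) - 1):
        if y[i] > y[i - 1] and y[i] >= y[i + 1]:
            maxima.append(i)
        elif y[i] < y[i - 1] and y[i] <= y[i + 1]:
            minima.append(i)
    return maxima, minima


def peak_area(t, y, apex, minima):
    """Trapezoidal area of the peak around `apex`, bounded by the nearest
    local minimum on each side (or the ends of the trace if there is none)."""
    left = max((m for m in minima if m < apex), default=0)
    right = min((m for m in minima if m > apex), default=len(y) - 1)
    return sum(
        0.5 * (y[i] + y[i + 1]) * (t[i + 1] - t[i]) for i in range(left, right)
    )


if __name__ == "__main__":
    t = [0, 1, 2, 3, 4, 5, 6, 7]
    y = [1, 3, 1, 0.5, 2, 5, 2, 1]
    maxima, minima = local_extrema(y)
    apex = max(maxima, key=lambda i: y[i])  # assumption: pick the tallest peak
    print(maxima, minima, peak_area(t, y, apex, minima))
```

In practice the extrema would be located on the smoothed trace so that noise does not create spurious peaks, while the area could be taken from either trace.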
Quantification: Sample Results
Biomarker Discovery using Glycomics
How does Capillary Electrophoresis (CE) work?
Image source: http://faculty.washington.edu/dovichi/UBUBTpage/research/Methods/CEintro/ceintro.GIF
What does the data look like?
[Figure: samples from different CE experiments]
Biomarker Discovery using Glycomics – Overview
• Input: data from different samples
• CE Analyze steps:
- Mapping areas corresponding to the same glycan from different samples
- Quantification of mapped peaks
• Analysis of quantification for identifying biomarkers
Direct Comparison: Dynamic Time Warping (DTW)
• The DTW algorithm aligns two time series that have similar curves but are skewed differently over time.
[Figure: two time series aligned by DTW; horizontal axis: Time]
Source: http://db-www.aist-nara.ac.jp/theme/bioinfo_kenji-h_dtw.png
Direct Comparison: DTW continued…
• The Sakoe-Chiba band is used to reduce time and space complexity.
• Parameters used in DTW:
- Band width
- Peak extension penalty
- Difference in peak intensities
- Difference in peak direction
Stan Salvador and Philip Chan. FastDTW: Toward Accurate Dynamic Time Warping in Linear Time and Space. KDD Workshop on Mining Temporal and Sequential Data, 2004.
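A minimal sketch of DTW restricted to a Sakoe-Chiba band is shown below. It is the textbook formulation, not the CE Analyze implementation: of the parameters listed above it uses only the band width and the difference in intensities (as an absolute-difference cost), and it omits the peak extension penalty and peak direction terms, whose definitions the slides do not give. For simplicity it still allocates the full cost matrix, so the band saves time here but not memory.

```python
import math


def dtw_banded(a, b, band):
    """Align sequences a and b with DTW, only evaluating cells (i, j)
    with |i - j| <= band. Returns (total_cost, warping_path), where the
    path is a list of (index_in_a, index_in_b) pairs."""
    n, m = len(a), len(b)
    band = max(band, abs(n - m))  # the band must reach the final corner
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(max(1, i - band), min(m, i + band) + 1):
            d = abs(a[i - 1] - b[j - 1])  # assumed local cost
            cost[i][j] = d + min(
                cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1]
            )
    # Backtrack from the final corner along the cheapest predecessors.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
    path.reverse()
    return cost[n][m], path


if __name__ == "__main__":
    consensus = [0, 1, 2, 1, 0]
    sample = [0, 0, 1, 2, 1, 0]  # same shape, delayed by one point
    print(dtw_banded(consensus, sample, band=1))
```

A narrower band is faster but forbids large shifts between the two traces, so the band width has to be at least as large as the biggest expected migration-time skew.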
Method: Dynamic Time Warping
• Align to consensus sample
• Align next sample to consensus sample
[Figure: a sample trace aligned against the consensus trace]
Method continued…
• Unaligned sample: corresponding peaks
• Aligned sample: corresponding peaks
• Calculate area (Peak 1)
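Once a sample has been aligned to the consensus, the alignment can be used to find the stretch of the sample that corresponds to a consensus peak and to compute its area. The sketch below assumes the alignment is available as a list of (consensus_index, sample_index) pairs, that the points are evenly spaced in time, and that the trapezoidal rule is acceptable; none of these details come from the slides, and the function names are hypothetical.

```python
def corresponding_range(path, start, end):
    """Map the consensus index range [start, end] to the range of sample
    indices that the alignment path pairs it with."""
    matched = [j for i, j in path if start <= i <= end]
    if not matched:
        raise ValueError("no aligned points in the requested range")
    return min(matched), max(matched)


def trapezoid_area(y, lo, hi):
    """Area under y between indices lo and hi, assuming unit spacing."""
    return sum(0.5 * (y[k] + y[k + 1]) for k in range(lo, hi))


if __name__ == "__main__":
    consensus = [0, 1, 2, 1, 0]
    sample = [0, 0, 1, 2, 1, 0]
    # Hypothetical alignment pairing each consensus point with sample points.
    path = [(0, 0), (0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]
    lo, hi = corresponding_range(path, 1, 3)  # consensus peak spans 1..3
    print((lo, hi), trapezoid_area(sample, lo, hi))
```

Repeating this for every consensus peak and every sample yields one area per peak per sample, which is the quantity compared across samples.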
Results
[Figure: corresponding peaks]
Summary
• Proteomics - MSView
- Input: LC-MS data, identified peptides
- Output: quantification results for biomarker discovery
• Glycomics - CE Analyze
- Input: CE data
- Output: quantification results for biomarker discovery
Acknowledgements
Dr. Haixu Tang - My advisor
Dr. Randy J. Arnold
Dr. Milos Novotny
Dr. Sun Kim
Dr. Stephen J. Valentine
Dr. Yehia Mechref
Dr. David E. Clemmer
Dr. Jeong-Hyeon Choi
Yin Wu
Manolo D. Plasencia
School of Informatics
Funding:
NIH/NCRR
MetaCyt Initiative @ Indiana University
References
[1] Higgs, R.E., Knierman, M.D., Gelfanova, V., Butler, J.P. and Hale, J.E. (2005) Comprehensive label-free method for the relative quantification of proteins from biological samples. J. Proteome Res., 4, 1442-1450.
[2] Linsen, L., Locherbach, J., Berth, M., Becher, D. and Bernhardt, J. (2006) Visual Analysis of Gel-Free Proteome Data. IEEE Transactions on Visualization and Computer Graphics, 12, 497-508.
[3] Prakash, A., Mallick, P., Whiteaker, J., Zhang, H., Paulovich, A., Flory, M., Lee, H., Aebersold, R. and Schwikowski, B. (2006) Signal maps for mass spectrometry-based comparative proteomics. Mol. Cell. Proteomics, 5, 423-432.
[4] Leptos, K.C., Sarracino, D.A., Jaffe, J.D., Krastins, B. and Church, G.M. (2006) MapQuant: open-source software for large-scale protein quantification. Proteomics, 6, 1770-1782.
[5] Aebersold, R. and Mann, M. (2003) Mass spectrometry-based proteomics. Nature, 422, 198-207.