Extracting quantitative information
from proteomic 2-D gels
Lecture in the bioinformatics course "Gene expression and cell models"
April 20, 2005
John Gustafsson
Mathematical Statistics
Chalmers
Proteomics lectures:
starting points
• Anders’ starting point this Monday:
– Let’s say that we want to study life at the protein
level – what technologies do we have at hand?
• Today’s lecture:
– How can we get (large-scale) quantitative
measurements of protein amounts, so that we
can do statistics and bioinformatics?
Content and structure
• Proteomics
• The 2-D gel technology
• Extracting quantitative information
– Image analysis of 2-D gels
• Comparison with microarrays
• Statistical analysis of quantitative 2-D gel data
Proteomics
[Figure: diagram with the labels DNA, mRNA, P (protein), 2-D gels, modification, production, co-factors, degradation, localisation, interaction, TDP and ACTIVITY]
2-D gel electrophoresis:
Protein separation and quantification
[Figure: a "protein soup" separated on a 2-D gel; one axis is molecular charge (acidic/alkaline), the other is molecular size (small/large)]
spot volume ∝ protein quantity
A typical 2-D gel experiment
experimental design
→ biological experiment (example: control and treatment)
→ protein extracts
→ 2-D gel electrophoresis
→ 2-D gel images
→ image analysis
→ quantified data
→ statistical analysis
→ conclusions
Quantified data: a matrix with spot volume data
rows: proteins (many)
columns: gels (few)
\[
\begin{pmatrix}
z_{111} & \cdots & z_{115} & z_{121} & \cdots & z_{125} \\
z_{211} & \cdots & z_{215} & z_{221} & \cdots & z_{225} \\
\vdots  &        & \vdots  & \vdots  &        & \vdots  \\
z_{m11} & \cdots & z_{m15} & z_{m21} & \cdots & z_{m25}
\end{pmatrix}
\]
The image analysis task
• The task
1. In each gel image: find and quantify the protein spots
2. In the group of gel images: match protein spots in different images that correspond to the same protein
• Issues
– automation
– time
Pseudo-color superposition 1(3)
0M NaCl
1M NaCl
Pseudo-color superposition 2(3)
0M NaCl
1M NaCl
Pseudo-color superposition 3(3)
(red: 0M NaCl, blue: 1M NaCl)
The standard solution
– workflow
In each gel image
1. Background subtraction
2. Spot detection
3. Spot quantification
In the group of gel images
4. Spot pattern matching
1. Background subtraction
[Figure: image before background subtraction − estimated background = image after background subtraction]
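The slide does not say how the background is estimated, so the following is only a minimal sketch under a strong simplifying assumption: the background is one constant level, taken as a low percentile of all pixel intensities. The function name `subtract_background` and the `percentile` parameter are hypothetical; real gel backgrounds usually vary across the image and need a local estimate.

```python
def subtract_background(image, percentile=10):
    """Subtract a constant background level from a gel image.

    `image` is a list of rows of non-negative pixel intensities.
    Assumption of this sketch: the background is a single level,
    estimated as a low percentile of all pixels. Results are clipped at zero.
    """
    pixels = sorted(value for row in image for value in row)
    # Position of the chosen percentile in the sorted pixel list.
    index = min(len(pixels) - 1, (len(pixels) * percentile) // 100)
    background = pixels[index]
    return [[max(0, value - background) for value in row] for row in image]


# Toy 4x4 "gel" with one bright spot on a background of about 10.
gel = [
    [10, 11, 10, 12],
    [10, 60, 90, 11],
    [12, 70, 80, 10],
    [11, 10, 12, 10],
]
corrected = subtract_background(gel)
```

After the call, the spot pixels keep most of their intensity while the surrounding pixels drop to zero or close to it.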
2. Spot detection /
image segmentation
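As an illustration of what segmentation can mean here, the sketch below thresholds the image and groups neighbouring pixels above the threshold into regions. This is an assumed, deliberately simple method, not the algorithm of any particular gel analysis package; `detect_spots` and `threshold` are hypothetical names.

```python
def detect_spots(image, threshold):
    """Return a list of spots; each spot is a list of (row, col) pixels.

    Pixels with intensity above `threshold` are grouped into
    4-connected regions (an assumption of this sketch).
    """
    n_rows, n_cols = len(image), len(image[0])
    seen = [[False] * n_cols for _ in range(n_rows)]
    spots = []
    for r in range(n_rows):
        for c in range(n_cols):
            if seen[r][c] or image[r][c] <= threshold:
                continue
            # Flood fill from this seed pixel to collect one region.
            stack, region = [(r, c)], []
            seen[r][c] = True
            while stack:
                y, x = stack.pop()
                region.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < n_rows and 0 <= nx < n_cols
                            and not seen[ny][nx] and image[ny][nx] > threshold):
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            spots.append(region)
    return spots


# Toy background-corrected image with two separate spots.
gel = [
    [0, 0, 0, 0, 0, 0],
    [0, 50, 60, 0, 0, 0],
    [0, 40, 0, 0, 30, 0],
    [0, 0, 0, 0, 35, 0],
]
spots = detect_spots(gel, threshold=20)
```

A fixed threshold like this misses faint spots and merges spots that touch, which is one reason spot detection on real gels is hard.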
3. Spot quantification
spot volume ∝ protein quantity
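A minimal sketch of the quantification step, assuming that the spot volume is the sum of the background-corrected pixel intensities inside the spot region. The region is given here by hand as a hypothetical result of a detection step; `spot_volume` is an illustrative name.

```python
def spot_volume(image, region):
    """Sum the background-corrected intensities over a spot's pixels."""
    return sum(image[r][c] for r, c in region)


gel = [
    [0, 0, 0, 0],
    [0, 50, 60, 0],
    [0, 40, 0, 0],
]
region = [(1, 1), (1, 2), (2, 1)]  # hypothetical spot from a detection step
volume = spot_volume(gel, region)
```

The resulting number is what would go into one cell of the spot volume data matrix.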
4. Spot pattern matching
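To make the matching step concrete, here is a sketch that pairs spot centres from two gels greedily by distance, accepting only pairs closer than a tolerance. This is an assumed toy method: the function `match_spots` and the parameter `max_distance` are hypothetical, and real gels are distorted relative to each other, so plain nearest-neighbour matching is rarely sufficient on its own.

```python
import math


def match_spots(reference, other, max_distance):
    """Greedily pair spot centres from two gels by nearest distance.

    `reference` and `other` are lists of (x, y) spot centres.
    Returns a sorted list of (reference_index, other_index) pairs.
    """
    candidates = []
    for i, (x1, y1) in enumerate(reference):
        for j, (x2, y2) in enumerate(other):
            d = math.hypot(x1 - x2, y1 - y2)
            if d <= max_distance:
                candidates.append((d, i, j))
    candidates.sort()  # closest candidate pairs first
    used_ref, used_other, pairs = set(), set(), []
    for d, i, j in candidates:
        if i not in used_ref and j not in used_other:
            used_ref.add(i)
            used_other.add(j)
            pairs.append((i, j))
    return sorted(pairs)


gel_a = [(10.0, 10.0), (30.0, 12.0), (50.0, 40.0)]
gel_b = [(31.0, 13.0), (11.0, 9.0), (80.0, 80.0)]
pairs = match_spots(gel_a, gel_b, max_distance=5.0)
```

Spots without a partner within the tolerance stay unmatched, which leaves missing values in the spot volume data for those proteins.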
Limitations
• Technological
– hydrophobic proteins don’t dissolve
– limited pI/size coverage
– limited labeling/staining
• Image analytical
– limited global matching efficiency of automatic algorithms
– need for time-consuming manual guidance
– "the image analysis bottleneck"
Limited global matching
efficiency
Voss and Haberl (2000)
Incomplete spot detection:
Faint spots
Detected
Not detected
Incomplete spot detection:
Close spots
Content and structure
– revisited
• Proteomics
• The 2-D gel technology
• Extracting quantitative information
– Image analysis of 2-D gels
• Comparison with microarrays
• Statistical analysis of quantitative 2-D gel data
Comparison with microarrays
                    2-D gels                Microarrays
Labeling            one channel*            one or two-color
Background subtr.   yes                     yes
Spot detection      HARD                    easy
Spot quantitation   can be difficult        quite easy
Spot matching       HARD                    known
Identification      MS or reference atlas   known
*) recently also two-color
Variability
[Figure: 2-D gel images from biological replications under two growth conditions, normal and 1M NaCl]
Variance versus mean dependence
• A dot in the plot:
– the measurement of one protein
• slope = 2 on a log-log scale ⇒ variance ∝ mean²
• The quadratic dependence indicates a multiplicative error structure
(2x5 gel set; normal growth condition)
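A short derivation may help to see why a quadratic variance versus mean dependence points to multiplicative errors. It assumes a model that is not spelled out on the slide: the measured spot volume z is the true level μ times a random error factor ε with mean 1 and variance σ².

```latex
z = \mu\,\varepsilon, \qquad
\mathrm{E}[\varepsilon] = 1, \quad \mathrm{Var}(\varepsilon) = \sigma^2
\;\Longrightarrow\;
\mathrm{E}[z] = \mu, \qquad \mathrm{Var}(z) = \mu^2 \sigma^2 ,

\log \mathrm{Var}(z) = 2 \log \mathrm{E}[z] + \log \sigma^2 .
```

Under this assumption, a plot of log variance against log mean is a straight line with slope 2, and its intercept is determined by the size of the error factor.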
Why transform the data?
• A mathematical data transformation can be used to
– make errors more normally distributed
– stabilize the variance versus mean dependence
• The model is then simpler on the transformed scale than on the original scale
• This simplifies the subsequent analysis
Logarithmic data transformation
• Stabilized variance versus mean dependence after a logarithmic data transformation
(2x5 gel set; normal growth condition)
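The stabilizing effect can be illustrated with simulated data. The sketch below assumes a multiplicative log-normal error factor, which is an assumption of this example and not data from the lecture; the levels 100 and 10000, the value `sigma=0.2` and the function name `simulate` are all hypothetical.

```python
import math
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is repeatable


def simulate(true_level, n=2000, sigma=0.2):
    """Simulated spot volumes: true level times a log-normal error factor."""
    return [true_level * rng.lognormvariate(0.0, sigma) for _ in range(n)]


low = simulate(100.0)      # a weakly expressed protein
high = simulate(10000.0)   # a strongly expressed protein

# Ratio of variances between the two proteins on each scale.
raw_ratio = statistics.variance(high) / statistics.variance(low)
log_ratio = (statistics.variance([math.log(z) for z in high])
             / statistics.variance([math.log(z) for z in low]))

print(f"variance ratio, original scale: {raw_ratio:.0f}")
print(f"variance ratio, log scale:      {log_ratio:.2f}")
```

On the original scale the variance grows strongly with the mean, while on the log scale the two variances are about equal, because the logarithm turns the multiplicative error into an additive one.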
Statistical analysis of
quantitative 2-D gel data
Examples:
• Test of differential expression
• Cluster analysis
– cluster proteins
– cluster cell/tissue samples
• Classification
– classify tissue samples (e.g. tumor classes)
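As one possible instance of a test of differential expression, the sketch below computes Welch's t statistic for each protein on log-transformed spot volumes. The choice of test is an assumption of this example, and the protein names and volumes are made up for illustration.

```python
import math
import statistics


def welch_t(group_a, group_b):
    """Welch's t statistic for two groups of log-transformed spot volumes."""
    mean_a, mean_b = statistics.mean(group_a), statistics.mean(group_b)
    var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)
    standard_error = math.sqrt(var_a / len(group_a) + var_b / len(group_b))
    return (mean_b - mean_a) / standard_error


# Hypothetical spot volumes for two proteins, 5 gels per growth condition.
volumes = {
    "protein_1": ([100, 110, 95, 105, 90], [210, 190, 230, 200, 220]),
    "protein_2": ([500, 520, 480, 510, 490], [505, 495, 515, 485, 500]),
}
t_values = {}
for name, (control, treatment) in volumes.items():
    log_control = [math.log(z) for z in control]
    log_treatment = [math.log(z) for z in treatment]
    t_values[name] = welch_t(log_control, log_treatment)
```

A large absolute t value, as for the hypothetical `protein_1`, suggests a change between conditions. With many proteins and few gels, the resulting p-values would also need a correction for multiple testing.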
Summary
• Proteomics
• The 2-D gel technology
• Extracting quantitative information
– Image analysis of 2-D gels
• Comparison with microarrays
• Statistical analysis of quantitative 2-D gel data
An alternative approach to the
matching problem
• The standard solution
– First spot detection
– Then matching of point patterns
• An alternative, recent approach
– Matching at the pixel level
– Computationally heavy
Gel matching at the pixel level
[Figure: original image, aligned image and reference image; the alignment is done by image warping]
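To give a feel for matching at the pixel level, the sketch below searches over integer translations of an image and keeps the one that minimizes the mean squared pixel difference to a reference image. A pure translation is only the simplest special case of a warp and is used here as an assumption to keep the example short; `best_shift` and `max_shift` are hypothetical names.

```python
def best_shift(image, reference, max_shift=2):
    """Find the integer (dy, dx) shift of `image` that best fits `reference`.

    Every shift in [-max_shift, max_shift] is tried, and the mean squared
    pixel difference over the overlapping area is minimised.
    """
    n_rows, n_cols = len(reference), len(reference[0])
    best = None
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            total, count = 0.0, 0
            for r in range(n_rows):
                for c in range(n_cols):
                    sr, sc = r - dy, c - dx  # source pixel in `image`
                    if 0 <= sr < n_rows and 0 <= sc < n_cols:
                        total += (image[sr][sc] - reference[r][c]) ** 2
                        count += 1
            score = total / count
            if best is None or score < best[0]:
                best = (score, dy, dx)
    return best[1], best[2]


reference = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 9, 5, 0],
    [0, 0, 5, 3, 0],
    [0, 0, 0, 0, 0],
]
# The same spot, displaced one pixel up and one pixel to the left.
image = [
    [0, 0, 0, 0, 0],
    [0, 9, 5, 0, 0],
    [0, 5, 3, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
shift = best_shift(image, reference)
```

Even this toy search visits every pixel for every candidate shift, which hints at why pixel-level matching with flexible warps is computationally heavy.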
Future alternatives to
quantitative 2-D gels?
• Quantitative mass spectrometry
• Protein arrays