EECS 730
Introduction to Bioinformatics
Introduction to Proteomics
Luke Huan
Electrical Engineering and Computer Science
http://people.eecs.ku.edu/~jhuan/
5/6/2017
Proteome: Protein complement of a genome


Time- and cell-specific protein complement of the
genome.
Encompasses all proteins expressed in a cell at one
time, including isoforms and post-translational
modifications.
Proteome



Contrast to genome
• The genome is constant for one cell and identical for all cells of an
organism, and does not change very much within a species
• The proteome is very dynamic with time and in response to
external factors, and differs substantially between cell types.
Variable
• In different cell and tissue types in the same organism
• In different growth and developmental stages of the organism
Dynamic
• Depends on response of genome to environmental factors
  • Disease state
  • Drug challenge
  • Growth conditions
  • Stress
Introduction to proteomics



Proteomics is the study of total protein complements, proteomes,
e.g. from a given tissue or cell type.
• Don’t forget that the proteome is dynamic, changing to reflect the
environment that the cell is in
Definitions
• Classical - restricted to large scale analysis of gene products
involving only proteins
• Inclusive - combination of protein studies with analyses that have
genetic components such as mRNA, genomics, and yeast two-hybrid
Examples of important proteomic questions:
1) What proteins are present?
2) What other proteins does a particular protein interact with
(networks)?
3) What does a particular protein look like (structure)?
Genomics vs. proteomics
Genomics has provided spectacular amounts of data, but
most of it remains uninterpretable at our current level of
understanding.
In some ways, genomics raises more questions than it
answers.
The emerging field of proteomics promises to answer some
of those questions by systematically studying all of the
proteins encoded by the genome.
1 gene = 1 protein?


1 gene is no longer equal to one protein
• In fact, the definition of a gene is debatable (ORF, promoter,
pseudogene, gene product, etc.)
1 gene = how many proteins?
• There are only 30,000 genes in the human genome, yet there are
more than 100,000 proteins in the human proteome.
• Actually, cataloguing the human proteome requires much more
than just 100K proteins.
• 30,000 genes x myriad of modifications >> 100K protein forms!
  • Modifications include: alternate RNA splicing, chemical modifications,
  cleavage
  • Chemical modifications include: phosphorylation, acetylation,
  glycosylation, and many more.
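The "genes x modifications" arithmetic on this slide can be made concrete with a rough sketch. The function name, the isoform count and the number of modification sites below are illustrative assumptions, not measured values, and treating each site as independently modified or unmodified gives only an upper bound.

```python
def count_protein_forms(splice_isoforms, modification_sites):
    """Upper bound on distinct protein forms from one gene.

    Assumes (for illustration only) that every modification site is
    independently either modified or unmodified, giving 2**sites states
    per splice isoform.
    """
    return splice_isoforms * 2 ** modification_sites


# Hypothetical averages: 2 splice isoforms and 2 modifiable sites per gene.
GENES = 30_000  # gene count quoted on the slide
total_forms = GENES * count_protein_forms(splice_isoforms=2, modification_sites=2)
print(total_forms)  # 240000, already well above 100K
```

Even with these very modest assumed averages the count exceeds 100K protein forms, which is the point the slide makes.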
Why proteomics?

Annotation of genomes, i.e. functional annotation






Genome + proteome = annotation
Protein Function
Protein Post-Translational Modification
Protein Localization and Compartmentalization
Protein-Protein Interactions
Protein Expression Studies

Differential gene expression is not the answer
Microarray data doesn’t correlate
perfectly with protein expression levels
Analysis of mRNA transcripts with microarray has
provided dynamic information regarding which genes are
expressed in cells under a given set of experimental
conditions, yielding clues as to which proteins are
involved in certain pathways and disease states.
However, differences in the half-lives of RNA and
proteins, as well as post-translational modifications
important to protein function prevent mRNA profiles from
being perfectly correlated to the cells’ actual protein
profiles.
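Why differing half-lives break the mRNA/protein correlation can be illustrated with a simple one-compartment kinetic model. This is a sketch under stated assumptions: protein is made at a rate proportional to mRNA and degraded with first-order kinetics, so the steady-state level is (translation rate x mRNA) / degradation rate. All numbers and names are hypothetical.

```python
import math


def steady_state_protein(mrna_copies, translation_rate, protein_half_life_h):
    """Steady-state protein copies for dP/dt = k_tl * mRNA - k_deg * P.

    k_deg is derived from the protein half-life: k_deg = ln(2) / half-life.
    """
    degradation_rate = math.log(2) / protein_half_life_h
    return translation_rate * mrna_copies / degradation_rate


# Two hypothetical genes with identical mRNA levels and translation rates,
# differing only in protein half-life (48 h versus 1 h).
stable = steady_state_protein(mrna_copies=10, translation_rate=100, protein_half_life_h=48)
unstable = steady_state_protein(mrna_copies=10, translation_rate=100, protein_half_life_h=1)
print(stable / unstable)  # the stable protein accumulates to a 48-fold higher level
```

A microarray would report the same signal for both genes, while the protein levels differ by the ratio of the half-lives.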
Introduction to proteomics



Composition of the proteome depends on cell type,
developmental phase and conditions
Proteome analyses are still struggling to solve the
"basic proteome" of different cells and tissues, or the
limited changes that occur under changing conditions or during
cellular processes
Current methods can only "see" the most abundant
proteins
Types of proteomics
Protein Expression

Quantitative study of protein expression between samples
that differ by some variable
Structural Proteomics

Goal is to map out the 3-D structure of proteins and protein
complexes
Functional Proteomics

To study protein-protein interactions, 3-D structures, cellular
localization and PTMs in order to understand the
physiological function of the whole proteome.
Large-scale protein analysis




2D protein gels
Yeast two-hybrid
Rosetta Stone approach
Pathways
2D protein electrophoresis and
mass spectrometry
Two-dimensional protein gels
First dimension: isoelectric focusing
Electrophorese ampholytes to establish
a pH gradient
Can use a pre-made strip
Proteins migrate to their isoelectric point
(pI) then stop (net charge is zero)
Range of pI typically 4-9 (5-8 most common)
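The statement that a protein stops where its net charge is zero can be sketched numerically: estimate the net charge at a given pH with the Henderson-Hasselbalch relation and search for the pH where it crosses zero. The pKa values below are approximate and assumed for illustration (published tables differ), and the model ignores structural effects on ionization.

```python
# Approximate pKa values, assumed for illustration; published tables differ.
POSITIVE_PKA = {"K": 10.8, "R": 12.5, "H": 6.5}
NEGATIVE_PKA = {"D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}
N_TERM_PKA = 8.6
C_TERM_PKA = 3.6


def net_charge(sequence, ph):
    """Estimated net charge of a peptide at the given pH."""
    charge = 1.0 / (1.0 + 10 ** (ph - N_TERM_PKA))   # protonated N-terminus
    charge -= 1.0 / (1.0 + 10 ** (C_TERM_PKA - ph))  # deprotonated C-terminus
    for residue in sequence:
        if residue in POSITIVE_PKA:
            charge += 1.0 / (1.0 + 10 ** (ph - POSITIVE_PKA[residue]))
        elif residue in NEGATIVE_PKA:
            charge -= 1.0 / (1.0 + 10 ** (NEGATIVE_PKA[residue] - ph))
    return charge


def isoelectric_point(sequence, low=0.0, high=14.0, tolerance=1e-4):
    """Bisection search for the pH at which the net charge is zero.

    Works because the net charge decreases monotonically with pH.
    """
    while high - low > tolerance:
        mid = (low + high) / 2.0
        if net_charge(sequence, mid) > 0:
            low = mid   # still positively charged: pI lies at higher pH
        else:
            high = mid
    return (low + high) / 2.0


acidic_pi = isoelectric_point("DDDDEE")  # hypothetical acidic peptide
basic_pi = isoelectric_point("KKKKRR")   # hypothetical basic peptide
```

In an isoelectric focusing strip the acidic peptide would settle toward the low-pH end and the basic peptide toward the high-pH end.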
Two-dimensional protein gels
Second dimension: SDS-PAGE
Electrophorese proteins through an acrylamide
matrix
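Proteins are charged and migrate through an
electric field: $v = \frac{E q}{d \cdot 6 \pi r \eta}$
(E: applied voltage, d: distance between electrodes, q: net charge, r: particle radius, η: viscosity of the medium)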
Conditions are denaturing
Can resolve hundreds to thousands of proteins
Proteins identified on 2D gels (IEF/SDS-PAGE)
Protein mass analysis by MALDI-TOF
-- done at core facilities
-- often detects posttranslational modifications
-- MALDI-TOF: matrix-assisted laser desorption/ionization
time-of-flight mass spectrometry
Evaluation of 2D gels (IEF/SDS-PAGE)
Advantages:
Visualize hundreds to thousands of proteins
Improved identification of protein spots
Disadvantages:
Limited number of samples can be processed
Mostly abundant proteins visualized
Technically difficult
Labor-intensive, not really "high-throughput" methods
Yeast-Two-hybrid (Y2H)

Aim:
Identify pairs of physically interacting
proteins.
Solution:
Use the transcription mechanism of the cell
Yeast-two-hybrid: Principles

Recap of biology:

Protein vs. domain
A protein is composed of modules or domains.
Domains are individually folded units within the same protein chain.
The presence of multiple domains in a protein allows the protein to
perform different functions.
[Figure: proteins p1 and p2 drawn as chains of domains d1 to d5]

The central dogma of biology
DNA -> (TRANSCRIPTION) -> RNA -> (TRANSLATION) -> PROTEIN
Yeast-two-hybrid: Principles

Transcriptional activator (TA)



Protein that is required to activate transcription. It has two parts:
A DNA-binding domain (BD): binds to DNA
An activation domain (AD): activates transcription of the DNA
Normal transcription requires both the DNA-binding domain (BD) and the activation
domain (AD) of a transcriptional activator (TA).
Yeast-two-hybrid: Principles



The binding domain and the activation domain do not
necessarily have to be on the same protein.
In fact, a protein with a DNA-binding domain can
activate transcription when simply bound to another
protein containing an activation domain.
This principle forms the basis for the yeast two-hybrid
technique.
Yeast-two-hybrid: Principles

Major components of a Yeast-two-hybrid experiment:

Bait protein – the protein of interest (X): with a DNA binding
domain attached to its N-terminus

Prey protein – its potential binding partner (Y): fused to an
activation domain

A reporter gene (R): a gene whose protein product can be easily
detected and measured
If protein X interacts with protein Y, X and Y form a functional
transcriptional activator and the reporter gene is transcribed.
Use the reporter produced as a measure of interaction
between X and Y.
Yeast two-hybrid transcription
The yeast two-hybrid technique detects protein-protein interactions by
measuring transcription of a reporter gene. If protein X and protein Y
interact, the DNA-binding domain fused to X and the activation domain fused
to Y are brought together to form a functional transcriptional activator (TA). The TA
then activates transcription of the reporter gene downstream of the promoter it binds.
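The reporter logic can be written down as a small model. This is a minimal sketch, not a simulation of real yeast biology: the protein names, the set of true interactions and the self-activating bait are all hypothetical, and the functions are introduced here only for illustration.

```python
def reporter_expressed(bait, prey, interacting_pairs, autoactivating_baits=frozenset()):
    """Return True if the reporter gene would be transcribed.

    The reporter is on when the BD-bait fusion and the AD-prey fusion bind
    each other, reconstituting a transcriptional activator. A bait that
    activates transcription on its own turns the reporter on regardless
    of the prey (a classic false positive).
    """
    if bait in autoactivating_baits:
        return True
    return frozenset((bait, prey)) in interacting_pairs


def matrix_screen(baits, preys, interacting_pairs, autoactivating_baits=frozenset()):
    """Test every bait against every prey and collect reporter-positive pairs."""
    return [
        (bait, prey)
        for bait in baits
        for prey in preys
        if reporter_expressed(bait, prey, interacting_pairs, autoactivating_baits)
    ]


# Hypothetical ground truth: only X and Y physically interact.
true_interactions = {frozenset(("X", "Y"))}
hits = matrix_screen(["X", "Z"], ["Y", "A"], true_interactions)
noisy_hits = matrix_screen(["X", "Z"], ["Y", "A"], true_interactions, autoactivating_baits={"Z"})
```

With the self-activating bait Z, every Z pairing scores positive even though Z interacts with nothing, which is why baits are usually checked alone before screening.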
Yeast two-hybrid screens



Screen a library of proteins for
potential binding partners
Identify interacting proteins in a
pairwise fashion
Feasible at a large scale (genome
scale)
Bait-prey model
[Figure: bait-prey model. The bait protein (X) carries the binding domain, the prey protein (Y) carries the activation domain, and both sit upstream of the reporter gene]
http://depts.washington.edu/sfields/
red = cellular role & subcellular localization of interacting proteins are identical;
blue = localizations are identical; green = cellular roles are identical
Y2H


Identify proteins that are physically associated in vivo.
Use yeast S. cerevisiae as a host

Advantage
Yeast is closer to higher eukaryotes than in vitro experiments or
those systems based on bacterial hosts

Disadvantage
The fused proteins must be able to fold correctly and exist as a
stable protein inside the yeast cells

Weak and transient interactions
Often the most interesting in signaling cascades
Are more readily detected in two-hybrid since the reporter gene
strategy results in a significant amplification.
Always a trade-off between the identification of weak
interactions and the number of false positives
Low overlap among independent experiments

Two sets of independent experiments
Ito et al PNAS 1999
Uetz et al Nature 2000

               Uetz et al.   Ito et al.   Uetz only   Shared   Ito only   Overlap
Interactions   1445          4475         1244        201      4274       <4%
Proteins       1337          3277         482         855      2422       <23%

High false positives and false
negatives in yeast-two hybrid data
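The overlap percentages follow directly from the Venn-diagram counts on this slide: shared items divided by all distinct items in either data set. The sketch below recomputes them; the function name is introduced here for illustration.

```python
def overlap_fraction(only_a, shared, only_b):
    """Shared items as a fraction of all distinct items in either set."""
    return shared / (only_a + shared + only_b)


# Counts from the slide: Uetz-only, shared, Ito-only.
interaction_overlap = overlap_fraction(1244, 201, 4274)
protein_overlap = overlap_fraction(482, 855, 2422)

print(f"interactions: {interaction_overlap:.1%}")  # about 3.5%, i.e. "<4%"
print(f"proteins:     {protein_overlap:.1%}")      # about 22.7%, i.e. "<23%"
```

The two screens share far more proteins than interactions, so the disagreement is mostly about which pairs interact rather than which proteins were tested.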
False positives




Proteins with transcription activation activity (bait works
by itself)
Proteins that normally never see each other (e.g. due to
the time/space constraints) are expressed together and
may be sticky
Proteins are expressed at high levels and this promotes
promiscuous interaction
Another protein bridges the two interacting partners
False negatives

Proteins become toxic upon expression in yeast




Proteins are toxic when expressed and targeted into the
yeast nucleus.
Proteins proteolyse essential yeast proteins or proteins
essential for the system like the DNA binding domain or the
activation domain.
Proteins don’t get into the nucleus (especially membrane
proteins)
Proteins are not modified correctly in the heterologous
environment
Final Remark on Y2H



Although the outcome of a screening often results in
many new hypotheses, they still need to be validated by
other techniques.
There is enough reason to remain sceptical about two-hybrid
screenings, but the most convincing argument in
favor of the two-hybrid is the number of candidate interactions
it yields and the speed at which it yields them
Referred to as functional screens
Interacting proteins might give a functional hint if at least
one of the partners has a known functional commitment in
a well understood signaling pathway.
Analysis of protein complexes

Aim:
Identification of complexes and their subunits.
Solution: a two-step method
Isolation of only relevant complexes
Identification of complex units.
Affinity chromatography/mass spec

Major methods





High throughput mass spectrometric protein
complex identification (HMSPCI)
Tandem affinity purification (TAP)
Again, bait – prey model
Very sensitive method
Identify multi-protein complexes

Not really possible in yeast two-hybrid
Methods
1. Attach tags to bait proteins
• Introduce DNA encoding these into cells
• Cells express modified proteins
• Proteins form complexes with other proteins in vivo
• Cells have to express modified protein properly
  • Tag can interfere with protein folding and function
  • Overexpressed protein may be toxic to cell
[Figure: workflow diagram with steps 1 to 6-9; Kumar and Snyder, 2002]
Methods
2. Bait proteins and associated proteins are precipitated on an
affinity column
• Tag sticks to column along with protein complex
• Elute other proteins
• Elute tagged protein
3. Resolve proteins on an SDS-PAGE gel
• Separates by molecular weight (SDS gives proteins a roughly uniform charge-to-mass ratio)
4. Cut out protein bands
• Proteins of same size will be in same band
5. Digest protein bands with trypsin
• Results in segments of proteins
Methods
Mass spectrometry to analyze protein composition:
6. Samples are vaporized and ionized
7. Ions enter mass analyzer and are separated by mass to charge
ratio
8. Ions are detected and a signal generated
9. Compare signal to database to identify proteins in complex
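Steps 5 and 9 can be sketched as an in-silico version of the same idea: digest candidate sequences with the trypsin rule (cleave after K or R, but not before P), compute peptide masses, and score each database entry by how many observed masses it explains. This is a simplified illustration, not the search algorithm of any particular tool. The sequences and database names are hypothetical; the residue masses are standard monoisotopic values.

```python
# Monoisotopic residue masses in daltons.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056


def trypsin_digest(sequence):
    """Cleave C-terminal to K or R, except when the next residue is P."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def peptide_mass(peptide):
    """Neutral monoisotopic mass: residue masses plus one water."""
    return sum(RESIDUE_MASS[r] for r in peptide) + WATER


def rank_candidates(observed_masses, database, tolerance=0.01):
    """Rank database proteins by the number of observed masses they explain."""
    scores = {}
    for name, sequence in database.items():
        theoretical = [peptide_mass(p) for p in trypsin_digest(sequence)]
        scores[name] = sum(
            any(abs(obs - theo) <= tolerance for theo in theoretical)
            for obs in observed_masses
        )
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical two-entry database and a "spectrum" derived from entry A.
database = {"hypothetical_A": "GASPKVTLR", "hypothetical_B": "MNDKQEFR"}
observed = [peptide_mass(p) for p in trypsin_digest(database["hypothetical_A"])]
ranking = rank_candidates(observed, database)
```

Real searches also account for charge states, missed cleavages and modifications, which this sketch leaves out.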
Affinity chromatography/mass spec
Data on complexes deposited in databases
http://yeast.cellzome.com
http://www.bind.ca
Affinity chromatography/mass spec
False positives:
• sticky proteins
[Figure: GST-tagged bait protein]
Affinity chromatography/mass spec
False negatives:
• Bait must be properly localized and
in its native condition
• Affinity tag may interfere with function
• Transient protein interactions may be missed
• Highly specific physiological conditions
may be required
• Bias against hydrophobic and small proteins
[Figure: GST-tagged bait protein]
The Rosetta Stone approach
Marcotte et al. (1999) and other groups hypothesized
that some pairs of interacting proteins are encoded by
two genes in many genomes, but occasionally they
are fused into a single gene.
By scanning many genomes for examples of “fused
genes,” several thousand protein-protein interaction predictions
have been made.
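The scanning step can be sketched as follows, assuming each protein has already been assigned to homology families by some sequence comparison (that step is not shown). The family labels are hypothetical, and this is an illustration of the idea rather than the published pipeline.

```python
from itertools import combinations


def rosetta_stone_predictions(query_genome, reference_genome):
    """Predict interacting protein pairs in query_genome.

    Each genome maps a protein name to the set of homology families found
    in it. A reference protein containing two families (a "fused gene")
    links the query proteins that carry those families separately.
    Returns a set of (protein_a, protein_b, fused_protein) tuples.
    """
    predictions = set()
    for fused_name, families in reference_genome.items():
        if len(families) < 2:
            continue  # not a fusion of distinct families
        for family_a, family_b in combinations(sorted(families), 2):
            carriers_a = [p for p, f in query_genome.items() if f == {family_a}]
            carriers_b = [p for p, f in query_genome.items() if f == {family_b}]
            for protein_a in carriers_a:
                for protein_b in carriers_b:
                    predictions.add((protein_a, protein_b, fused_name))
    return predictions


# Family labels are hypothetical placeholders for homology assignments.
e_coli = {"gyrase A": {"gyrA_like"}, "gyrase B": {"gyrB_like"}}
yeast = {"topoisomerase II": {"gyrA_like", "gyrB_like"}}
predicted = rosetta_stone_predictions(e_coli, yeast)
```

Here the single yeast protein acts as the "Rosetta Stone" that links the two separate E. coli proteins.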
The Rosetta Stone approach
[Figure: yeast topoisomerase II aligned with E. coli gyrase B and E. coli gyrase A; the single yeast protein corresponds to the two separate E. coli proteins]
Fig. 8.23
Page 256
Function Prediction from Interaction


It is possible to deduce functions of a protein from
the functions of its interaction partners.
A difficult task:
Within-class, cross-class interactions
Available methods based on protein interaction
Neighbor counting method
Methods based on χ2-statistics
Markov Random Fields
Simulated annealing
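The simplest of these methods, neighbor counting, can be sketched in a few lines: an unannotated protein is assigned the functions that occur most often among its direct interaction partners. The network and annotations below are hypothetical, and the function is a minimal illustration rather than a published implementation.

```python
from collections import Counter


def predict_functions(protein, interactions, known_functions, top_k=1):
    """Return the top_k most frequent functions among a protein's neighbors.

    interactions: iterable of (protein_a, protein_b) pairs (undirected).
    known_functions: dict mapping a protein to an iterable of function labels;
    unannotated neighbors contribute nothing.
    """
    neighbors = set()
    for a, b in interactions:
        if a == protein:
            neighbors.add(b)
        elif b == protein:
            neighbors.add(a)
    counts = Counter()
    for neighbor in neighbors:
        counts.update(known_functions.get(neighbor, ()))
    return [function for function, _ in counts.most_common(top_k)]


# Hypothetical interaction network and annotations.
network = [("P1", "P2"), ("P1", "P3"), ("P4", "P1"), ("P2", "P5")]
annotations = {
    "P2": ["DNA repair"],
    "P3": ["DNA repair", "transcription"],
    "P4": ["DNA repair"],
}
prediction = predict_functions("P1", network, annotations)
```

Raw counts ignore how common each function is across the whole network, which is the limitation the χ2-based and Markov random field methods address.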