Introduction
• Genomics includes the study of mRNA expression; the full set of genetic
information in an organism contains the recipes for making proteins
• Proteins constitute the “bricks and mortar” of cells and do most of the
work
• Proteins distinguish the various types of cells: since all cells have
essentially the same genome, their differences are dictated by which
genes are active and which corresponding proteins are made
• Similarly, diseased cells may produce different proteins from healthy
cells
• However, the task of studying proteins is often more difficult than
studying genes (e.g. post-translational modifications can dramatically
alter protein function)
Proteome
• The term was first proposed in 1994 by Marc Wilkins,
then a PhD student in Australia
• “Proteome”: the protein complement
encoded by a genome
The taxonomy of genomics biology
Proteomics
• Identification of all the proteins made in a given cell, tissue
or organism
• Identification of the intracellular networks associated with
these proteins
• Identification of the precise 3D structure of relevant
proteins to enable researchers to identify potential drug
targets that turn proteins “on” or “off”
• Proteomics very much requires a coordinated focus
involving physicists, chemists, biologists and computer
scientists
Why Proteomics
• Major challenge: how do we go from the treasure chest of
information yielded by genomics to an understanding of cellular
function?
• Genomics based approaches initially use computer-based
similarity searches against proteins of known function
• Results may allow some broad inferences to be made about
possible function
• However, a significant percentage (>30%) of the
sequences thus far ascertained seem to code for proteins
that are unrelated at this level to proteins of known
function
Why Proteomics
• Beyond the genetic make-up of an individual or organism,
many other factors determine gene and ultimately protein
expression and therefore affect proteins directly
• These include environmental factors such as pH, hypoxia and
drug treatment, to name a few
• Examination of the genome alone cannot take into account
complex multigenic processes such as ageing, stress and
disease, or the fact that the cellular phenotype is influenced
by networks of interacting pathways that are regulated in a
coordinated way or that overlap
Why Proteomics
• Genomic analysis has certainly provided us with much insight into the
possible role of particular genes in disease
• However, proteins are the functional output of the cell, and their
dynamic nature in specific biological contexts is critical
• The expression or function of proteins is modulated at many diverse
points from transcription to post-translation and very little of this can
be predicted from a simple analysis of nucleic acids alone
• There is generally poor correlation between the abundance of mRNA
transcribed from the DNA and the respective proteins translated from
that mRNA
• Furthermore, transcript splicing can yield different protein forms
• Proteins can undergo extensive modifications such as glycosylation,
acetylation, and phosphorylation which can lead to multiple protein
products from the same gene
Proteomics Tools
• The core methodologies for displaying the
proteome are a combination of advanced
separation techniques, principally
two-dimensional electrophoresis (2D-GE), and mass
spectrometry
2D-GE: basic methodology
• Sample (tissue, serum, cell extract) is solubilized and the proteins are
denatured into polypeptide components
• This mixture is separated by isoelectric focusing (IEF): on application of a
current, the charged polypeptide subunits migrate in a polyacrylamide gel strip
that contains an immobilized pH gradient (IPG) until they reach the pH at which
their overall charge is neutral (isoelectric point, or pI), producing a gel
strip with distinct protein bands along its length
• This strip is applied to the edge of a rectangular slab of polyacrylamide gel
containing SDS. The focused polypeptides migrate in an electric current into
the second gel and are separated on the basis of their molecular size
• The resultant gel is stained (Coomassie, silver, fluorescent stains) and spots are
visualized by eye or an imager. Typically 1000-3000 spots can be visualized
with silver. Complementary techniques, e.g. immunoblotting, allow greater
sensitivity for specific molecules
• Multiple forms of individual proteins can be visualized; the particular
subset of the proteome that is examined is determined by factors such
as initial solubilization conditions, pH range of the IPG and gel gradient
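The two separation dimensions described above (pI, then molecular size) can be approximated from sequence alone. The following is a minimal sketch, not a validated predictor: the pKa values are one assumed set (published tables differ by several tenths of a pH unit), the sequence is hypothetical, and real spots shift with post-translational modifications and gel conditions.

```python
# Assumed pKa values for ionizable groups (published sets differ slightly).
POSITIVE_PKA = {"K": 10.5, "R": 12.5, "H": 6.0}
NEGATIVE_PKA = {"D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1}
N_TERM_PKA, C_TERM_PKA = 9.0, 2.0

# Average residue masses in daltons; one water is added per chain.
AVG_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_AVG = 18.0153


def net_charge(sequence, ph):
    """Henderson-Hasselbalch net charge of an unmodified polypeptide."""
    charge = 1.0 / (1.0 + 10 ** (ph - N_TERM_PKA))      # free N-terminus
    charge -= 1.0 / (1.0 + 10 ** (C_TERM_PKA - ph))     # free C-terminus
    for residue in sequence:
        if residue in POSITIVE_PKA:
            charge += 1.0 / (1.0 + 10 ** (ph - POSITIVE_PKA[residue]))
        elif residue in NEGATIVE_PKA:
            charge -= 1.0 / (1.0 + 10 ** (NEGATIVE_PKA[residue] - ph))
    return charge


def isoelectric_point(sequence, tolerance=0.001):
    """Bisect on pH 0-14; net charge falls monotonically as pH rises."""
    low, high = 0.0, 14.0
    while high - low > tolerance:
        mid = (low + high) / 2.0
        if net_charge(sequence, mid) > 0:
            low = mid
        else:
            high = mid
    return (low + high) / 2.0


def molecular_weight(sequence):
    """Average molecular weight in daltons (residues plus one water)."""
    return sum(AVG_MASS[r] for r in sequence) + WATER_AVG


def virtual_spot(sequence):
    """Predicted (pI, MW) coordinates of a spot on a virtual 2D gel."""
    return round(isoelectric_point(sequence), 2), round(molecular_weight(sequence), 1)


if __name__ == "__main__":
    toy = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"  # hypothetical sequence
    pi, mw = virtual_spot(toy)
    print(f"pI ~ {pi}, MW ~ {mw} Da")
```

Bisection is used because the net charge decreases steadily with pH, so the sign of the charge tells which half of the pH interval contains the pI.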
General schematic of 2D-PAGE
for protein identification in Toxicology
General strategy for proteomic analysis
Sample growth
Sample solubilization
Isoelectric focusing (IPG)
2D-PAGE
Immunoblot (Western)
Image analysis
Isolation of spots of interest
Trypsin digestion of proteins
MS analysis of tryptic fragments
Identification of proteins
Nature of IPG determines spot location on
2D-PAGE
Limitations of 2D-GE
• In large-scale proteomic analysis, 2D-GE has been the major workhorse
over the last 20 years; it is unique in being able to distinguish
post-translational modifications and is analytically quantitative
• However, despite significant improvements to the technique (e.g. immobilized
pH gradients) and its coupling with MS analysis, it is still difficult
to automate
• Although at first glance the resolution of 2D-GE seems very impressive, it still
lags behind the enormous diversity of proteins, and thus comigrating protein
spots are not uncommon
• This is especially of concern when trying to distinguish between highly
abundant proteins, e.g. actin (10^8 molecules/cell), and low-abundance proteins
such as transcription factors (100-1000 molecules/cell); this is beyond the
dynamic range of 2D-GE
• Enrichment or prefractionation can often overcome such discrepancies
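The dynamic-range problem can be made concrete with a short calculation. This sketch takes the copy numbers quoted on the slide, reading the actin figure as 10^8 molecules per cell (an assumption about a lost superscript), and expresses the gap in orders of magnitude.

```python
import math


def orders_of_magnitude(abundant_copies, rare_copies):
    """Log10 ratio between an abundant and a rare protein's copy number."""
    return math.log10(abundant_copies / rare_copies)


if __name__ == "__main__":
    actin = 1e8                      # copies per cell, as read from the slide
    for tf_copies in (100, 1000):    # transcription factor range from the slide
        gap = orders_of_magnitude(actin, tf_copies)
        print(f"{tf_copies} copies/cell: {gap:.0f} orders of magnitude below actin")
```

A gap of five to six orders of magnitude is why a single stained gel cannot show both classes of protein at once without enrichment or prefractionation.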
Limitations of 2D-GE
• Chemical heterogeneity of proteins also presents a major limitation
• The full range of pIs and MWs of proteins exceeds what can
routinely be analyzed on 2D-GE. However, improvements to IPGs are
expected to overcome some of these constraints and greatly improve
the coverage of the entire proteome of the cell
• Problems linked with extraction and solubilization of proteins prior to
2D-GE present an even greater challenge, especially for extremely
hydrophobic proteins such as membrane and nuclear proteins. Again,
recent advances in buffer composition have diminished the scale of this
problem
Protein identification and characterization
• Specialized imaging software allows for a more detailed analysis of
spot identification and comparison between gels, and treatments
• By a process of subtraction, differences (e.g. presence, absence, or
intensity of proteins or different forms) between healthy and diseased
samples can be revealed
• Cross-references to protein databases allow assignment by known pIs
and apparent molecular size. Ultimate protein identification requires
spot digestion (enzymatic) and analysis of charge and mass by mass
spectrometry (MS)
• Spot cutter tools can be coupled to image analysis tools, and in-gel
tryptic digestion techniques in 96- or 384-well format can greatly
reduce the bottleneck in sample identification by MS
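The "process of subtraction" between gels can be sketched as a comparison of matched spot intensities. This is an illustrative outline only: the spot IDs and intensities are hypothetical, the two-fold threshold is an assumed convention, and real image-analysis software also handles spot matching, background correction and normalization.

```python
def compare_spots(control, treated, fold_threshold=2.0):
    """Classify matched spots by presence/absence or fold change in intensity.

    control, treated: dicts mapping spot ID -> normalized intensity.
    """
    calls = {}
    for spot in sorted(set(control) | set(treated)):
        c = control.get(spot, 0.0)
        t = treated.get(spot, 0.0)
        if c == 0 and t == 0:
            calls[spot] = "unchanged"
        elif c == 0:
            calls[spot] = "appeared"       # present only in treated sample
        elif t == 0:
            calls[spot] = "disappeared"    # present only in control sample
        elif t / c >= fold_threshold:
            calls[spot] = "up"
        elif t / c <= 1.0 / fold_threshold:
            calls[spot] = "down"
        else:
            calls[spot] = "unchanged"
    return calls


if __name__ == "__main__":
    healthy = {"s1": 100, "s2": 50, "s3": 80, "s4": 10}    # hypothetical data
    diseased = {"s1": 300, "s2": 52, "s4": 2, "s5": 40}    # hypothetical data
    for spot, call in compare_spots(healthy, diseased).items():
        print(spot, call)
```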
Protein analysis by MS
• Compared to sequencing, MS is more sensitive (femtomole to attomole
levels) and is higher throughput
• Digestion of an excised spot with trypsin results in a mixture of peptides. These
are ionized by electrospray ionization (ESI) from the liquid state or by
matrix-assisted laser desorption/ionization (MALDI, as in MALDI-TOF) from the
solid state, and the mass of the ions is measured by various coupled analyzers
(e.g. time of flight (TOF) measures the time for ions to travel from the source
to the detector), resulting in a peptide fingerprint
• The resultant signature is compared with the peptide masses predicted from
theoretical digestion of protein sequences found in databases, leading to
identification of the protein
• Tandem MS allows one to obtain actual protein sequence information: discrete
peptide ions can be selected and further fragmented, and complex algorithms
are employed to correlate experimental data with database-derived peptide
sequences
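The peptide-fingerprint comparison described above can be sketched as an in silico tryptic digest followed by mass matching. This is a minimal illustration under stated assumptions: the two database sequences are hypothetical, no missed cleavages or modifications are considered, the 50 ppm tolerance is arbitrary, and candidates are ranked by a simple match count rather than a probabilistic score of the kind real search engines use.

```python
# Monoisotopic residue masses in daltons.
MONO_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.00728

# Hypothetical toy "database"; real searches use curated sequence databases.
TOY_DATABASE = {
    "prot_A": "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ",
    "prot_B": "GASPVTCLNDQKEMHFRYW",
}


def tryptic_digest(sequence):
    """Cleave after K or R unless the next residue is P (no missed cleavages)."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def peptide_mh(peptide):
    """Monoisotopic mass of the singly protonated peptide, [M+H]+."""
    return sum(MONO_MASS[r] for r in peptide) + WATER + PROTON


def rank_candidates(observed_mz, database, tol_ppm=50.0):
    """Rank database proteins by how many observed masses they explain."""
    scores = []
    for name, sequence in database.items():
        theoretical = [peptide_mh(p) for p in tryptic_digest(sequence)]
        matched = sum(
            any(abs(obs - theo) / theo * 1e6 <= tol_ppm for theo in theoretical)
            for obs in observed_mz
        )
        scores.append((name, matched))
    return sorted(scores, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Simulated "observed" peaks taken from three prot_A peptides.
    observed = [peptide_mh(p) for p in ("TAYIAK", "QISFVK", "SHFSR")]
    for name, matched in rank_candidates(observed, TOY_DATABASE):
        print(f"{name}: {matched} of {len(observed)} peaks matched")
```

Tandem MS goes a step further than this mass-only comparison by matching fragment ions of individual peptides, which is what yields sequence information.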
Schematic of MALDI process and instrument
Schematic of a QTOF instrument
MALDI peptide identification of a protein
MS detection/ sensitivity limits
Assessment of post-translational
modification by proteomics
Nature Biotech. 2001 (19)
379-382
General strategy for MS-based Id of proteins and
post-translational modifications
Proteomic bioinformatics
• Proteomic analysis requires highly sophisticated
bioinformatic tools, not only for electrophoretic and MS
separation but also for the assignment of physicochemical
properties and the prediction of potential post-translational
modifications and 3D structures
• Databases exist for the protein maps of a broad range of
organisms, tissues, and disease states
• Ultimately, given the dynamic nature of the proteome,
complex experimental details and related results need to be
interpreted in the context of the relevant biochemical
pathways or disease implications
Initiate database interrogations
Coordinate independent retrieval
Interpretation: Co-occurrence and rank
Re-ranking
Decision: “Novel or previously studied”
Comparison with data generated by genomic analysis
Published access tools for protein ID
and databases on the web
Proteomics applications
• Pharmaceutical development: functional genomics and proteomics have
generated a plethora of new potential drug targets
• Increased efficiency in the lead optimization and preclinical phases of drug
development
• Signature patterns of drug toxicity (on/off, dose response, temporal effects)
• Evaluation of drug toxicity and drug-drug interaction is further
enhanced by both approaches, e.g. the nephrotoxicity of cyclosporine
and the liver toxicity of etomoxir, a potential anti-diabetic (2D-GE
patterns revealed aberrant protein expression profiles on drug treatment)
• Neurological disorders
• Heart disease
• Screening of microbial protein profiles conferring drug resistance
Assessment of acetaminophen toxicity in mouse liver
“Our Approach”
Comparison of sensitivity of silver (A) and fluorescence-based SYPRO Ruby (B)
stains for protein detection by 2D-PAGE
Overlay comparing two separate Ruby-stained 2D gels (Serum vs SF), demonstrating
versatility of PDQuest software in spot assignment
2D-PAGE examination of ubiquitination status of proteins isolated from
serum-starved p53 (+/+) MEF: triangles identify spots that are common to both
native gel and immunoblot (panel label: -Ub)
2D-PAGE analysis of 24h treatment in serum-free (SF) or serum controls in
p53 (+/+) MEF: ubiquitination as monitored by immunoblotting
(panels: SF, Serum; axes: pH 3-10, MW; label: Poly-Ub)
2D-PAGE analysis of 24h treatment with 2.5 µM MeHg or lactacystin in
p53 (+/+) MEF (panels: MeHg, Lactacystin; label: Poly-Ub)
“Our Approach”
MS analysis
2D-GE modifications
Prediction of protein expression via virtual gel
Future developments
• Towards a gel-free approach
• Automation
• More prediction-based approaches
• Combinatorial functional genomics