Enhancing Sequence Coverage in Proteomics Studies by Using a Combination of Proteolytic Enzymes
Dominic Baeumlisberger2, Christopher Kurz3, Tabiwang N. Arrey1, Marion Rohmer2, Carola Schiller3, Thomas Moehring1, Walter A. Möller3 and Michael Karas2
1Thermo Fisher Scientific, Bremen, Germany, 2Institute for Pharmaceutical Chemistry, Goethe-University, Frankfurt am Main, Germany, 3Department of Pharmacology,
Goethe-University, Frankfurt am Main, Germany
Overview
Purpose: Increase sequence coverage and overall confidence of protein identification
using a combination of datasets from three enzyme digests.
Results: Combining datasets from multiple enzyme digests improved the
sequence coverage of proteins, increased the total number of unique peptides and
protein groups identified, and minimized false discovery rates.
Results
The Q Exactive™ mass spectrometer provides not only rich fragmentation but also
immonium ions, which are important for peptide correlation. Coupled with the high
resolution and high mass accuracy in both MS and MS/MS, this makes reliable identification
possible, which is especially important for peptides generated using less-specific
enzymes. Figure 1 shows triplicate runs of the individual enzyme digests. At the protein level,
reproducibility rates of 69.9%, 62.3% and 58.25% were obtained for trypsin, elastase and
chymotrypsin, respectively. At the peptide level, reproducibility decreased to 57%,
46.92% and 42.97%, respectively (see Figure 2).
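The poster does not state how the reproducibility rates were calculated. A minimal sketch, assuming the rate is the share of all identifications (the union over the three runs) that recur in every run, could look like the following. The function name and the accession lists are hypothetical and are not data from this study.

```python
def reproducibility(replicates):
    """Percentage of identifications found in every replicate, relative to the union.

    Assumed definition; the poster does not give its formula.
    """
    sets = [set(r) for r in replicates]
    union = set().union(*sets)
    if not union:
        return 0.0
    shared = set.intersection(*sets)
    return 100.0 * len(shared) / len(union)


# Hypothetical protein accessions from three replicate runs (not real data).
runs = [
    ["P1", "P2", "P3", "P4"],
    ["P1", "P2", "P3", "P5"],
    ["P1", "P2", "P4", "P5"],
]
rate = reproducibility(runs)
print(f"{rate:.1f}% of identifications were found in all three runs")
```

The same function applies at the peptide level by passing peptide sequences instead of protein accessions.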
Introduction
FIGURE 1. Proteins identified in triplicate experiments of each enzyme digest.
Besides being the main site of adenosine triphosphate (ATP) synthesis, mitochondria are
associated with a range of other processes and diseases such as cell growth, cellular
differentiation, mitochondrial disorders, aging processes and cardiac dysfunction. To
obtain a better understanding of these mitochondrial processes and diseases, we
need to identify the proteins and protein modifications involved.
The ability to identify and characterize large numbers of proteins from medium- to
high-complexity samples has made mass spectrometry (MS) coupled to reversed-phase
high-performance liquid chromatography (HPLC) a common analytical
technique in proteomics. Usually, the extracted proteins are digested with a suitable
protease and the resulting peptide mixture is separated and analyzed. Trypsin is the
most common enzyme of choice for proteomics experiments. Digestion with trypsin (or any
single enzyme in general) often results in the identification of large numbers of
proteins, but sequence coverage is frequently incomplete. If maximum sequence
coverage is desired (e.g. when studying changes in protein modification or different
isoforms), then signals covering all or most of the protein sequence are needed.
Different approaches have been used to improve protein sequence coverage in
proteomics. In this study, data obtained from individual trypsin, chymotrypsin and
elastase digests were combined to significantly improve the sequence coverage of
proteins.
A common phenomenon observed with peptides generated by less-specific
enzymes such as elastase is the absence of charge localization at either the N- or C-terminus.
Fragmentation of these peptides results in a lack of extended b- or y-ion series
and an increase in internal fragment ions. Here, owing to the basic moiety introduced by the
TMT0 label, extended b-ion series were generated. Figure 4 shows an example tandem mass spectrum of one such
peptide, AIQGGVLAGDVTDVLLLDVTPL, with a monoisotopic mass of 2408.38506.
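To make the b- and y-ion series mentioned above concrete, the sketch below computes singly charged b- and y-ion m/z values from standard monoisotopic residue masses. The function name and the `n_term_mod` parameter (a mass shift for an N-terminal label) are illustrative assumptions. The label mass is left at zero because the poster does not state the value used, so the numbers produced here are for the unlabeled peptide and are not meant to reproduce the annotated peaks in Figure 4.

```python
# Standard monoisotopic residue masses in Da.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276


def fragment_ions(peptide, n_term_mod=0.0):
    """Return (b, y) lists of singly charged m/z values: b1..b(n-1), y1..y(n-1).

    n_term_mod is a hypothetical mass shift for an N-terminal label.
    """
    masses = [MONO[aa] for aa in peptide]
    b_ions, y_ions = [], []
    running = n_term_mod
    for m in masses[:-1]:          # N-terminal fragments
        running += m
        b_ions.append(running + PROTON)
    running = WATER
    for m in reversed(masses[1:]):  # C-terminal fragments
        running += m
        y_ions.append(running + PROTON)
    return b_ions, y_ions


peptide = "AIQGGVLAGDVTDVLLLDVTPL"  # sequence from the Figure 4 caption
b_ions, y_ions = fragment_ions(peptide)
print("first b ions:", [round(mz, 4) for mz in b_ions[:3]])
print("first y ions:", [round(mz, 4) for mz in y_ions[:3]])
```

A label on the N-terminus shifts every b ion by the same amount and leaves the y ions unchanged, which is why it is modeled as a single offset.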
Sample Preparation
Liquid Chromatography
Samples were loaded onto a Thermo Scientific Acclaim PepMap100 C18 pre-column
(100 µm × 2 cm, C18, 5 µm, 100 Å) and separated on a reversed-phase Acclaim®
PepMap™100 C18 column (75 µm × 15 cm, C18, 3 µm, 120 Å) using the Thermo
Scientific EASY-nLC 1000 nanoflow HPLC. A 90 min gradient at a flow rate of
300 nL/min was used for the separation. Triplicate runs of the individual enzyme digests
were performed.
FIGURE 2. Peptides identified in triplicate experiments of each enzyme digest.
FIGURE 4. Annotated tandem mass spectrum of the peptide
AIQGGVLAGDVTDVLLLDVTPL generated from the elastase digest. b-/a-type ions
are shown in red and y-type ions in blue. The mass deviation of this
peptide was 0.01 ppm (IonScore: 136) in MS and below 10 ppm for fragment
ions in MS/MS.
FIGURE 7. Amino acid sequence of ATP synthase subunit beta showing the
sections of the protein that were identified, with known modifications
(from UniProt) annotated. Acetylation is represented by A and phosphorylation by P.
FIGURE 6. A) Sequence coverage achieved using different enzymes for the 453
amino acid protein cytochrome b-c1 complex subunit 2. Green represents the
sections of the protein that were identified and white the sections that were not
covered by any of the identified peptides. Relative to the single-enzyme digests,
the combined dataset increased sequence coverage by
7.3%, 45.2% and 56.4% for trypsin, elastase and chymotrypsin, respectively.
Combining all datasets, a net increase of 32.8% is obtained. B) Comparison of
sequence coverage from a single enzyme digest (trypsin) with that of the
combined dataset for the identified membrane proteins. Dark blue bars represent the
coverage obtained with trypsin alone and red bars the coverage from the sum of all enzymes
used.
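The per-enzyme percentages in the Figure 6 caption can be checked against the coverage values printed on the Figure 6A panels (trypsin 87.86%, elastase 64.90%, chymotrypsin 60.26%, all three enzymes 94.26%). The sketch below assumes that each reported increase is the relative gain of the combined coverage over the single-enzyme coverage. The poster does not state this definition, but it is consistent with the three per-enzyme figures.

```python
# Sequence coverage (%) as printed on the Figure 6A panel labels.
single_enzyme = {"trypsin": 87.86, "elastase": 64.90, "chymotrypsin": 60.26}
combined = 94.26


def relative_increase(before, after):
    """Relative gain of `after` over `before`, in percent (assumed definition)."""
    return 100.0 * (after - before) / before


increases = {
    enzyme: relative_increase(coverage, combined)
    for enzyme, coverage in single_enzyme.items()
}
for enzyme, gain in increases.items():
    print(f"{enzyme}: +{gain:.1f}%")
```

Note that these are relative gains, not differences in percentage points: trypsin alone already covers 87.86% of the sequence, so the remaining headroom is small.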
Figure 6A panel labels: Trypsin 87.86%; Chymotrypsin 60.26% (x-axis: residue position, 1 to 453)
Conclusion
The use of three different enzymes in proteomics studies increased the total
number of identified peptides by approximately 227.5% and the number of identified
protein groups by about 68.8%, on average.
The use of three different enzymes led to an average increase in protein
sequence coverage of about 31%.
The use of three different enzymes improved overall confidence in protein
identification.
The use of three different enzymes aided the study of changes in protein
sequences and post-translational modifications.
The high mass accuracy in both MS and MS/MS minimized the false discovery rate
(FDR).
In spite of the increase in sequence coverage with multiple enzyme digests, the
highest numbers of protein and peptide identifications for a single proteolytic digest
were obtained with trypsin.
Figure 6A panel label: All 3 enzymes 94.26%
Figure 4 spectrum: FTMS + p NSI d Full ms2, scan range m/z 120.00-2480.00; relative abundance versus m/z, with annotated a-, b- and y-type fragment ions.
The raw data files were searched using Thermo Scientific Proteome Discoverer
software v. 1.3 with the Mascot™ v. 2.2.1 search engine (Matrix Science Ltd, London, UK).
The peptide tolerance was set at 15 ppm for MS and 20 mmu for MS/MS. A high-confidence peptide filter with an FDR of 1% was used.
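The two tolerances use different units: ppm is relative to the mass being measured, while mmu (milli mass units, 0.001 Da) is absolute. A minimal sketch of both checks follows; the function names and the example values are hypothetical, and this is not how Mascot itself implements matching.

```python
def ppm_error(observed, theoretical):
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def within_precursor_tolerance(observed, theoretical, tol_ppm=15.0):
    """Relative check, as used here for MS (precursor) masses."""
    return abs(ppm_error(observed, theoretical)) <= tol_ppm


def within_fragment_tolerance(observed, theoretical, tol_mmu=20.0):
    """Absolute check, as used here for MS/MS fragments (1 mmu = 0.001 Da)."""
    return abs(observed - theoretical) * 1000.0 <= tol_mmu


# Hypothetical values, chosen only to illustrate the two units.
print(within_precursor_tolerance(1000.0100, 1000.0000))  # 10 ppm off
print(within_precursor_tolerance(1000.0200, 1000.0000))  # 20 ppm off
print(within_fragment_tolerance(500.0150, 500.0000))     # 15 mmu off
print(within_fragment_tolerance(500.0250, 500.0000))     # 25 mmu off
```

Because ppm scales with mass, a 15 ppm window is about 0.015 Da wide on each side at 1000 Da but only about 0.0075 Da at 500 Da, whereas the 20 mmu fragment window is the same width everywhere.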
The use of multiple enzyme digests in proteomic studies might enable proteolytic
cleavages at sites further away from modified residues, thereby overcoming
incomplete digestion caused by these protein modifications. For example, with a
combination of datasets, peptides covering almost all known modifications (present in
UniProt) of ATP synthase subunit beta were identified (Figure 7). This was not true
for all the identified proteins; nevertheless, a reasonable number of modified peptides
were identified. This shows that, to some extent, some portions of the proteome are
simply inaccessible following digestion with a single protease. Therefore, in
combination with technical replicates, multiple proteases can be used to significantly
improve the sequence coverage of proteins from a proteome and to increase
confidence in protein identification. In addition, proteins that were identified by only one
of the enzymes would have been missed had that enzyme not been used in this experiment.
Data Analysis
FIGURE 5. Total number of protein groups identified from triplicate runs of all
enzymes. The highest number of proteins was identified with trypsin.
Figure 6A panel label: Elastase 64.90%
Mass Spectrometry
All MS and MS/MS spectra were acquired in positive ion mode using a Thermo
Scientific Q Exactive hybrid quadrupole-Orbitrap mass spectrometer. Full-scan data
were obtained at a resolution of 70,000 (at m/z 200), with a target of 1e6 ions in the mass
range 350–1800 Da. For tandem MS, a target of 1e5 charges was used and the
fragment ions were measured at a resolution of 17,500 (at m/z 200). The 10 most
intense ions in a spectrum were selected for fragmentation, with a maximum
injection time of 200 ms.
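The "10 most intense ions" rule is a top-N precursor selection. As a rough sketch of that single step, the code below picks the N highest-intensity peaks from one survey scan. The peak list is synthetic and the function name is hypothetical. Real instrument logic also applies criteria the poster does not describe (for example exclusion of recently fragmented ions), which are deliberately left out.

```python
import heapq


def select_precursors(peaks, top_n=10):
    """Return the top_n (mz, intensity) peaks, most intense first."""
    return heapq.nlargest(top_n, peaks, key=lambda peak: peak[1])


# Synthetic survey scan with 12 peaks: (m/z, intensity).
peaks = [(400.0 + 10 * i, 1000.0 * i) for i in range(1, 13)]
selected = select_precursors(peaks)
for mz, intensity in selected:
    print(f"m/z {mz:.1f}  intensity {intensity:.0f}")
```

With 12 peaks in the scan, the two weakest are skipped and the remaining ten are returned in descending order of intensity.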
In total, 992 protein groups were identified across all enzyme digests, of which 18.25%
were mitochondrial membrane proteins. Approximately 33% of the total number of
identified proteins were present in the combined dataset (Figure 5). This not only led to
a significant increase in the number of protein groups identified but also enhanced the
overall sequence coverage. However, the sequence coverage varied from protein to
protein. For example, 100% or close to 100% sequence coverage was achieved for
small proteins (<100 amino acids) such as NADH dehydrogenase [ubiquinone] 1 alpha
subcomplex subunit or cytochrome b-c1 complex subunit, while for larger proteins such
as cytochrome b-c1 complex subunit 2 (>400 amino acids), shown in Figure
6, sequence coverage above 90% was obtained.
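Sequence coverage is the fraction of residues in a protein that fall inside at least one identified peptide, so combining digests amounts to taking the union of covered positions. The sketch below illustrates this with a made-up 22-residue sequence and made-up peptides; none of these are data from the study, and the function names are hypothetical.

```python
def covered_positions(protein, peptides):
    """Set of 0-based residue indices covered by any occurrence of any peptide."""
    covered = set()
    for pep in peptides:
        start = protein.find(pep)
        while start != -1:
            covered.update(range(start, start + len(pep)))
            start = protein.find(pep, start + 1)
    return covered


def coverage_percent(protein, peptides):
    """Sequence coverage in percent."""
    return 100.0 * len(covered_positions(protein, peptides)) / len(protein)


# Hypothetical protein and peptide identifications per digest.
protein = "MKTAYIAKQRQISFVKSHFSRQ"
digests = {
    "trypsin": ["TAYIAK", "QISFVK"],
    "chymotrypsin": ["IAKQRQISF"],
    "elastase": ["KSHFSRQ"],
}

for enzyme, peptides in digests.items():
    print(f"{enzyme}: {coverage_percent(protein, peptides):.1f}%")

all_peptides = [pep for peptides in digests.values() for pep in peptides]
combined = coverage_percent(protein, all_peptides)
print(f"combined: {combined:.1f}%")
```

Overlapping peptides from different enzymes count each residue once, so the combined coverage is never lower than the best single digest and is usually less than the sum of the individual values.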
Figure 6B legend: ΣCoverage; Coverage (Trypsin). Axis label: Sequence Coverage.
Methods
Purified mitochondrial membrane proteins from mouse brain were dissolved in 25 mM
triethylammonium bicarbonate buffer. Disulfide bridges were reduced with
dithiothreitol and alkylated with iodoacetamide, and the proteins were digested overnight with
trypsin, chymotrypsin or elastase in separate digests. Digestion was stopped by freezing at −20 °C. Just
before separation, each digest was labeled with the Thermo Scientific Amine-Reactive
Tandem Mass Tag (TMT0) Reagent to improve fragmentation, especially of the
elastase- and chymotrypsin-generated peptides.
FIGURE 3. Venn diagram showing the unique peptides identified from triplicate
experiments in all three enzyme digests. As expected, no identified peptide was
common to all three enzyme preparations.
Methods: Peptides generated by proteolytic digestion of mitochondrial membrane
proteins were analyzed using a hybrid quadrupole-Orbitrap™ mass spectrometer.
In total, 12,007 peptides were identified from the combined triplicate datasets of the three enzyme digests.
As expected, no peptide common to all three enzyme digests was
identified, and less than 1% of the identified peptides were found in two
enzyme digests. As shown in Figure 3, mostly unique peptides were identified, and the
common peptide sequences in most cases cover regions that could not be identified
by one enzyme digest. While the peptides shared between trypsin/chymotrypsin and
trypsin/elastase almost all had R or K at their C-termini, 54.05% of
those shared between chymotrypsin and elastase ended outside the defined cleavage
sites (Y, W, F, M, L) of chymotrypsin. Most of these peptides have A, V, L or S at their
C-termini, typical cleavage sites for elastase.
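The C-terminal analysis above can be sketched as a lookup of each peptide's last residue against the cleavage sites named in the text (R, K for trypsin; Y, W, F, M, L for chymotrypsin; A, V, L, S for elastase). The peptide list and function names below are hypothetical. The check looks only at the peptide's own C-terminus and ignores special cases such as peptides from the protein's C-terminal end.

```python
# Cleavage sites as listed in the text above.
CLEAVAGE_SITES = {
    "trypsin": set("RK"),
    "chymotrypsin": set("YWFML"),
    "elastase": set("AVLS"),
}


def enzymes_matching_c_terminus(peptide):
    """Enzymes whose listed cleavage sites include the peptide's last residue."""
    last = peptide[-1]
    return [name for name, sites in CLEAVAGE_SITES.items() if last in sites]


def percent_outside(peptides, enzyme):
    """Share of peptides whose C-terminal residue is not a listed site for enzyme."""
    outside = [p for p in peptides if p[-1] not in CLEAVAGE_SITES[enzyme]]
    return 100.0 * len(outside) / len(peptides)


# Hypothetical peptides shared between two digests (not data from the study).
shared = ["GDVTDVLLL", "AIQGGV", "TPEELGA", "NDLKF"]
for pep in shared:
    print(pep, "->", enzymes_matching_c_terminus(pep))
print(f"outside chymotrypsin sites: {percent_outside(shared, 'chymotrypsin'):.1f}%")
```

Leucine appears in both the chymotrypsin and elastase lists, so a peptide ending in L is compatible with either enzyme and does not count as outside the chymotrypsin sites.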
Figure 6B x-axis: Total number of identified membrane proteins
Mascot is a registered product of Matrix Science Ltd. All other trademarks are the property of Thermo Fisher
Scientific and its subsidiaries.
This information is not intended to encourage use of these products in any manner that might infringe the
intellectual property rights of others.