In-depth Analysis of Protein Amino Acid Sequence and PTMs with High-resolution Mass Spectrometry
Lian Yang (2); Baozhen Shan (1); Bin Ma (2)
(1) Bioinformatics Solutions Inc., Canada
(2) University of Waterloo, Canada
Protein sequence analysis
• Problem
• Complete protein sequence coverage is needed for
o antibody confirmation
o biomarker discovery
• Database search software alone is insufficient
Protein sequence analysis
• Possible reasons for incomplete coverage
• “non-database” peptides
o unexpected modifications
o mutated residues
o novel peptides
• database errors
• Meanwhile
• A large number of high-quality spectra remain unmatched.
Proposed workflow for in-depth analysis
• A workflow to identify both the database and “non-database” peptides
• Objective
• Maximize protein sequence coverage
• Explain more high-quality MS/MS spectra
Proposed workflow for in-depth analysis
• Workflow: multiple enzymes
• Multiple protein digests with different enzymes
• High-accuracy MS for both precursor and fragment ions
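To illustrate why several digests help, here is a minimal sketch of an in-silico digest. The cleavage rules are simplified assumptions (trypsin after K/R unless followed by P, LysC after K, GluC after E), the toy sequence is hypothetical, and the function name `digest` is ours; it is not how PEAKS models enzymes.

```python
# Simplified cleavage rules (an assumption for illustration, not the PEAKS settings):
# each enzyme cuts on the C-terminal side of the listed residues;
# trypsin is blocked when the next residue is proline.
RULES = {
    "Trypsin": ("KR", "P"),
    "LysC": ("K", ""),
    "GluC": ("E", ""),
}


def digest(sequence, enzyme):
    """Return the peptides of a complete digest (no missed cleavages)."""
    cut_after, blocked_by = RULES[enzyme]
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        blocked = next_residue != "" and next_residue in blocked_by
        if residue in cut_after and not blocked:
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        # Keep the C-terminal remainder that does not end at a cleavage site.
        peptides.append(sequence[start:])
    return peptides


toy_protein = "MKWVTFRPEAK"  # hypothetical sequence, not BSA
for enzyme in RULES:
    print(enzyme, digest(toy_protein, enzyme))
```

Different enzymes cut the same sequence into different peptides, so a region that yields an awkward peptide in one digest may yield a readily identified one in another.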
Proposed workflow for in-depth analysis
• Workflow
• Identify de novo sequence tags
• Reveal a set of high-quality spectra
PEAKS: powerful software for peptide de novo sequencing by tandem mass spectrometry.
Rapid Commun Mass Spectrom. 2003;17(20):2337-42.
Proposed workflow for in-depth analysis
• Workflow
• Identify database peptides
• Database search results are validated by de novo tags
• Reveal a set of confident proteins
PEAKS DB: De Novo sequencing assisted database search for sensitive and accurate peptide identification.
Mol Cell Proteomics. 2012;11(4):M111.010587.
Proposed workflow for in-depth analysis
• Workflow
• For input spectra with
+ highly confident de novo tags
- no significant database matches
• Identify peptides with unexpected modifications
• Peptides from the set of confident proteins are “modified” in silico by trying all possible modifications in Unimod.
• Speed up by de novo tags
PeaksPTM: Mass spectrometry-based identification of peptides with unspecified modifications.
J Proteome Res. 2011;10(7):2930-2936.
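The idea of “modifying” a peptide in silico can be sketched as a mass-shift lookup. This is an illustrative sketch only, not the PeaksPTM algorithm: it checks precursor mass alone, uses a three-entry modification table instead of the full Unimod list, and the names `peptide_mass` and `explain_mass_shift` and the peptide `AMK` are hypothetical.

```python
# Monoisotopic residue masses (Da) for the residues used in this toy example.
RESIDUE_MASS = {
    "A": 71.03711, "C": 103.00919, "K": 128.09496,
    "M": 131.04049, "N": 114.04293, "Q": 128.05858,
}
WATER = 18.010565

# A tiny hand-picked subset of modifications: name -> (mass shift in Da, target residues).
MODS = {
    "Carbamidomethylation": (57.021464, "C"),
    "Oxidation": (15.994915, "M"),
    "Deamidation": (0.984016, "NQ"),
}


def peptide_mass(peptide):
    """Monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def explain_mass_shift(peptide, observed_mass, tol_ppm=10.0):
    """List (modification, 1-based position) pairs whose shift explains the mass difference."""
    delta = observed_mass - peptide_mass(peptide)
    tol = observed_mass * tol_ppm * 1e-6
    hits = []
    for name, (shift, targets) in MODS.items():
        if abs(delta - shift) <= tol:
            hits += [(name, pos) for pos, aa in enumerate(peptide, 1) if aa in targets]
    return hits


observed = peptide_mass("AMK") + 15.994915  # simulate an oxidized methionine
print(explain_mass_shift("AMK", observed))
```

A real search would also score each candidate against the fragment ions; restricting candidates to peptides consistent with a de novo tag is what keeps the search over all modifications tractable.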
Proposed workflow for in-depth analysis
• Workflow
• For input spectra with
+ highly confident de novo tags
- no significant database matches
• Identify peptides with mutations, such as residue insertion, deletion, and substitution.
• Screen the protein database to find short sequences similar to de novo tags
• Use both the de novo tags and the database sequences to reconstruct the most probable sequences that match the spectrum
SPIDER: software for protein identification from sequence tags with de novo sequencing error.
J Bioinform Comput Biol. 2005 Jun;3(3):697-716.
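The screening step can be pictured as approximate substring matching. The sketch below uses a plain edit distance with a free start and end in the protein, and treats I and L as identical because they are isobaric. It is a swapped-in textbook technique for illustration, not the SPIDER scoring model; `best_match` and the toy sequences are hypothetical.

```python
def best_match(tag, protein):
    """Minimum edit distance between `tag` and any substring of `protein`.

    Returns (distance, end), where `end` is the 1-based protein position
    at which the best-matching substring ends.
    """
    # I and L have the same mass, so de novo sequencing cannot tell them apart.
    tag = tag.replace("I", "L")
    protein = protein.replace("I", "L")
    prev = [0] * (len(protein) + 1)  # row 0: a match may start anywhere for free
    for i, t in enumerate(tag, 1):
        cur = [i] + [0] * len(protein)
        for j, p in enumerate(protein, 1):
            substitution = prev[j - 1] + (0 if t == p else 1)
            tag_only = prev[j] + 1      # residue present in the tag only
            protein_only = cur[j - 1] + 1  # residue present in the protein only
            cur[j] = min(substitution, tag_only, protein_only)
        prev = cur
    end = min(range(len(prev)), key=prev.__getitem__)
    return prev[end], end


toy_protein = "KKQTALVELLKHK"  # hypothetical database sequence
print(best_match("ALVELLK", toy_protein))  # exact tag
print(best_match("ATVELLK", toy_protein))  # tag with one substituted residue
```

A tag with a small non-zero distance to a database region is a candidate for a mutated peptide, which is then re-scored against the spectrum.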
Proposed workflow for in-depth analysis
• Workflow
• Unassigned de novo sequence tags are reported as possible novel peptides
Proposed workflow for in-depth analysis
• Result integration
In-depth analysis of BSA
Test the workflow with standard bovine serum albumin (BSA)
• Sample
• Pure ALBU_BOVIN from SIGMA
• 3 digests with trypsin, LysC, and GluC
• LC-MS/MS with Thermo LTQ-Orbitrap XL
• Workflow
• Workflow implemented in PEAKS 6
• 3 digests in one project
• Searched database: Swiss-Prot
Result
• More PSMs are identified in each additional step:
[Figure: counts along the workflow: 5,152 MS/MS spectra; 1,737 PSMs; 906 PSMs; 44 PSMs; 38 MS/MS spectra]
• Filtered at 1% FDR: 1,737 -> 2,687 PSMs
• PEAKS ALC score > 70%
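The reported totals are consistent with each other. The short check below assumes that the 906 and 44 PSMs are the gains from the two search steps that follow the initial database search; the slide does not label them explicitly.

```python
db_search_psms = 1737
additional_psms = [906, 44]  # assumed: gains from the two later search steps

total = db_search_psms + sum(additional_psms)
gain = (total - db_search_psms) / db_search_psms

print(total)          # total PSMs after all steps
print(f"{gain:.1%}")  # relative increase over database search alone
```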
Result
• BSA coverage
[Bar chart: BSA sequence coverage, axis from 82% to 98%]
Trypsin + PEAKS DB: 87%
Proposed workflow: 96%
The uncovered 4% is in the protein N-terminal region, which was most likely cleaved off and is not present in the purchased sample (1).
(1) Specific binding site (Asp-Thr-His-Lys) for Cu(II) ions. T. Peters Jr., F.A. Blumenstock. J. Biol. Chem., 242 (1967), p. 1574
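Sequence coverage here means the fraction of protein residues that fall inside at least one identified peptide. A minimal sketch, with a hypothetical toy sequence and peptide lists rather than the BSA data:

```python
def sequence_coverage(protein, peptides):
    """Fraction of residues in `protein` covered by at least one peptide."""
    covered = [False] * len(protein)
    for pep in peptides:
        if not pep:
            continue
        start = protein.find(pep)
        while start != -1:  # mark every occurrence of the peptide
            for i in range(start, start + len(pep)):
                covered[i] = True
            start = protein.find(pep, start + 1)
    return sum(covered) / len(protein)


toy_protein = "MKWVTFRPEAK"       # hypothetical sequence, not BSA
one_digest = ["WVTFRPEAK"]        # peptide identified in a single digest
two_digests = one_digest + ["MKWVTFRPE"]  # plus a peptide from a second enzyme

print(f"{sequence_coverage(toy_protein, one_digest):.0%}")
print(f"{sequence_coverage(toy_protein, two_digests):.0%}")
```

Overlapping peptides from different digests count each residue once, so coverage rises only where a new peptide reaches residues that were not yet covered.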
Result
• Contaminants
• Identified with at least 3 unique peptides.
– Human keratin proteins (K2C1_HUMAN and K1C_HUMAN)
– Bacteria protein (SSPA_STAAR)
– Trypsin (TRY1_BOVIN)
Result
• PTMs
• Unsuspected modifications identified by PTM search
– Three PTMs specified in database search
» Carbamidomethylation (C)
» Oxidation (M)
» Deamidation (NQ)
Result
• Mutation
• 214th amino acid A -> T
• Brown 1975, Fed. Proc. 34:591
Result
• Unexplained de novo tags
• Might be…
– Novel peptides outside of the searched database
KK.QTALVELLK.HK
     |||||||
   DPALVELLKK
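The alignment above can be reproduced by finding the longest stretch the two sequences share. This sketch assumes the upper line is a database peptide (flanking residues removed) and the lower line is the de novo tag; the slide does not label them. It uses Python's standard `difflib` module rather than any PEAKS routine.

```python
from difflib import SequenceMatcher

first = "QTALVELLK"    # assumed: database peptide, without the flanking KK. and .HK
second = "DPALVELLKK"  # assumed: de novo tag

matcher = SequenceMatcher(None, first, second, autojunk=False)
m = matcher.find_longest_match(0, len(first), 0, len(second))
shared = first[m.a:m.a + m.size]

print(shared, m.size)
```

The shared stretch corresponds to the seven matched positions marked in the alignment; the residues around it differ, which is why the tag is not explained by the database peptide.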
Summary
• A software workflow is proposed for in-depth protein sequence analysis
• Found many things in a “pure” sample
– Contaminants
– Unsuspected PTMs
– Mutations
• Improved protein sequence coverage
– BSA coverage: 87% -> 96%
• Explained more high-quality MS/MS spectra
– Identified MS/MS spectra: 1,737 -> 2,687
Q/A