In-depth Analysis of Protein Amino Acid Sequence and PTMs with High-resolution Mass Spectrometry
Lian Yang2; Baozhen Shan1; Bin Ma2
1Bioinformatics Solutions Inc, Canada
2University of Waterloo, Canada

Protein sequence analysis
• Problem
  • Complete protein sequence coverage
    o antibody confirmation
    o biomarker discovery
  • Database search software alone is insufficient
• Possible reasons for incomplete coverage
  • “non-database” peptides
    o unexpected modifications
    o mutated residues
    o novel peptides
  • database errors
• Meanwhile
  • A large number of high-quality spectra are not matched.

Proposed workflow for in-depth analysis
• A workflow to identify both the database and “non-database” peptides
• Objective
  • Maximize protein sequence coverage
  • Explain more high-quality MS/MS spectra
• Workflow
  • Multiple enzymes
    o Multiple protein digests with different enzymes
    o High-accuracy MS for both precursor and fragment ions
  • De novo sequencing
    o Identify de novo sequence tags
    o Reveal a set of high-quality spectra
    o PEAKS: powerful software for peptide de novo sequencing by tandem mass spectrometry. Rapid Commun Mass Spectrom. 2003;17(20):2337-42.
  • Database search
    o Identify database peptides
    o Database search result validated by de novo tags
    o Reveal a set of confident proteins
    o PEAKS DB: De Novo sequencing assisted database search for sensitive and accurate peptide identification. Mol Cell Proteomics 2012; 11:10.1074, 1–8.
  • PTM search, for input spectra with highly confident de novo tags but no significant database matches
    o Identify peptides with unexpected modifications
    o Peptides from the set of confident proteins are “modified” in silico by trying all possible modifications in UNIMOD
    o Speed up by de novo tags
    o PeaksPTM: Mass spectrometry-based identification of peptides with unspecified modifications. Journal of Proteome Research 10.7 (2011): 2930-2936.
  • Mutation search, for input spectra with highly confident de novo tags but no significant database matches
    o Identify peptides with mutations, such as residue insertion, deletion, and substitution
    o Screen the protein database to find short sequences similar to de novo tags
    o Use both the de novo tags and the database sequence to reconstruct the most probable sequences that match the spectrum
    o SPIDER: software for protein identification from sequence tags with de novo sequencing error. J Bioinform Comput Biol. 2005 Jun;3(3):697-716.
  • Novel peptides
    o Unassigned de novo sequence tags are reported as possible novel peptides
  • Result integration

In-depth analysis of BSA
Test the workflow with standard bovine serum albumin
• Sample
  • Pure ALBU_BOVIN from SIGMA
  • 3 digests with Trypsin, LysC, GluC
  • LC-MS/MS with Thermo LTQ-Orbitrap XL
• Workflow
  • Workflow implemented in PEAKS 6
  • 3 digests in one project
  • Searched database: Swiss-Prot

Result
• More PSMs are identified in each additional step
  [Figure: workflow diagram annotated with counts, in order of appearance: 5,152 MS/MS spectra; 1,737 PSMs; 906 PSMs; 44 PSMs; 38 MS/MS spectra. Annotations: filtered at 1% FDR; PEAKS ALC score > 70%.]
  • Total: 1,737 -> 2,687 PSMs

Result
• BSA coverage
  [Figure: bar chart of BSA sequence coverage. Trypsin + PEAKS DB: 87%; proposed workflow: 96%.]
  • The uncovered 4% is in the protein N-terminal region, which is most likely cleaved off and not present in the purchased sample [1].
  [1] Specific binding site (Asp-Thr-His-Lys) for Cu(II) ions. T. Peters Jr., F.A. Blumenstock. J. Biol. Chem., 242 (1967), p. 1574

Result
• Contaminants
  • Identified with at least 3 unique peptides
    – Human keratin proteins (K2C1_HUMAN and K1C_HUMAN)
    – Bacterial protein (SSPA_STAAR)
    – Trypsin (TRY1_BOVIN)

Result
• PTMs
  • Unsuspected modifications identified by PTM search
    – Three PTMs specified in database search
      » Carbamidomethylation (C)
      » Oxidation (M)
      » Deamidation (NQ)

Result
• Mutation
  • 214th amino acid: A -> T
  • Brown 1975, Fed. Proc. 34:591

Result
• Unexplained de novo tags
  • Might be…
    – Novel peptides outside of the searched database

      KK.QTALVELLK.HK
           |||||||
         DPALVELLKK

Summary
• A software workflow proposed for in-depth protein sequence analysis
• Found many things in a “pure” sample
  – Contaminants
  – Unsuspected PTMs
  – Mutations
• Improved protein sequence coverage
  – BSA coverage: 87% -> 96%
• Explained more high-quality MS/MS spectra
  – Identified MS/MS spectra: 1,737 -> 2,687

Q/A
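
Supplementary sketch: why several digests raise sequence coverage

The slides report coverage rising when Trypsin, LysC, and GluC digests are combined in one project. The Python sketch below illustrates the idea only; it is not how PEAKS computes coverage. Everything in it is an assumption made for illustration: the cleavage rules are simplified (cut after K/R for Trypsin, after K for LysC, after E for GluC, no missed cleavages, no proline exception), the 4 to 25 residue length window is a crude stand-in for which peptides are detectable, and `TOY_PROTEIN` is a made-up sequence, not BSA.

```python
# Simplified cleavage rules (illustrative assumptions, not full enzyme specificity).
CLEAVE_AFTER = {"Trypsin": "KR", "LysC": "K", "GluC": "E"}


def digest(protein, enzyme, min_len=4, max_len=25):
    """Cut after every cleavage residue; keep peptides inside a toy length window."""
    peptides, start = [], 0
    for i, residue in enumerate(protein):
        if residue in CLEAVE_AFTER[enzyme]:
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])  # C-terminal remainder
    return [p for p in peptides if min_len <= len(p) <= max_len]


def coverage(protein, peptides):
    """Fraction of protein residues covered by at least one peptide."""
    covered = [False] * len(protein)
    for pep in peptides:
        if not pep:
            continue
        pos = protein.find(pep)
        while pos != -1:  # mark every occurrence of the peptide
            covered[pos:pos + len(pep)] = [True] * len(pep)
            pos = protein.find(pep, pos + 1)
    return sum(covered) / len(protein)


# Hypothetical 31-residue sequence, chosen so each enzyme misses a different region.
TOY_PROTEIN = "AEGKRSKDLFEQTWVNPLAGHIEYMCDSAVR"

per_enzyme = {name: digest(TOY_PROTEIN, name) for name in CLEAVE_AFTER}
combined = [pep for peps in per_enzyme.values() for pep in peps]

for name, peps in per_enzyme.items():
    print(f"{name}: {coverage(TOY_PROTEIN, peps):.0%}")
print(f"Combined: {coverage(TOY_PROTEIN, combined):.0%}")
```

On this toy sequence the Trypsin digest alone covers 28 of 31 residues, because the peptides from the K/R-dense stretch are too short for the length window. The GluC peptides span that stretch, so the combined set covers every residue.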