Download A De Novo Approach for Untargeted Post-Translational Modification Prediction Using Tandem Mass Spectrometry and Integer Linear Optimization

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Proteomics wikipedia , lookup

Protein structure prediction wikipedia , lookup

Homology modeling wikipedia , lookup

Ribosomally synthesized and post-translationally modified peptides wikipedia , lookup

Protein mass spectrometry wikipedia , lookup

Transcript
A De Novo Approach for Untargeted Post-Translational
Modification Prediction Using Tandem Mass Spectrometry
and Integer Linear Optimization
Richard C. Baliban*, Peter A. DiMaggio, Jr., and Christodoulos A. Floudas
Department of Chemical Engineering, Princeton University
Nicolas L. Young, Mariana D. Plazas-Mayorca, Benjamin A. Garcia
Department of Molecular Biology, Princeton University
Poster (2:20 PM)
Classification of the post-translational modifications (PTMs) of various organisms is
currently a major challenge in the field of proteomics. To circumvent the large complexity
associated with variable modification identification, an upper bound is often assigned for the
total number of (a) variable modification types or (b) variable modification sites. However,
thorough determination of a dynamic proteome requires an unbiased analysis of any and all
modifications that can occur. We have developed PILOT_PTM, which is a novel de novo
method for untargeted PTM Prediction via Integer Linear Optimization and Tandem mass
spectrometry (MS/MS). Given a template amino acid sequence, the method can predict the
modifications at all positions on the peptide sequence. That is, our model will assume that all
template positions can be post-translationally modified and will seek to determine the optimal set
of modifications among a universal list based on the MS/MS data. The method rigorously
guarantees the optimal set of modifications without having to enumerate all combinations of
possible modifications.
To verify the capability of the method across distinct data types, PILOT_PTM was
tested on several data sets including (a) unmodified peptides fragmented via Collision Induced
Dissociation (CID) on (i) ion trap, (ii) Q-TOF, and (iii) Orbitrap mass spectrometers, (b)
chemically synthesized phosphopeptides fragmented via Electron Transfer Dissociation (ETD),
(c) the 1-50 N-terminal tail of the histone H3 protein fragmented via Electron Collision
Dissociation (ECD), and (d) propionylated H3 peptides fragmented via CID. The results of each
data set were quantified using several metrics including the overall residue prediction accuracy,
the complete peptide prediction accuracy, and the subsequence accuracy. To benchmark the
capability of the method, the results of the (d) propionylated H3 data set were compared with
those acquired from five state-of-the-art algorithms including three hybrid sequence tag/database
approaches (InsPecT, Modi, and VEMS) and two pure database approaches (Mascot and
X!Tandem). PILOT_PTM demonstrates superior prediction accuracy for all scoring metrics
and consistently outperforms the top hybrid and database algorithms.