Supplementary Materials and Methods
1 Data analysis
1.1 The latest Raught lab spectral library
1.1.1 Standard database search
All non-decoy spectra were searched against a concatenated target/decoy database consisting
of forward and reversed versions of IPI human v3.82 (184,208 sequences in total) using X!Tandem
(CYCLONE 2011.12.1) with the following parameters: (1) [-10, 10] ppm and [-0.4, 0.4] Da for
peptide- and fragment-mass tolerance; (2) trypsin cleavage at both termini and two missed cleavages
allowed; (3) 15.994915@M and 57.021464@C as variable and fixed modifications, respectively.
The search results were then validated using PeptideProphet and ProteinProphet in TPP v4.5.
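The concatenated target/decoy database described above can be illustrated with a minimal Python sketch. It assumes that the decoys are plain reversals of each target sequence; the "DECOY_" header prefix and the function names are hypothetical and are not taken from the original pipeline.

```python
def read_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records, header, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def add_reversed_decoys(records, decoy_prefix="DECOY_"):
    """Return the target records followed by one reversed decoy per target.

    The prefix is an illustrative assumption, not the naming used in the study.
    """
    decoys = [(decoy_prefix + header, seq[::-1]) for header, seq in records]
    return records + decoys


example = ">P1 toy protein\nMQIFVK\nTLTGK\n>P2 another toy\nACDEK\n"
database = add_reversed_decoys(read_fasta(example))
for header, seq in database:
    print(f">{header}\n{seq}")
```

The concatenated database has twice as many entries as the forward database, which is consistent with the sequence total being reported for target and decoy entries together.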
1.1.2 Identification of Ub/Ubl conjugation sites using the proposed workflow
Create a combinatorial database
Target protein sequences were processed separately for each Ub/Ubl using MchopNSpice with the
following parameters: spice species was H. sapiens; spice site was KX; spice mode was once per
fragment; include unmodified fragments in output; allow up to 2 protein miscleavages; allow up to 0
miscleavage in the “spice sequence”; output formatting was FASTA (single protein sequence);
mark all cleaved sites (“J”); retain comments in FASTA format without line breaks in FASTA
output. Lys-C was selected as the enzyme for Ub, NEDD8, ISG15 and ATG8. Trypsin (Lys/Arg,
do not cleave at Pro) was selected as the enzyme for SUMO1, SUMO2, SUMO3 and FAT10. For each
Ub/Ubl, the spice sequence was the sequence of that Ub/Ubl itself.
Non-target protein sequences were digested using MchopNSpice with the following
parameters: spice species was none; spice mode was once per fragment; include unmodified
fragments in output; enzyme was trypsin (Lys/Arg, do not cleave at Pro); allow up to 2 protein
miscleavages; output formatting was FASTA (single protein sequence); mark all cleaved sites
(“J”); retain comments in FASTA format without line breaks in FASTA output.
A combinatorial FASTA database was created by combining the modified target protein
sequences and the digested non-target protein sequences.
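The digestion settings above (cleavage after Lys/Arg, no cleavage before Pro, up to 2 miscleavages) can be sketched in Python as follows. This is only an illustration of the enzyme rule and the miscleavage enumeration; it is not the MchopNSpice implementation, it does not attach any spice sequence, and it does not reproduce the “J” marking of cleaved sites. The function names are hypothetical.

```python
def cleavage_points(seq, residues="KR", blocked_by="P"):
    """Indices at which the sequence is cut (cut falls after seq[i - 1])."""
    points = []
    for i, aa in enumerate(seq[:-1]):
        # Cleave after a target residue unless the next residue blocks it.
        if aa in residues and seq[i + 1] not in blocked_by:
            points.append(i + 1)
    return points


def digest(seq, residues="KR", blocked_by="P", max_missed=2):
    """Enumerate peptides containing at most max_missed internal cleavage sites."""
    bounds = [0] + cleavage_points(seq, residues, blocked_by) + [len(seq)]
    peptides = []
    for start in range(len(bounds) - 1):
        last = min(start + max_missed + 2, len(bounds))
        for end in range(start + 1, last):
            peptides.append(seq[bounds[start]:bounds[end]])
    return peptides


# Toy sequence used only for illustration.
for peptide in digest("MQIFVKTLTGKTITLEVEPSDTIENVK", max_missed=2):
    print(peptide)
```

A Lys-C-like rule could be approximated by calling `digest(seq, residues="K", blocked_by="")`; whether Lys-C cleavage before Pro should be blocked is a choice left open here.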
Identification of Ub/Ubl conjugation sites using UblSearch
All non-decoy spectra were resubmitted to search against the created combinatorial database
using UblSearch with the following parameters: (1) [-10, 10] ppm and [-0.4, 0.4] Da for peptide- and
fragment-mass tolerance; (2) [X]|[J] as cleavage site and 0 missed cleavages allowed; (3)
15.994915@M and 0.0000001@K as variable modifications, 57.021464@C as fixed modification.
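The two tolerance windows above can be read as simple interval checks on the mass error. The sketch below assumes the error is defined as observed minus theoretical mass; that sign convention and the function names are assumptions made for illustration and are not taken from UblSearch.

```python
def within_ppm(observed, theoretical, low=-10.0, high=10.0):
    """True if the relative mass error, in parts per million, lies in [low, high]."""
    error_ppm = (observed - theoretical) / theoretical * 1e6
    return low <= error_ppm <= high


def within_da(observed, theoretical, low=-0.4, high=0.4):
    """True if the absolute mass error, in Da, lies in [low, high]."""
    return low <= observed - theoretical <= high


print(within_ppm(1000.005, 1000.0))  # 5 ppm error, inside [-10, 10] ppm
print(within_ppm(1000.020, 1000.0))  # 20 ppm error, outside the window
print(within_da(500.3, 500.0))       # 0.3 Da error, inside [-0.4, 0.4] Da
```

An asymmetric window such as [-2, 4] Da is handled by passing `low=-2.0, high=4.0` to `within_da`.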
1.1.3 Identification of Ub/Ubl conjugation sites using the ChopNSpice method
Create a modified database
Target protein sequences were processed separately for each Ub/Ubl using ChopNSpice with the
following parameters: spice species was H. sapiens; spice site was KX; spice mode was once per
fragment; include unmodified fragments in output; enzyme was trypsin (Lys/Arg, do not cleave at Pro);
allow up to 2 protein miscleavages; allow up to 0 miscleavage in the “spice sequence”; output
formatting was FASTA (single protein sequence); mark all cleaved sites (“J”); retain comments in
FASTA format without line breaks in FASTA output. Lys-C was selected as the enzyme for Ub,
NEDD8, ISG15 and ATG8. Trypsin (Lys/Arg, do not cleave at Pro) was selected as the enzyme
for SUMO1, SUMO2, SUMO3 and FAT10. For each Ub/Ubl, the spice sequence was the sequence of
that Ub/Ubl itself.
Identification of Ub/Ubl conjugation sites using X!Tandem
All non-decoy spectra were resubmitted to search against the modified database using
X!Tandem with the following parameters: (1) [-10, 10] ppm and [-0.4, 0.4] Da for peptide- and
fragment-mass tolerance; (2) [X]|[J] as cleavage site and 0 missed cleavages allowed; (3)
15.994915@M and 57.021464@C as variable and fixed modifications, respectively.
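For readers who want to reproduce these settings, the parameters above might be written to an X!Tandem input file roughly as sketched below. The sketch assumes the usual X!Tandem input format (a bioml root element containing note elements with type="input"), and the parameter labels are given as recalled from the X!Tandem documentation; they should be checked against the release actually used. The input spectra, taxonomy and output paths are omitted, so this is a fragment and not a complete input file.

```python
import xml.etree.ElementTree as ET

# Search settings from the text; labels are assumed X!Tandem parameter names.
params = {
    "spectrum, parent monoisotopic mass error minus": "10",
    "spectrum, parent monoisotopic mass error plus": "10",
    "spectrum, parent monoisotopic mass error units": "ppm",
    "spectrum, fragment monoisotopic mass error": "0.4",
    "spectrum, fragment monoisotopic mass error units": "Daltons",
    "protein, cleavage site": "[X]|[J]",
    "scoring, maximum missed cleavage sites": "0",
    "residue, modification mass": "57.021464@C",
    "residue, potential modification mass": "15.994915@M",
}


def build_bioml(settings):
    """Build a bioml element with one input note per setting."""
    root = ET.Element("bioml")
    for label, value in settings.items():
        note = ET.SubElement(root, "note", {"type": "input", "label": label})
        note.text = value
    return root


xml_text = ET.tostring(build_bioml(params), encoding="unicode")
print(xml_text)
```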
1.2 Trypanosoma cruzi experimental dataset
1.2.1 Standard database search
The Trypanosoma cruzi experimental dataset was searched using X!Tandem with the
following parameters: (1) [-2, 4] Da and [-0.4, 0.4] Da for peptide- and fragment-mass tolerance; (2)
trypsin cleavage at both termini and two missed cleavages allowed; (3) 15.994915@M and
57.021464@C as variable and fixed modifications, respectively; (4) a FASTA database of
Trypanosoma cruzi (target+decoy, 91,648 sequences in total).
1.2.2 Identification of Ub/Ubl conjugation sites using the proposed workflow
Create a combinatorial database
Target protein sequences were processed using MchopNSpice with the following parameters:
spice species was custom; spice site was KX; spice sequence was
TPQELGMEDDDVIDAMVEQTGG; spice mode was once per fragment; include unmodified
fragments in output; enzyme was trypsin (Lys/Arg, do not cleave at Pro); allow up to 2 protein
miscleavages; allow up to 0 miscleavage in the “spice sequence”; output formatting was FASTA
(single protein sequence); mark all cleaved sites (“J”); retain comments in FASTA format without
line breaks in FASTA output.
Non-target protein sequences were digested using MchopNSpice with the following
parameters: spice species was none; spice mode was once per fragment; include unmodified
fragments in output; enzyme was trypsin (Lys/Arg, do not cleave at Pro); allow up to 2 protein
miscleavages; output formatting was FASTA (single protein sequence); mark all cleaved sites
(“J”); retain comments in FASTA format without line breaks in FASTA output.
A combinatorial FASTA database was created by combining the modified target protein
sequences and the digested non-target protein sequences.
Identification of Ub/Ubl conjugation sites using UblSearch
All MS/MS spectra were resubmitted to search against the created combinatorial database
using UblSearch with the following parameters: (1) [-2, 4] Da and [-0.4, 0.4] Da for peptide- and
fragment-mass tolerance; (2) [X]|[J] as cleavage site and 0 missed cleavages allowed; (3)
15.994915@M and 0.0000001@K as variable modifications, 57.021464@C as fixed modification.
1.2.3 Identification of Ub/Ubl conjugation sites using the ChopNSpice method
Create a modified database
Target protein sequences were processed using ChopNSpice with the following parameters:
spice species was custom; spice site was KX; spice sequence was
TPQELGMEDDDVIDAMVEQTGG; spice mode was once per fragment; include unmodified
fragments in output; enzyme was trypsin (Lys/Arg, do not cleave at Pro); allow up to 2 protein
miscleavages; allow up to 0 miscleavage in the “spice sequence”; output formatting was FASTA
(single protein sequence); mark all cleaved sites (“J”); retain comments in FASTA format without
line breaks in FASTA output.
Identification of Ub/Ubl conjugation sites using X!Tandem
All MS/MS spectra were resubmitted to search against the modified database using X!Tandem
with the following parameters: (1) [-2, 4] Da and [-0.4, 0.4] Da for peptide- and fragment-mass
tolerance; (2) [X]|[J] as cleavage site and 0 missed cleavages allowed; (3) 15.994915@M and
57.021464@C as variable and fixed modifications, respectively.
Supplementary Fig. 1. Theoretical fragmentation patterns of the linearized branched form and the
cross-linked form of a SUMO1-modified peptide. (a) The linearized branched form of the
SUMO1-modified peptide. Theoretical fragmentation of the linearized branched peptide would
produce fragment ions in the same way as a linear peptide. The sequence between the N-terminus and
the modified lysine residue of the target peptide would produce incorrect fragment ions. (b)
The cross-linked form of the SUMO1-modified peptide. Both the target peptide and the remnant
would produce fragment ions during CID.
Supplementary Fig. 2. Workflow of UblSearch. 1) Identifying all the candidate peptides for a given
MS/MS spectrum. For each MS/MS spectrum, UblSearch finds all the candidate peptides within a
given mass tolerance, including linear and cross-linked peptides, in the created combinatorial
database. 2) Generating a theoretical fragment pattern for each candidate peptide. For the
linear peptides, the theoretical fragment patterns are the same as in normal database searches. For the
cross-linked peptides, UblSearch initially treats the remnant as a variable modification on the
miscleaved lysine residue within the target peptide (e.g., K2 or K7). A new fragmentation model
for the Ub/Ubl-conjugated peptides is then used to generate the corresponding theoretical fragment
ions from the target peptides and the remnants (Supplementary Fig. 1b). 3) Scoring the candidate
peptides and calculating the expectation value for the peptide with the highest score. UblSearch
uses the X!Tandem scoring scheme to find the peptide that best matches the given MS/MS spectrum
and calculates the expectation value of the peptide identification.
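Step 2 for a linear peptide can be illustrated with the standard calculation of singly charged b and y ions from monoisotopic residue masses. The sketch below also accepts a mass shift at a given residue position, loosely mirroring the idea of treating the remnant as a variable modification on a lysine. It does not implement the new fragmentation model for cross-linked peptides, whose details are not given here, and the function name and the `mods` argument are hypothetical.

```python
# Standard monoisotopic residue masses (Da).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276


def by_ions(peptide, mods=None):
    """Return (b, y) lists of singly charged m/z values for a linear peptide.

    mods maps a 0-based residue index to a mass shift in Da (illustrative only).
    """
    mods = mods or {}
    masses = [MONO[aa] + mods.get(i, 0.0) for i, aa in enumerate(peptide)]
    b_ions, y_ions = [], []
    running = 0.0
    for mass in masses[:-1]:            # b1 .. b(n-1), from the N-terminus
        running += mass
        b_ions.append(running + PROTON)
    running = 0.0
    for mass in reversed(masses[1:]):   # y1 .. y(n-1), from the C-terminus
        running += mass
        y_ions.append(running + WATER + PROTON)
    return b_ions, y_ions


b, y = by_ions("TLTGK")
print([round(mz, 4) for mz in b])
print([round(mz, 4) for mz in y])
```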
Supplementary Fig. 3. Mass spectrum ID 583 was successfully matched with
MQIFVK[Ub_LysC]TLTGK by the improved workflow. Four of the top 5 peaks and 33% of the
total intensity of all peaks in the spectrum were successfully matched with the fragment ions
generated from the cross-linked form.
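The two figures quoted in this caption (matched peaks among the top 5 and the matched fraction of total intensity) can be computed as in the following sketch. The peak list and fragment values are toy data, and the 0.4 Da matching window is borrowed from the fragment-mass tolerance used in the searches; the exact matching rule used to produce the figure is not stated, so this is an assumption.

```python
def match_summary(peaks, fragment_mzs, tol_da=0.4, top_n=5):
    """Summarize how well theoretical fragments explain a spectrum.

    peaks is a list of (m/z, intensity) pairs. Returns a tuple of
    (number of matched peaks among the top_n most intense,
     matched fraction of the total intensity).
    """
    def is_matched(mz):
        return any(abs(mz - frag) <= tol_da for frag in fragment_mzs)

    total = sum(intensity for _, intensity in peaks)
    matched = sum(intensity for mz, intensity in peaks if is_matched(mz))
    top = sorted(peaks, key=lambda peak: peak[1], reverse=True)[:top_n]
    top_hits = sum(1 for mz, _ in top if is_matched(mz))
    fraction = matched / total if total else 0.0
    return top_hits, fraction


# Toy spectrum and toy fragment list, for illustration only.
toy_peaks = [(100.0, 50.0), (200.0, 30.0), (300.0, 20.0)]
toy_fragments = [100.1, 300.3]
print(match_summary(toy_peaks, toy_fragments, top_n=2))
```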