Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
BIOINFORMATICS ORIGINAL PAPER Vol. 26 no. 1 2010, pages 98–103 doi:10.1093/bioinformatics/btp610 Systems biology Methods for combining peptide intensities to estimate relative protein abundance Brian Carrillo1 , Corey Yanofsky1 , Sylvie Laboissiere2 , Robert Nadon2,3 and Robert E. Kearney1,∗ of Biomedical Engineering, McGill University, 2 McGill University and Genome Quebec Innovation Centre Proteomics Platform and 3 Department of Human Genetics, McGill University, Montreal, Canada 1 Department Received on September 17, 2009; revised on October 15, 2009; accepted on October 19, 2009 Advance Access publication November 5, 2009 Associate Editor: Burkhard Rost ABSTRACT Motivation: Labeling techniques are being used increasingly to estimate relative protein abundances in quantitative proteomic studies. These techniques require the accurate measurement of correspondingly labeled peptide peak intensities to produce high-quality estimates of differential expression ratios. In mass spectrometers with counting detectors, the measurement noise varies with intensity and consequently accuracy increases with the number of ions detected. Consequently, the relative variability of peptide intensity measurements varies with intensity. This effect must be accounted for when combining information from multiple peptides to estimate relative protein abundance. Results: We examined a variety of algorithms that estimate protein differential expression ratios from multiple peptide intensity measurements. Algorithms that account for the variation of measurement error with intensity were found to provide the most accurate estimates of differential abundance. A simple Sum-ofIntensities algorithm provided the best estimates of true protein ratios of all algorithms tested. Contact: [email protected] Supplementary information: Supplementary data are available at Bioinformatics online. 1 INTRODUCTION Labeling techniques are an increasingly popular means to measure relative protein abundances for studies that need to capture the dynamics of cellular proteomes and/or to define differences between the proteomes of biological samples. Current technologies generally use a workflow in which: (1) Particular amino acids (or classes of amino acids) of proteins (ICAT/SILAC) or peptides (iTRAQ) are labeled to distinguish the different samples in the resulting spectra. (2) The labeled samples are mixed and the resulting peptides identified in a bottom-up approach. (3) Peak pairs corresponding to the different labeled versions of the same peptides are located and their intensities measured. ∗ To 98 whom correspondence should be addressed. (4) The relative intensities of the peak pairs are used to estimate the relative abundances of the peptides (Gygi, et al., 1999; Han et al., 2001; Ong et al., 2002; Ross et al., 2004; Schulze and Mann, 2004; Stewart and Figeys, 2001). Recent developments in labeling technology have improved the ionization efficiencies of labeled peptides resulting in labels that are purer and more similar chemically (Gygi et al., 1999; Hansen et al., 2003; Kirkpatrick et al., 2005; Li et al., 2003). At the same time, improvements in mass spectrometer technology have resulted in: higher resolving powers, leading to better separation of labeled peaks; faster collection cycles, leading to the fragmentation of more peptides per unit time; and enhanced fragmentation, leading to better mass spectra and/or more peptide identifications (Alan, 1998; Loboda, 2000; Syka et al., 2004). Algorithmic advances have also improved the quality of protein quantification; however, as yet there has been little consideration as to how to best combine the information from different peptides to predict the protein abundance (Anderle et al., 2004; Ishihama et al., 2005; Kyriacos, 2006). Many current schemes, both open and closed source, estimate relative protein abundance by averaging the label ratios of all peptides belonging to a protein (Lin et al., 2006; Pedrioli, 2006; Shadforth, 2005; Schulze and Mann, 2004). This approach has one evident limitation; if the peptide intensities are noisy, the estimated relative protein abundance will depend upon which labels are selected as the numerator and denominator (See Supplementary Material A). Clearly, analysis results should not depend on such arbitrary decisions. There have been attempts to use more sophisticated algorithms, Andreev et al. presented an algorithm that incorporates peptide intensity and quality of identification (Andreev et al., 2006) and showed that removing peptides with low identification scores or outlying ratios improved quantification but did not demonstrate its effectiveness. Griffin et al. (2007) mention the use of signal intensities to ‘weight’ iTRAQ reporter ratios, but did not explain how it is used or document its effectiveness. MacCoss et al. (2003) used linear-least squares to combine information from multiple peptides but again did not demonstrate its effectiveness. One must remember that intensity measurements from mass spectrometers with counting detectors will display random errors due to counting statistics. Thus, their intensity values can be © The Author 2009. Published by Oxford University Press. All rights reserved. For Permissions, please email: [email protected] [12:28 2/12/2009 Bioinformatics-btp610.tex] Page: 98 98–103 Methods for combining peptide intensities to estimate relative protein abundance expected to follow a Poisson distribution f (n ) = λn e−λ n! to indicate the intensity of the j-th peptide of the i-th protein with label k. We generated a data set simulating a labeling experiment as follows: (1) where f (n) is the probability of detecting n ions in an interval and λ is the average number of ions (i.e. the noise-free intensity value) for the interval. (See Supplementary Material B for full derivation). The variance of the Poisson distribution is equal to its mean λ, consequently data with Poisson statistics violate the constant variance assumption of least-squares regression. Moreover, most peptide to protein algorithms use only relative intensity information and do not make explicit use of the absolute peak intensity, the key indicator of the reliability of the ratio estimate. We believe that these intensity dependent effects must be accounted for explicitly when combining multiple measurements. This article addresses the question of how to best account for Poisson statistics when using data from multiple peptides to estimate relative abundance of proteins. The article is developed in three parts. Part I demonstrates that peptide intensities observed in a Q-TOF instrument do indeed follow a Poisson distribution. Part II presents a simulation study that examines different ways of combining peptide information and evaluates the accuracy of the resulting protein abundance estimates. Our results show that methods that combine absolute peptide intensities prior to estimating relative abundances are most accurate. Part III concludes the article by presenting experimental results that confirm the findings of the simulation study. (1) The peptide intensities measured from a protein with label 1 were simulated by sampling seven values from Poisson distributions with a randomly selected mean intensities (λi ), less than some predetermined maximum (λmax ) p1i,1..7 = λi,1..7 (3) (2) The peptide intensities measured from the protein with label 2 were simulated by sampling a corresponding set of values from Poisson distributions having mean values a multiple (α = fold-change) larger than the first set. To give: p2i,1..7 = α×λi,1..7 (3) In some cases, where noted, an additional outlier peptide was generated having a random intensity and fold change (β), to simulate a misidentified peptide. p1i,8 = λi,8 ; 2.1 (5) (5) Steps 1 through 5 were repeated for different predetermined maximum intensities (step 1). The fold change for each simulated protein pair was estimated using six peptide-to-protein algorithms: (1) Average of the ratios: The intensity ratios for all peptides belonging to the same protein, including possible outliers, were computed and their arithmetic mean used to estimate protein abundance. 1 pi,j n p2i,j n 1 (6) j=1 METHODS Part I It is difficult to estimate the statistical distributions of peptide intensity in most LC/MS experiments because the peptide intensity changes continuously with the gradient. Consequently, we ran a separate experiment in which a solution containing the peptide fibrinogen was injected directly into a Q-TOF mass spectrometer. The direct injection bypassed the usual rise and fall of peptide concentration in a typical LC/MS experiment. This allowed us to acquire 1000 fragmentation spectra sequentially under stationary conditions. We analyzed these using in-house algorithms to extract the intensity and area of 12 prominent b and y ions in each of the 1000 runs. Peak intensities of a single scan on our instruments ranged from 0 to 1000 counts while peak areas were ∼10 times the peak height; the sum of several Poisson distributions is also a Poisson distribution so the area of a peak should exhibit the same behaviour as the intensity. Consequently, we used area as well as peak intensity measures to increase the dynamic range for this part of the analysis. We also conducted a statistical simulation of the results expected from an experiment in which the same peptide is present in two samples with an abundance ratio of three. A first set of intensities were generated from a Poisson random number generator with a range of mean intensities covering the operating range of the mass spectrometer. A second set of intensities was then generated from a Poisson distribution with mean values three times those of the first set. The simulation was repeated 500 times to estimate the statistical behavior. (2) Libra ratio: An outlier rejection scheme was applied to the average of the ratios algorithm. Ratios that were greater than two standard deviations from the mean were discarded and a new average of the ratios computed (Pedrioli, 2006). (3) Linear regression: Standard linear regression was used to fit a line through the peak intensities, and the slope of the line (β) was used as the estimate of relative abundance. ⎡ p1i,1 ⎤ ⎡ Part II We will use the notation: pki,j (2) p2i,1 ⎤ ⎡ ε1 ⎤ ⎢ ⎥ ⎢ 1 ⎥ ⎢ 2 ⎥ ⎢ ε2 ⎥ ⎢ pi,2 ⎥ ⎢ pi,2 ⎥ ⎥ ⎢ ⎥ ⎢ ⎥ ⎢ ⎢ . ⎥ = ⎢ . ⎥ βi + ⎢ . ⎥ ⎢ . ⎥ ⎢ . ⎥ ⎢ . ⎥ ⎢ . ⎥ ⎢ . ⎥ ⎢ . ⎥ ⎦ ⎣ ⎦ ⎣ ⎦ ⎣ εn p1i,n p2i,n (7) (4) Principle component analysis: The set of direction axes for which the greatest variance of the data was projected onto the principal axes was determined. This may be thought of as treating the data as a multivariate normal data set and estimating the axes of the underlying ellipsoid (Duda et al., 2001). The slope of the principal axis was used as the estimate of relative abundance. (5) Sum of intensities: The intensities of all peaks for the same label and protein were summed. The ratio of the sums was the estimate of the relative abundance. n n Si = p1i,j p2i,j (8) j=1 2.2 p2i,8 = β ×λi,8 (4) Steps 1 through 4 were repeated 1000 times to gather statistical information. Ai = 2 (4) j=1 (6) Total least squares: A straight line was fit between the peak values for the two labels. However, unlike linear regression, the method recognizes that there are errors in both variables and so it minimizes 99 [12:28 2/12/2009 Bioinformatics-btp610.tex] Page: 99 98–103 B.Carrillo et al. the orthogonal distance between the points and the best fit line (Huffel and Vandewalle, 1991). In addition, the last four algorithms were applied in conjunction with an outlier rejection scheme. This scheme weighted the data using a ‘Fair’ weighting scheme where intensity values were divided by their distance to the current ‘best’ fit line (Coleman et al., 1980). The process was repeated until the change in the slope of the “best fit” line was less than a set threshold (10−6 ). The estimates from the six algorithms were compared with the known fold change to determine estimation errors. 2.3 Part III To validate the results of the simulations, the six algorithms were tested using data from a standard set of eight proteins, prepared as follows. Eight proteins were mixed together using half of a double Latin square design (as depicted in Table 1), creating 16 different mixes. Each mix was recreated four more times to give five identical sets of 16 mixes. Each set was analyzed by Mass Spectrometry sequentially with the order of analysis of the mixes randomized within each set. Each sample was analyzed twice, once by MS/MS and once by MS. Sample preparation (other than noted above), liquid chromatography, mass spectrometry and preliminary data processing (Mascot and in house algorithms) are described in detail by Bell et al. (2009). Search results from all MS/MS runs were pooled and used to compile a list of identified peptides that originated from the eight proteins of interest. The retention time and m/z ratios of these peptides were then used by in-house algorithms to extract peak intensity information from the MS runs. The data extracted included the intensity of the first three isotopes of each peptide peak as well as the intensity of the signal in a neighbourhood (±0.07Th and ±15s) around each peak. This yielded three matrices (15×31) of intensity data that were matched in acquisition time ensuring that major sources of variability (errors caused by retention time differences, sample handling errors, etc.) were equal. The second and third matrices were shifted in the m/z axis by an amount determined by the peptide charge (1/2Th for a doubly charged peptide, 1/3Th for a triply charged peptide, etc.). The intensity of the three matrices should therefore differ only by their isotopic ratios and measurement errors. Note that because of the different mixes and repeated runs, any one peptide was present in the data set as many as 80 times with a wide range of intensities. The data were analyzed as follows: (1) A data point (p11 ) was selected randomly from the matrix of the base isotope [M], of a peptide from a MS run (solid yellow arrow on the left in Fig. 1). Using the position (row, column) in the matrix of p1 , a second data point (p21 ) was selected from the matrix of the second isotope [M+1] (dashed yellow arrow); this matching of data points ensures that the retention times are identical and that both points arise from the same relative position in the peak shape. A second set of two points (p12 and p22 ) were selected from the same peptide matrices (purple arrows in Fig. 1) in a similar manner. The second set shares the same underlying ratio, but at a different intensity. The two sets are thus considered p21..2 = α×p11..2 (9) where the fold-change, α, was determined by the isotopic ratio of the particular peptide. The ‘true’ fold-change (α) was computed by isotopic elemental composition (Palmblad, 1999) using the peptide sequences identified by Mascot with <5% chance of being a random match (Mascot Scores > ID). (2) Step 1 was repeated using different combinations of isotope sets in step 1 ([M] versus [M+2], and [M+1] versus [M+2]) to vary the range of ratios and intensities (ratios range from 1 : 1 to 8 : 1 with ∼90% between 1 : 1 and 3 : 1). (3) Steps 1 and 2 were repeated for the 228 other peptides identified and across all 80 experiments to provide a range of ratios and intensities. (4) Each analysis was repeated 500 times to gather data for statistical analysis. The analysis was repeated with the number of matched pairs varied from 2 through 12, to evaluate the effects of using different numbers of peptides per protein. The fold-change (α) for each set was estimated using the algorithms from Part 2. All custom algorithms were written in the MATLAB programming environment and will be made available upon request. Table 1. Half a ‘double Latin square’ showing the amount (in nanograms) of each protein in each of the 16 mixes MIX 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 Protein P1 P2 P3 P4 P5 P6 P7 P8 10 10 640 640 160 160 40 40 960 960 320 320 80 80 20 20 20 960 960 320 320 80 80 20 10 640 640 160 160 40 40 10 40 40 10 10 640 640 160 160 320 320 80 80 20 20 960 960 80 20 20 960 960 320 320 80 40 10 10 640 640 160 160 40 160 160 40 40 10 10 640 640 80 80 20 20 960 960 320 320 320 80 80 20 20 960 960 320 160 40 40 10 10 640 640 160 640 640 160 160 40 40 10 10 20 20 960 960 320 320 80 80 960 320 320 80 80 20 20 960 640 160 160 40 40 10 10 640 Fig. 1. The three matrices of the first three isotopes of a peptide. The solid arrows show two data points selected randomly from the monoisotopic matrix; the dashed arrows indicate the ‘matched’ data points selection selected from the [M+1] matrix. 100 [12:28 2/12/2009 Bioinformatics-btp610.tex] Page: 100 98–103 Methods for combining peptide intensities to estimate relative protein abundance Fig. 3. Mean value bracketed by ±1 SD of the ratio of two simulated Poisson variables whose mean values differ by a factor of three as a function of intensity. Fig. 2. Ion intensity data from the fragmentation of the Fibrinogen peptide (EGVNDNEEGFFSAR) injected directly into a q-TOF mass spectrometer. (A) Mean and variance of peaks co-vary linearly as expected for a Poisson distribution. (B) Coefficient of variation varies inversely as the square-root of the mean intensity. 3 3.1 RESULTS Part I Figure 2 shows the results derived from the analysis of the first experiment. Figure 2A shows that the mean and variance of intensity and area co-vary; the superimposed linear fit demonstrates that the relation is close to linear, as expected for a Poisson distribution. Because of this linear increase in variance, the standard deviation will increase more slowly than the mean and consequently, the relative noise will decrease as the mean increases. This is demonstrated in Figure 2B that shows the coefficient of variation (σ/µ) of the peak intensity as a function of the mean. As expected, it follows an inverse square-root relation. Thus, low intensity (low count) peaks will be relatively more noisy than high intensity peaks. Consequently, the accuracy of any peptide ratio will depend upon the absolute intensity of the two peaks under consideration. Figure 3 shows the results of the simulation study that examined how the mean and standard deviation of the abundance ratio estimates vary as a function of intensity. At low intensities, the mean value differs from the true value and the standard deviation is large. The mean of the estimate converges to the true value as the intensity increases while the standard deviation decreases. This demonstrates, clearly, that the accuracy with which a peptide abundance ratio can be estimated depends strongly on the absolute intensities. Fig. 4. Error versus intensity for six peptide-to-protein algorithms as a function of intensity. The sum of intensities and total least squares algorithms provide the most accurate estimates. Data were generated from 1000 simulated experiments in which seven peptides were observed from proteins with an true fold change of 2. 3.2 Part II Figure 4 shows the mean fold-change errors as a function of maximum intensity for simulations with no outliers and a true foldchange of two. All algorithms behaved similarly; the error decreased as the intensity increased. However, the sum of intensities and the total least squares methods consistently performed best, giving a fold change error that was ∼10% lower than that of the other methods. In practice, it is quite likely that quantitative data will have outliers. For example, if a protein is identified using nine peptides, each with 95% confidence of identification, there will be a 37% chance that there will be at least one outlier. Figure 5 summarizes 101 [12:28 2/12/2009 Bioinformatics-btp610.tex] Page: 101 98–103 B.Carrillo et al. Fig. 5. Error versus intensity for six algorithms with one outlier. Data were generated from 1000 simulated experiments in which seven peptides were observed from proteins with an true fold change of 2 with one outlier peptide with random intensity and fold change. Fig. 6. Effects of an outlier rejection on the sum of intensities and total least squares algorithms. Outlier rejection decreases the errors particularly at high mean intensities. the results of simulations with one outlier. As expected, the error in all algorithms increased; nevertheless, the sum of intensities and the total least squares methods still gave the best results with errors ∼10% lower than the other methods. Adding the outlier rejection scheme significantly improved both the sum of intensities and the total least squares algorithms. Figure 6 shows that the error for both algorithms was reduced by ∼20% by the addition of outlier rejection when one outlier was present. 3.3 Part III Figure 7 shows the average fold-change error as a function of the number of peptides used to estimate the fold-change. The trend Fig. 7. Validation experiment results showing average error versus number of peptides for five peptide-to-protein algorithms. Total least squares and sum-of-intensities perform best irrespective of the number of peptides used. Fig. 8. Experimental cumulative distributions of ratio errors showing larger populations with less error for the sum of intensities and total least squares algorithms. The dashed lines highlight the 10% error mark; 50% of the Linear Regression estimates and 63% of the sum-of-intensities estimates show errors <10%. was the same for all methods; the error decreased as the number of peptides increased. However, as with simulations, the sum of intensities and total least squares algorithms performed best. Note that these averages incorporate results from a range of intensities and fold changes and so provide only a rough metric of performance. To provide a better understanding of error behavior, Figure 8 shows the cumulative probability of the number of data sets as a function of error for the different methods. This figure shows that linear regression estimated the abundances with an error of <10% for 50% of the data sets. In contrast, the sum of intensities and total least squares performed substantially better giving estimates with <10% 102 [12:28 2/12/2009 Bioinformatics-btp610.tex] Page: 102 98–103 Methods for combining peptide intensities to estimate relative protein abundance error for 63% of the data sets. Moreover, for 90% of the peptides examined, the sum of intensities and total least squares algorithms estimated the abundance ratios most accurately (their curves are the farthest left in Fig. 8). 4 DISCUSSION This article examines how to combine abundance information from multiple peptides to predict protein abundance most accurately. Our results demonstrate that the sum of intensities and total least squares algorithms provided the most accurate estimates of protein abundance for a wide range of simulated and experimental conditions. The commonly used average of ratios algorithm consistently provided estimates with the highest errors. The linear regression algorithm performed poorly consistently; this is likely because the algorithm did not account for errors on both axes. Principal component analysis performed midrange to all the algorithms, with or without the presence of outliers. A novel feature of the analysis presented here is that it does not rely on an exact knowledge of the relative abundances of the proteins in the different mixtures. These were varied experimentally by carefully combining different amounts of pure proteins manually and will inevitably lead to some (unknown) error in the relative abundance. Rather, we opted to use the relative intensity of the different isotopic peaks for the same peptide as a means of evaluating accuracy. By selecting peptides of different mass we were able to investigate a range of abundance ratios while data from the different mixtures resulted in a wide range of intensities. The sum of intensities and total least squares both performed well with the performance of the sum of intensities being marginally better. The sum of intensities has the added advantage of being computationally simple; it required 1/16 of the time needed by the total least squares method. The sum of intensities algorithm is a simple, computationally efficient algorithm that (i) incorporates intensity, or data quality into the estimates of fold change; (ii) does not fail when only few peptides are identified (no singularities); and (iii) consistently provides the best estimates of protein quantification. We therefore believe that the sum of intensities algorithm is optimal amongst the algorithms tested. The results presented should be generally applicable to all mass spectrometers using counting detectors including all time-of-flight and most ion trap mass spectrometers. ACKNOWLEDGMENTS The authors are indebted to John Bergeron for securing funding (GQ/GC, CFI, VRQ). We would like to acknowledge the McGill University and Genome Quebec Innovation Centre Proteomics Platform, and in particular Pascal Pleynet, Daniel Boismenu, Line Roy and Nathalie Hamel for technical support and Benoit Houle for securing GC/GQ-funding. Funding: Genome Quebec; Genome Canada; Natural Sciences and Engineering Research Council of Canada; Canadian Institutes of Health Research (to R.K.); McGill Majors fellowship (to B.C.). Conflict of Interest: none declared. REFERENCES Anderle,M. et al. (2004) Quantifying reproducibility for differential proteomics: noise analysis for protein liquid chromatography-mass spectrometry of human serum. Bioinformatics, 20, 3575–3582. Andreev,V.P. et al. (2006) New algorithm for 15 N/14 N Quantitation with LC-ESI-MS Using an LTQ-FT Mass Spectrometer. J. Proteome Res., 5, 2039–2045. Bell,A.W. et al. (2009) A HUPO test sample study reveals common problems in mass spectrometry-based proteomics. Nat. Methods, 6, 423–430. Coleman,D. et al. (1980) A system of subroutines for iteratively reweighted least squares computations. ACM Trans. Math. Softw., 6, 327–336. Duda,R.O. et al. (2001) Pattern Classification. 2nd ed. New York; Chichester, England: Wiley. Griffin,T.J. et al. (2007) iTRAQ reagent-based quantitative proteomic analysis on a linear ion trap mass spectrometer. J. Proteome Res., 6, 4200–4209. Gygi,S.P. et al. (1999) Quantitative analysis of complex protein mixtures using isotopecoded affinity tags. Nat. Biotechnol., 17, 994–999. Han,D.K., et al. (2001) Quantitative profiling of differentiation-induced microsomal proteins using isotope-coded affinity tags and mass spectrometry. Nat. Biotechnol., 19, 946–951. Hansen,K.C. et al. (2003) Mass spectrometric analysis of protein mixtures at low levels using cleavable 13C-isotope-coded affinity tag and multidimensional chromatography. Mol. Cell. Proteomics, 2, 299–314. Ishihama,Y. et al. (2005) Exponentially Modified Protein Abundance Index (emPAI) for estimation of absolute protein amount in proteomics by the number of sequenced peptides per protein. Mol. Cell. Proteomics, 4, 1265–1272. Kirkpatrick,D.S. et al. (2005) The absolute quantification strategy: a general procedure for the quantification of proteins and post-translational modifications. Methods, 35, 265–273. Leptos,K.C. et al. (2006) MapQuant: Open-source software for large-scale protein quantification. Proteomics, 6, 1770–1782. Li,J. et al. (2003) Protein profiling with cleavable isotope-coded affinity tag (cICAT) reagents: the yeast salinity stress response. Mol. Cell. Proteomics, 2, 1198–1204. Lin,W.T. et al. (2006) Multi-Q: a fully automated tool for multiplexed protein quantitation. J. Proteome Res., 5, 2328–2338. Loboda,A.V. et al. (2000) A tandem quadrupole/time-of-flight mass spectrometer with a matrix-assisted laser desorption/ionization source: design and performance. Rapid Commun. Mass Spectrometry, 14, 1047–1057. MacCoss,M.J. et al. (2003) A correlation algorithm for the automated quantitative analysis of shotgun proteomics data. Anal. Chem., 75, 6912–6921. Marshall,A.G. et al. (1998) Fourier transform ion cyclotron resonance mass spectrometry: a primer. Mass Spectrom. Rev., 17, 1–35. Ong,S.-E. et al. (2002) Stable isotope labeling by amino acids in cell culture, SILAC, as a simple and accurate approach to expression proteomics. Mol. Cell. Proteomics, 1, 376–386. Palmblad,M. (1999) isotop_fs. Available from: http://www.ms-utils.org/isotop_fs.html. Pedrioli,P. (2006) Available from: http://tools.proteomecenter.org/Libra.php (last accessed date August 22, 2006). Ross,P.L. et al. (2004) Multiplexed protein quantitation in Saccharomyces cerevisiae using amine-reactive isobaric tagging reagents. Mol. Cell Proteomics, 3, 1154–1169. Schulze,W.X. and Mann,M. (2004) A novel proteomic screen for peptide-protein interactions. J. Biol. Chem., 279, 10756–10764. Shadforth,I. et al. (2005) i-Tracker: for quantitative proteomics using iTRAQTM. BMC Genomics, 6, 145. Stewart,I.I. and Figeys,T.T.D. (2001) O18 labeling: a tool for proteomics. Rapid Commun. Mass Spectrom., 15, 2456–2465. Syka,J.E.P. et al. (2004) Peptide and protein sequence analysis by electron transfer dissociation mass spectrometry. Proc. Natl Acad. Sci. USA, 101, 9528–9533. Van Huffel,S. and Vandewalle,J. (1991) The Total Least Squares Problem: Computational Aspects and Analysis. Philadelphia: Society for Industrial and Applied Mathematics. 103 [12:28 2/12/2009 Bioinformatics-btp610.tex] Page: 103 98–103