Capture and Analysis of Quantitative Proteomic Data

1 Faculty of Life Sciences, University of Manchester, M13 9PT, UK.
2 MBCMS, School of Chemistry, Manchester Interdisciplinary Biocentre, University of Manchester, UK.
3 School of Computer Science, Faculty of Engineering and Physical Sciences, University of Manchester, UK.
4 Manchester Centre for Integrative Systems Biology, Manchester Interdisciplinary Biocentre, University of Manchester, UK.

King Wai Lau: K.Lau/at/manchester.ac.uk; Andrew R Jones: Ajones/at/cs.man.ac.uk; Neil Swainston: Neil.swainston/at/manchester.ac.uk; Jennifer A Siepen: Jennifer.siepen/at/manchester.ac.uk; Simon J Hubbard: Simon.hubbard/at/manchester.ac.uk
† communicating author, Tel: (+44) (0)161 3068930, Fax: (+44) (0)161 2755082, Email: Simon.hubbard/at/manchester.ac.uk

The publisher's final edited version of this article is available at Proteomics.

Abstract

Whilst the array of techniques available for quantitative proteomics continues to grow, the attendant bioinformatic software tools are similarly expanding in number. The data capture and analysis of such quantitative data is obviously crucial to the experiment and the methods used to process it will critically affect the quality of the data obtained. These tools must deal with a variety of issues, including identification of labelled and unlabelled peptide species, location of the corresponding mass spectrometry scans in the experiment, construction of representative ion chromatograms, location of the true peptide ion chromatogram start and end, elimination of background signal in the mass spectrum and chromatogram, and calculation of both peptide and protein ratios/abundances. A variety of tools and approaches are available, in part restricted by the nature of the experiment to be performed and available instrumentation.
Currently, although there is no single consensus on precisely how to calculate protein and peptide abundances, many common themes have emerged which identify and reduce many of the key sources of error. These issues will be discussed, along with those relating to deposition of quantitative data. At present, mature data standards for quantitative proteomics are not yet available, although formats are beginning to emerge.

Keywords: proteomics, relative quantitation, absolute quantitation, software, bioinformatics, data standards

Introduction

Proteomic approaches have continued to develop apace over recent years, with emphasis now firmly focused on developing the technique into more genome-wide and quantitative areas. The ability to identify and quantify the principal functional entities in the cell remains a key advantage for proteomics in post-genomic biology. As the different control points in the regulation of gene expression are uncovered, from global mechanisms at the level of chromatin structure and transcriptional control, through to translational mechanisms including protein modification and turnover, the ability to quantify protein levels accurately will assume increasing importance. It is widely acknowledged that modern mass spectrometry, coupled with online peptide separation methods and bioinformatics, can enable hundreds or thousands of peptides to be identified in a standard proteomics experiment [1]-[6] – the challenge is now to do this in a quantitative fashion, and translate these into protein levels. This will support the growing field of systems biology, where different concentrations of proteins determine various states of the sample being analysed and networks of interacting molecules are developed, ultimately to predict systems level phenomena.
Fortunately, there have been several recent advances in the field of quantitative proteomics, so that the experimenter is now faced with a battery of techniques to choose from, all with different advantages and disadvantages. This has led to an increase in the number of laboratories able to carry out quantitative proteomics studies, and the generation of even more data, data types, and challenging data handling problems. There is therefore a strong demand for bioinformatics approaches to capture and analyse both simple and high-throughput quantitative proteomic data sets. In this review, we briefly survey the recent advances in quantitative proteomic techniques, prior to a more detailed examination of the quantitative proteomic data capture tools and software available, and the challenges facing software developers in this area. This will cover the means for acquiring the data itself, calculating ratios and absolute values to quantify both peptides and proteins, and the general issues the field needs to address when dealing with this type of data. Finally, we briefly address the data capture issues presented by quantitative proteomic data and the implications for proteomics data standards.

Quantitative proteomic methods

As mentioned above, many different mass spectrometry-based quantitative proteomic methods have been in development over the last decade, which rely on a small number of fundamental techniques, usually involving some kind of standard against which to compare (Figure 1).
Relative quantitative proteomic methods measure the relative abundance ratio between two or more samples and can be divided into label-based and label-free methods. In the former, isotopic labelling methods are applied to distinguish one population of proteins from another, and then the samples are mixed before chromatographic and mass spectrometric analysis. Samples are distinguished by incorporating stable isotopes of amino acids (e.g. heavy Leu, Ile, or Arg) or elements (e.g. 15N or 18O), thereby altering the mass of most or all proteins in one of the samples in a predictable fashion. These stable isotopes may be incorporated metabolically into (ideally) all proteins [7]-[10], in vivo, or a chemical reagent can be used to label the peptides/proteins [11]-[13], in vitro. Examples of the former include the popular SILAC (Stable Isotope Labelling with Amino acids in Cell culture) approach, where cells are cultured in a medium containing a heavy amino acid, and compared to cells cultured in a medium containing the standard ‘light’ variant. Quantitation is usually achieved by comparing the MS peak intensities of a partner pair of heavy/light peptide peaks. Examples of the latter, in vitro, approaches include ICAT and iTRAQ™, whereby a modifying group is added to a certain amino acid side group (ICAT) or free amines (iTRAQ™), allowing subsets (e.g. cysteine-containing for ICAT) or all peptides to be distinguished by mass. The iTRAQ™ technique has gained particular popularity, since it offers several advantages, including the ability to multiplex several samples in one single experiment. Thus, parallel samples may be quantified in one LC-MS/MS run via a series of different specific reporter ions which are fragmented from four different isobaric tags, attached to free peptide amines from up to four samples.
Researchers can therefore quantify the relative levels of a number of samples, potentially averaging data over several peptides for each protein, and this technique will soon be widely available for 8-plex samples. Different techniques offer different advantages; for example, in vivo labelling is more efficient, labelling potentially every protein in a cell, although it is often impractical for some samples (e.g. human tissue) whereas in vitro methods can be applied for any type of sample in principle. In all cases, there are common challenges for calculating quantitative values at the protein and peptide level. Should peak heights, intensities, areas or similar be used? Should ratios be calculated from the ion chromatogram, MS, or MS/MS levels? How are peptide ratios used to calculate protein levels, and should they all be used? How can the user assign quality control measures to their results? Many of these questions are not fully resolved, and we will return to them later in the review. Label free methods have continued to grow in popularity despite the non-linear response between MS signal ion intensity and peptide level for different peptide sequences. In label free methods, high-throughput ‘shotgun’ style proteomics is employed to gain large numbers of peptide/protein identifications, and the samples under study are analyzed separately. Two approaches are used in performing quantifications. The first involves ion intensity measurement and the second concentrates on spectral counting. In intensity measurement, extracted ion chromatograms are generated for each peptide ion identified in a given LC-MS run. Peak intensities are determined and ratios calculated for peptides matched in different experiments. Where a number of peptides are matched to a given protein, each peptide ratio is used to determine a measure of protein fold change across experiments [14]. 
This technique has a number of disadvantages, including different peptides coeluting with similar m/z values, in which case peak intensity cannot be uniquely attributed to a single peptide. Also, where complex samples are separated through multiple-dimension LC, peptides may not always elute in the same SCX fraction. In spectral counting, the quantity of protein in the different samples is estimated from the total number of MS/MS spectra matching peptides from each protein [14]-[17]. This relies on the correlation of protein abundance with the coverage of the protein (the fraction of unique possible peptides observed for each protein) and with the number of times peptides are observed in repeat experiments. The drawback here is that both these correlations are rather approximate, and peptides from different proteins and protein isoforms can recur and complicate the analysis, in addition to several other issues concerning protein size and different likelihoods of observing given peptides. Although this review will concentrate on labelling-based approaches, recent work by Marcotte and colleagues [18] suggests a way forward in solving some of these problems, which can even yield absolute values. Nevertheless, rarer or low abundance proteins may remain harder to quantify with these kinds of approaches. Finally, absolute quantitative methods attempt to measure the absolute protein level using a characteristic peptide unique to the corresponding protein [19]-[23], thereby acting as an explicit external standard of known concentration against which to quantify. For example, in the AQUA technique, peptides are synthesized incorporating stable isotopes to provide a known mass offset, and these peptides are used as an internal standard. Similarly, the QconCAT method developed at the Universities of Manchester and Liverpool concatenates stable isotope labelled peptides into a recombinant protein, which is readily synthesized in bacterial cell cultures.
Tryptic digests of the QconCAT proteins provide a theoretically inexhaustible, highly multiplexed set of labelled, known peptides at controllable concentration. Finally, a recent novel approach, SISCAPA [23], exploits the use of immobilised anti-peptide antibodies, used to isolate specific peptides together with stable isotopically labelled versions of the same peptides. The latter are used as a spiked internal standard in order to quantify the peptides via ESI-MS in the standard way. Again, ultimately, the quantitation is achieved at the peptide level by inferring a ratio between signals in the mass spectra, and the same intrinsic problems presented by relative quantitation techniques must be solved. A variety of different software tools are available for calculating these quantitative values for the different techniques and these will be summarised in the next section. It is our understanding that as yet, there is no single “one-stop-shop” for scientists working in this area, and that there is still a need for improved, easily accessible, simple to install tools, which cope with all the different quantitative methods as well as standard formats in which to report the results.

Existing software

In this section we review the open source and/or freely available tools for obtaining and calculating quantitative data for peptides and proteins. There is a large variety of excellent tools available, which are targeted in the main towards particular approaches, instruments, vendors, third-party software and/or formats [24]-[43]. Table 1 summarises existing freely available quantitative proteomic software. This covers the supported instruments (mass spectrometers), quantitative proteomic methods, database search engines, requisite input files, and software dependencies, as well as programming languages, supported operating systems and the corresponding website.
We hope this table quickly enables an experimentalist to target the freely available software tools that they can use, depending on their mass spectrometry set-up, particular quantitative method and chosen/compatible search engine. It highlights some of the considerations that must be made when selecting tools and approaches to use. What mass spectrometry platforms are available and which is the most appropriate in terms of the experimental system in question? This is clearly the primary consideration, but then the users are often faced with the challenge of handling the data themselves. Although there are several excellent commercial products developed by the mass spectrometry vendors themselves, such as ProteinPilot™ from ABI, ProteinScape™ from Bruker, BioWorks™ and Sieve™ from Thermo, and ProteinLynx Global Server™ from Waters, they may not be available to the experimentalists. While these packages cover a wide range of labelled and label-free quantitation techniques, the use of each is limited to the data of the individual mass spectrometer manufacturer. Likewise, some groups have decided to develop their own software, such as the MSQuant package from the Mann group [27], or prefer to work with different search engines than those normally supplied by vendors (e.g. I-Tracker [24] was developed to work specifically with Mascot). This latter point is important, since many groups have preferences for particular database search engines, and the actual peptide identifications and attendant quality control or scores assigned will reflect on the peptides used for calculating quantitative ratios. Many of the early tools were designed principally for Sequest [32]-[35],[40], although now many additionally support the Mascot search engine [24][27]. Additionally, with the exception of Multi-Q [38] there is currently no freely available quantitative proteomic software which supports the open source search engine X!Tandem [44].
An additional consideration is that different tools expect different formats, and whilst moves are afoot to standardise mass spectrometry formats [45] (see Implications for Standards), not all existing machines in current labs can export the two current standards (mzData or mzXML) and converters may be required. For these reasons and others the user must choose software that is compatible. Additionally, most labs will need access to local informatics support, virtually an essential component of any serious modern proteomics group, in order to get some tools up and running due to minor incompatibilities with their own installations.
Extracting and calculating quantitative data

In most cases, the ultimate goal of quantitative proteomics is to characterise changes in protein levels in an experimental system, potentially between two or more different proteomic experiments be they temporally, spatially or pathologically different. Alternatively, the process might involve absolute quantitation of precise concentrations in a cellular system, which would have great benefits for groups attempting to take systems approaches to cellular function and dynamics, where an understanding of how proteins interact and protein and metabolite levels vary is desired. Although a comprehensive, multi-state quantitative characterisation of the whole proteome is perhaps some way off, the nature of the calculations required at the protein level is already a challenge. In order to calculate such values, the approaches described here must first calculate peptide level quantitation, and that in itself could be achieved in a variety of ways from ion chromatograms through to detailed sampling of the points on the mass-to-charge scale over several peaks of a given isotopic ion series. In this section, the existing analysis methods for quantitation are decomposed into four common parts which most of the algorithms use, and hence different parts of the various algorithms can be compared in detail. These four elements include stages which relate to the choice of retention time(s) from which the MS scans may be sampled, to methods to estimate the mass-to-charge and intensity of the peptide ion, through to how to calculate the peptide ratio from the one or more MS scans, and finally how to calculate the protein ratio from the peptide ratios and addressing its significance or confidence. In Table 2 we summarize the existing analysis methods using the framework described above for the relevant quantitative tools.
In the table, there are five tools relating to Sequest-based analyses for single isotope and ICAT studies [32]-[35],[40], three relating specifically to iTRAQ™ [24][38][43] and two relating solely to 16O/18O labelling methods [37][41][42]. They were chosen because sufficient details of the algorithm have been made available in the respective publications. Apart from the iTRAQ™ tools, all these methods rely on distinguishing light and heavy peptide peaks, but they differ in the stage and detail of the experimental process in which the data is captured. For example, several of the earlier ICAT-specific tools construct single ion chromatograms relating to specific identifications from which to calculate ratios (e.g. ASAPRatio and RelEx) while others rely on the MS peaks themselves (e.g. Multi-Q). The earlier tools developed for ICAT work addressed the majority of the key problems, which relate to the observation that quantitative information is split between differently labelled peptide species (the heavy and light peptides), and that individual peptides can elute from chromatographic systems at multiple times and, for ESI, will be present in multiple charge states in the mass spectrometer. These tools can deal with single isotope (e.g. 15N), ICAT or SILAC style experiments when presented with appropriately formatted data.
In all cases, the process begins with a MS/MS identification of a peptide via a given search engine, which can be quality controlled with a minimum score or confidence estimate. This is often left to users to decide. Once peptide identifications have been deemed acceptable, the ion intensity in the MS survey scan is used for quantitation. This is clearly the most appropriate signal, since the MS/MS data could be subject to a variety of variable fragmentation processes which are still not clearly understood and are therefore not strictly quantitative. As stated above, the signal in the instrument due to any given peptide will be split across several instrument scans and charge states, and the software tools attempt to reconstruct a single ion chromatogram across these different scans and m/z values. ASAPRatio considers signal integrated over three isotopic peaks, for each isotopic variant and for 4 charge states (+1 to +4). This can be done even if only a single identification is produced in the MS/MS instrument cycles. The theoretical m/z ranges are calculated and the ion signal integrated to produce up to 8 independent single ion chromatograms. Background subtraction and outlier removal are then performed prior to the calculation of an abundance ratio for each peptide. Other approaches have simplified some of these steps, calculating the peptide abundance ratio using a least-squares fit to the points representing the two ion chromatograms (corresponding to the labelled and unlabelled peptides) in the case of RelEx [35] whilst ProRata uses a Principal Components Analysis approach [32]. More recent tools adopt similar approaches to obtain peptide ratios from chromatograms, estimating the relative areas under chromatograms for the labelled/unlabelled peptides. 
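The chromatogram bookkeeping described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the ASAPRatio or RelEx implementation: the function names, the 0.01 m/z tolerance and the label mass shift are hypothetical, and the least-squares step is a plain fit through the origin rather than the published RelEx procedure. It shows how two isotopic variants and four charge states give eight m/z window sets (each covering three isotopic peaks), and how a ratio can be read off as the slope relating heavy to light chromatogram intensities.

```python
PROTON_MASS = 1.007276      # Da
ISOTOPE_SPACING = 1.003355  # Da, approximate 13C - 12C mass difference


def isotope_mz(neutral_mass, charge, isotope_index=0):
    """m/z of the given isotopic peak for a peptide ion of the given charge."""
    return (neutral_mass + charge * PROTON_MASS
            + isotope_index * ISOTOPE_SPACING) / charge


def chromatogram_windows(light_mass, label_shift, charges=(1, 2, 3, 4),
                         n_isotopes=3, tol=0.01):
    """Map (variant, charge) to a list of (low, high) m/z windows.

    label_shift is the heavy-minus-light mass difference in Da
    (an input chosen by the user; hypothetical here).
    """
    windows = {}
    for variant, mass in (("light", light_mass),
                          ("heavy", light_mass + label_shift)):
        for z in charges:
            windows[(variant, z)] = [
                (isotope_mz(mass, z, k) - tol, isotope_mz(mass, z, k) + tol)
                for k in range(n_isotopes)
            ]
    return windows


def least_squares_ratio(light, heavy):
    """Slope r minimising sum((heavy - r * light)^2) over paired scans."""
    denominator = sum(l * l for l in light)
    if denominator == 0:
        raise ValueError("light chromatogram has no signal")
    return sum(l * h for l, h in zip(light, heavy)) / denominator


if __name__ == "__main__":
    windows = chromatogram_windows(light_mass=1000.0, label_shift=6.0201)
    print(len(windows), "chromatogram definitions")
    print(least_squares_ratio([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
```

Background subtraction and outlier removal, which the tools above perform before computing a ratio, are deliberately left out of this sketch.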
Underpinning these estimates are two essential steps – constructing the ion chromatogram from signal intensities, and calculating the points which constitute the “start” and “end” of the ion chromatogram for that particular ion. The first step requires knowledge of exactly where the monoisotopic peak lies in the survey spectrum. This is illustrated in Figure 2.
A similar choice is presented when handling the ion chromatogram (Figure 3).
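As a rough illustration of locating the “start” and “end” of an ion chromatogram, the sketch below walks outwards from the apex scan until the intensity falls to a fraction of the apex value, and then integrates the background-subtracted signal with the trapezoid rule. The threshold rule, the 5% default and the function names are assumptions made for this example; the published tools each use their own boundary and background criteria.

```python
def peak_bounds(intensities, apex, threshold_fraction=0.05):
    """Return (start, end) scan indices around the apex.

    A scan belongs to the peak while its intensity stays above
    threshold_fraction * apex intensity (an assumed, simple criterion).
    """
    cutoff = intensities[apex] * threshold_fraction
    start = apex
    while start > 0 and intensities[start - 1] > cutoff:
        start -= 1
    end = apex
    while end < len(intensities) - 1 and intensities[end + 1] > cutoff:
        end += 1
    return start, end


def peak_area(times, intensities, start, end, background=0.0):
    """Trapezoid-rule area between start and end after background removal."""
    area = 0.0
    for i in range(start, end):
        left = max(intensities[i] - background, 0.0)
        right = max(intensities[i + 1] - background, 0.0)
        area += 0.5 * (left + right) * (times[i + 1] - times[i])
    return area


if __name__ == "__main__":
    times = [float(t) for t in range(9)]
    signal = [0.0, 1.0, 10.0, 50.0, 100.0, 40.0, 5.0, 1.0, 0.0]
    start, end = peak_bounds(signal, apex=4)
    print(start, end, peak_area(times, signal, start, end))
```

The choice of threshold directly changes the integrated area, which is one reason different tools can report different ratios from the same scans.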
Finally, if there is more than one peptide ratio for a particular protein, the individual peptide values must be combined in some fashion and, potentially, the significance of the protein ratio may be estimated. ASAPRatio and RelEx use a weighted mean of the peptide ratios to calculate the protein ratio, using estimated errors for the peptide ratios in the case of ASAPRatio. This has the effect of down-weighting contributions from peptides with smaller ion chromatograms (and presumably, larger relative errors). QUIL simply takes the median value of the set of normalised peptide ratios, fitting the experimental values to a normal distribution. ProRata calculates the protein ratio from an estimated likelihood function and the significance of the protein ratio is also related to the maximum of the likelihood function. ASAPRatio, RelEx and QUIL apply statistical tests to the distribution of the protein ratios to estimate the significance. Similarly, other quality control tests may also be applied to improve the results. Firstly, certain peptides will not be unambiguously assigned to a single protein and may occur in several unrelated proteins [36]. It is therefore prudent to omit these degenerate peptides from the final protein ratio calculation, as they may be subject to error through receiving signal from multiple protein species. This problem is further complicated when dealing with peptides from multiple protein isoforms which may have come from differently spliced gene products from the same genomic locus. The latter issue is not straightforward to handle but all the software tools have approaches to deal with these kinds of problems, either explicitly by ignoring degenerate or shared peptides, or implicitly via some statistical test to remove outliers – when sufficient data points are available to conduct the test. Many of these issues concerned with resolving multi-peptide/protein issues are well reviewed by Nesvizhskii and Aebersold [36].
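The two simplest ways of combining peptide ratios mentioned above, a weighted mean and a median, can be sketched as follows, together with the removal of degenerate (shared) peptides. This is an illustrative sketch only: the inverse-variance weighting is an assumption standing in for the error models of the published tools, and the class, function names and peptide sequences are hypothetical.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class PeptideRatio:
    sequence: str
    ratio: float     # heavy/light (or sample/reference) abundance ratio
    error: float     # estimated standard error of the ratio
    proteins: tuple  # accessions of every protein containing this peptide


def unique_to_protein(peptides, protein):
    """Keep only peptides assigned to this protein and no other."""
    return [p for p in peptides if p.proteins == (protein,)]


def weighted_mean_ratio(peptides):
    """Inverse-variance weighted mean; noisy peptides count for less."""
    weights = [1.0 / (p.error ** 2) for p in peptides]
    total = sum(w * p.ratio for w, p in zip(weights, peptides))
    return total / sum(weights)


def median_ratio(peptides):
    """Median of the peptide ratios, robust to a few outliers."""
    return median(p.ratio for p in peptides)


if __name__ == "__main__":
    observed = [
        PeptideRatio("AAK", 2.0, 0.1, ("P1",)),
        PeptideRatio("CCK", 4.0, 0.1, ("P1",)),
        PeptideRatio("DDK", 9.0, 0.1, ("P1", "P2")),  # shared, so excluded
    ]
    usable = unique_to_protein(observed, "P1")
    print(weighted_mean_ratio(usable), median_ratio(usable))
```

In practice ratios are often combined on a logarithmic scale so that two-fold increases and decreases are treated symmetrically; the linear form is used here only to mirror the wording of the text.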
Additionally, it is also worth noting that poor quality peptide identifications can be removed prior to the calculation of protein ratios using tools from the Trans Proteomic Pipeline [43][46][47]. Further tools have been developed to handle this type of data including the MSQuant software, which is also targeted towards stable isotope labelled samples, and has been widely used in SILAC based studies by the developers [27]. This software has recently undergone its first official release, and can cope with a variety of formats from vendors. The data processing follows the same general principles as the other tools, although it does not include all of the more sophisticated methods for estimation of ion chromatograms or significance values for peptides and proteins. This tool requires the users to have some experience with Microsoft's .NET to install. All the above were designed to work with generic stable isotope labelling techniques. However, recently, some tools have been developed which work specifically with ABI's iTRAQ™ technology, which has gained considerable popularity. These include I-Tracker, Libra and Multi-Q. Unlike the ICAT, SILAC and 15N stable isotope approaches, the quantitative signal is not extracted from the ion chromatogram, as peptides from the different samples will not be readily distinguishable in MS survey scans or chromatograms. Instead, the signal intensity of a set of reporter ions in the MS/MS spectral range from 114 m/z to 117 m/z is used, eliminating the need to collate and process complicated ion chromatogram or MS information. I-Tracker was the first non-commercial tool developed, and allows users to calculate quantitative ratios using the widely used Mascot search engine, also incorporating the corrections for minor impurities in the reporter ions [24]. Multi-Q uses a similar strategy to calculate the peptide ratios, but extends the calculations to estimate protein ratios as well as peptide ratios [38].
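The reporter-ion approach described above can be illustrated with a short sketch that sums MS/MS peak intensity near each of the four reporter positions and expresses the channels relative to one reference channel. It is not the I-Tracker, Libra or Multi-Q code: the nominal reporter m/z values, the 0.2 m/z tolerance and the function names are assumptions for this example, and the isotope impurity corrections applied by those tools are omitted.

```python
# Approximate reporter positions for the four channels (assumed values;
# a real tool would use exact masses and instrument-specific tolerances).
REPORTER_MZ = (114.1, 115.1, 116.1, 117.1)


def reporter_intensities(peaks, tol=0.2):
    """Sum intensity within +/- tol of each reporter m/z.

    peaks is an iterable of (mz, intensity) pairs from one MS/MS spectrum.
    Returns a dict keyed by nominal channel: 114, 115, 116, 117.
    """
    totals = {round(mz): 0.0 for mz in REPORTER_MZ}
    for mz, intensity in peaks:
        for reporter in REPORTER_MZ:
            if abs(mz - reporter) <= tol:
                totals[round(reporter)] += intensity
    return totals


def reporter_ratios(totals, reference=114):
    """Express every channel relative to the chosen reference channel."""
    ref = totals[reference]
    if ref <= 0:
        raise ValueError("reference reporter ion has no signal")
    return {channel: value / ref for channel, value in totals.items()}


if __name__ == "__main__":
    spectrum = [(114.11, 1000.0), (115.11, 2000.0), (116.11, 500.0),
                (117.12, 1000.0), (175.12, 3000.0)]  # last peak: a fragment ion
    print(reporter_ratios(reporter_intensities(spectrum)))
```

Because each identified MS/MS spectrum yields its own set of ratios, peptide-level values from several spectra still need to be combined into a protein ratio afterwards.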
Finally, the Zoomquant [37] tool has been designed to support work with 16O/18O work and uses a high resolution MS scan provided by the Thermo Finnigan Deca Xp Plus LCQ mass spectrometer. It determines the area of each theoretical isotope peak and the peptide ratio is calculated from various weighted combinations of the areas under the peaks. In addition, RAAMS [41][ 42] also works with 18O labelled samples introduced during endopeptidase digestion in heavy water. RAAMS uses regression approaches to calculate the peptide ratio, taking into account the impurities of heavy isotopic material and the exchange rate of heavy and light oxygen atom of peptides. Although the number of published methods for absolute quantitation continues to grow, there are not any widely available software tools designed specifically for dealing with this data as yet. Many of the same problems encountered in relative quantitation are still relevant, and most of the existing software for relative quantification should be easily adapted for absolute quantitative purposes. The principal difference is simply that the absolute concentration of the spiked standard is known. There are now a large variety of tools to choose from when carrying out quantitative proteomics calculations, and the new user is faced with a variety of options, which require many different types of input files and are compatible with different instruments. However, to offset this, the range of freely available conversion utilities is growing, software teams adapt their code to take novel formats, and standard, internationally-supported exchange formats for protein mass spectra are now stabilising. Nevertheless, output formats for quantitative proteomic datasets for submission to repositories which can handle this data are still missing. This is examined in the next section. 
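The point made above, that absolute quantitation differs from relative quantitation mainly in that the amount of spiked standard is known, amounts to one extra multiplication. The sketch below is a hypothetical illustration (the function name and the femtomole unit are assumptions), taking the analyte and standard signals as already integrated peak areas.

```python
def absolute_amount(analyte_signal, standard_signal, standard_amount_fmol):
    """Scale the analyte/standard signal ratio by the known spiked amount."""
    if standard_signal <= 0:
        raise ValueError("standard peptide has no signal")
    return analyte_signal / standard_signal * standard_amount_fmol


if __name__ == "__main__":
    # Analyte peak area twice that of a 50 fmol labelled standard.
    print(absolute_amount(3.0e6, 1.5e6, 50.0))
```

The peptide ratio itself would come from the same chromatogram or peak-area steps used for relative quantitation, which is why existing relative tools are expected to adapt readily.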
Implications for standards

The new technologies for quantifying protein abundance present significant computational challenges, not only in terms of analysis, but also in data capture, storage and exchange. While experimental and analytical methods continue to evolve there is a clear need for data to be made publicly available to allow verification of results and potentially to enable re-analysis with new tools and algorithms. However, at present there is relatively little publicly available quantitative proteomics data. It is likely that this situation will change over the next year or two as database deposition becomes more common and data standards start to become established, which will greatly facilitate data sharing and dissemination. Indeed, journals are now beginning to enforce public deposition of data prior to publication, similar to the situation in genomics [48]. There are three main features that data standards must be able to capture: i) the raw data themselves, such as MS traces; ii) the results of analyses, such as peptide and protein identifications and associated quantification values; and iii) the metadata, such as the protocols (and their parameters) that were used to perform sample processing, mass spectrometry and data analysis. In this area, the Proteomics Standards Initiative (PSI) and the major proteomics research groups have been working to create community accepted data standards to cover each of these facets. There are two independent formats for mass spectrometry data developed as ‘standard’ exchange formats, called mzData (from PSI, http://www.psidev.info/index.php?q=node/80) and mzXML [49] (developed by the Institute for Systems Biology). These are naturally in addition to the large number of formats developed and maintained by the mass spectrometry vendors. Both mzData and mzXML are capable of representing a raw MS trace or a simple peak list, along with details of the instrument used and the instrument parameters.
While converters exist to transform data from mzXML to mzData and vice versa [50], a single standard for MS data would clearly be beneficial and in fact, a merger between the two formats is anticipated in late 2007. A single standard for MS will simplify the process of assembling large data sets and will encourage academic and industrial groups to put significant resources into software development in support of the standard, for instance in data export, visualisation and analysis. These MS formats contain a simple model to define the sample on which MS is performed. In the case of quantitative proteomics, a more complex sample description may be required, for example if different reporters are associated with different samples (which may have been pooled during the process). A temporary solution for describing multiplexed samples in mzData and their storage within the PRIDE database [51] has recently been reported [52]. In the longer term, it is clear that mass spectrometry standards must support the description of rich sample information. PSI is currently developing spML (sample processing markup language, http://www.psidev.info/index.php?q=node/90), which contains models of column chromatography, capillary electrophoresis and other basic separations that may be performed in a proteomics investigation. spML will therefore be capable of describing the pre-MS stages of a proteome workflow, including the sample processing performed in studies such as ICAT, iTRAQ™, SILAC and so on. Similarly, although there is currently no established data standard for representing the results of informatics analysis on MS data, for example to store peptide/protein identifications or abundance values, PSI are active in this area. There are various proprietary data formats output by analysis software and, in many cases, scripts exist to render results, for example as Web pages. 
Such Web-based results are usually not amenable to further analyses without significant bioinformatics support to extract results from the original source files. For this reason, PSI is developing a vendor-independent standard for representing the results of analyses on MS data, called analysisXML (http://www.psidev.info/index.php?q=node/87). It is expected that analysisXML will stabilise towards the end of 2007, and will be capable of representing identifications and quantitative data, and details of the software and parameters used to obtain the results. It is likely that the combination of the choice of software and parameters employed has some effect on the final results, but until large collections of such metadata are available it will be difficult to perform systematic studies to investigate the effects. Indeed, it may even be impractical for a single laboratory to store every individual MS scan, which would be required to reproduce or compare approaches for some of the quantitative methodologies where the signal is captured in the integrated single ion chromatograms for the paired diagnostic peptides. Such a task is, however, being tackled by the Tranche project, which is attempting to use open source peer-to-peer file-sharing technologies to allow the huge volumes of mass spectrometry data to be stored and shared in a distributed fashion (http://www.proteomecommons.org/dev/dfs).

A robust mechanism is required for integrating different file formats, such that there is an unambiguous association between the description of a starting sample (and details of any subsequent pooling of samples), the reporter ions or peaks that correspond only with a specific starting sample, and the quantification values that are calculated for each peptide. The linkage could be facilitated by FuGE [53] (the Functional Genomics Experiment model), which can describe complete experimental workflows and provides a mechanism for integrating pre-existing data files.
Both analysisXML and spML have been created by extending FuGE, and mzData may adopt a FuGE-like structure in the future, which will further ease integration efforts. In summary, there are established standards in this area (mzData and mzXML), and several efforts which are likely to become more prominent over the next year (analysisXML, spML and FuGE). The existing standards alone do not cover sufficient information to make sharing quantitative proteomics data trivial, but it is hoped that the forthcoming standards will become widely accepted, and will simplify the process of sharing, visualising and publishing quantitative proteomics data. However, a detailed reanalysis from the raw data may not always be possible, or indeed practical, in terms of the data storage requirements.

Summary

The field of quantitative proteomics has moved from a handful of methods to tens in the space of a few years. These technologies have used innovative analytical chemistry, developed by a few laboratories around the world, which have also developed their own software tools to support them, quite reasonably based around the instrumental set-up they have available. More recently, however, the range of available software tools for quantitative proteomic data analysis has grown further, with third-party developers entering the fray and increasing the choice for user groups. Selection of appropriate software is most likely to be based on fit with instrumentation and the favoured method for quantitation, which itself may be prescribed by the type of work required (in vitro vs. in vivo labelling, etc.). However, best practice in selecting peaks, calculating chromatograms, peptide and protein ratios, and estimating significance in these values is still not entirely clear. Further work and experience will help define the best approaches needed, along with better data standards for capture and storage of the data.
The latter are acutely needed in proteomics to enable improved data sharing and comparison, and to help quantitative proteomics to become a standard technique in the laboratory.

Acknowledgements

The authors would like to thank the BBSRC and EPSRC for funding from several grants, including ExGen grant EGM17685 to KLW and SJH, BBSB17204 (ISPIDER) to JAS and SJH, Pedro (BBSB12407) to ARJ and SJH, and the BBSRC-funded Manchester Centre for Integrative Systems Biology, which supports NS.

References

1. Fröhlich T, Arnold GJ. J. Neural Transm. 2006;113:973–994.
2. Burlingame AL, Boyd RK, Gaskell SJ. Anal. Chem. 1998;70:647R–716R.
3. Mawuenyega KG, Kaji H, Yamauchi Y, Shinkawa T, et al. J. Proteome Res. 2003;2:23–35.
4. Florens L, Washburn MP, Raine JD, Anthony RM, et al. Nature. 2002;419:520–526.
5. Lasonder E, Ishihama Y, Andersen JS, Vermunt AM, et al. Nature. 2002;419:537–542.
6. Peng J, Elias JE, Thoreen CC, Licklider LL, et al. J. Proteome Res. 2003;2:43–50.
7. Ong SE, Blagoev B, Kratchmarova I, Kristensen DB, et al. Mol. Cell Proteomics. 2002;1:376–386.
8. Oda Y, Huang K, Cross FR, Cowburn D, et al. Proc. Natl. Acad. Sci. U.S.A. 1999;96:6591–6596.
9. Conrads TP, Alving K, Veenstra TD, Belov ME, et al. Anal. Chem. 2001;73:2132–2139.
10. Washburn MP, Koller A, Oshiro G, Ulaszek RR, et al. Proc. Natl. Acad. Sci. U.S.A. 2003;100:3107–3112.
11. Gygi SP, Rist B, Gerber SA, Turecek F, et al. Nat. Biotechnol. 1999;17:994–999.
12. Miyagi M, Sekhar Rao KC. Mass Spec. Rev. 2006;26:121–136.
13. Ross PL, Huang YN, Marchese JN, Williamson B, et al. Mol. Cell Proteomics. 2004;3:1154–1169.
14. Old WM, Meyer-Arendt K, Aveline-Wolf L, Pierce KG, et al. Mol. Cell Proteomics. 2005;4:1487–1502.
15. Liu H, Sadygov RG, Yates JR III. Anal. Chem. 2004;76:4193–4201.
16. Gao J, Friedrichs MS, Dongre AR, Opiteck GJ. J. Am. Soc. Mass Spectrom. 2005;16:1231–1238.
17. Zybailov B, Coleman MK, Florens L, Washburn MP. Anal. Chem. 2005;77:6218–6224.
18. Lu P, Vogel C, Wang R, Yao X, et al. Nat. Biotechnol. 2007;25:117–124.
19. Gerber SA, Rush J, Stemman O, Kirschner MW, Gygi SP. Proc. Natl. Acad. Sci. U.S.A. 2003;100:6940–6945.
20. Pratt JM, Pratt DM, Doherty MK, Rivers J, et al. Nat. Protocols. 2006;1:1029–1043.
21. Beynon RJ, Doherty MK, Pratt JM, Gaskell SJ. Nat. Methods. 2006;2:587–589.
22. Barr JR, Maggio VL, Patterson DG, Cooper GR, et al. Clin. Chem. 1996;(10):1676–1682.
23. Anderson NL, Anderson NG, Haines LR, Hardie DB, et al. J. Proteome Res. 2004;3:235–244.
24. Shadforth IP, Dunkley TP, Lilley KS, Bessant C. BMC Genomics. 2005;6:145.
25. Leptos KC, Sarracino DA, Jaffe JD, Krastins B, et al. Proteomics. 2006;6:1770–1782.
26. Lindell D, Jaffe JD, Johnson ZI, Church GM, et al. Nature. 2005;438:86–89.
27. Schulze WX, Mann M. J. Biol. Chem. 2004;279:10756–10764.
28. Andersen JS, Wilkinson CJ, Mayor T, Mortensen P, et al. Nature. 2003;426:570–574.
29. Ong SE, Blagoev B, Kratchmarova I, Kristensen DB, et al. Mol. Cell Proteomics. 2002;1:376–386.
30. Ong SE, Kratchmarova I, Mann M. J. Proteome Res. 2003;2:173–181.
31. Blagoev M, Ong SE, Kratchmarova I, Mann M. Nat. Biotechnol. 2004;22:1139–1145.
32. Pan C, Kora G, McDonald W, Tabb D, et al. Anal. Chem. 2006;78:7121–7131.
33. Han DK, Eng J, Zhou H, Aebersold R. Nat. Biotechnol. 2001;19:946–951.
34. Li XJ, Zhang H, Ranish JR, Aebersold R. Anal. Chem. 2003;75:6648–6657.
35. MacCoss MJ, Wu CC, Liu H, Sadygov R, et al. Anal. Chem. 2003;75:6912–6921.
36. Nesvizhskii AI, Aebersold R. Mol. Cell Proteomics. 2005;4:1419–1440.
37. Halligan BD, Slyper RY, Twigger SN, Hicks W. J. Am. Soc. Mass Spectrom. 2005;16:302–306.
38. Lin WT, Hung WN, Yian YH, Wu KP, et al. J. Proteome Res. 2006;5:2328–2338.
39. Shinkawa T, Taoka M, Yamauchi Y, Ichimura T, et al. J. Proteome Res. 2005;4:1826–1831.
40. Wang G, Wu WW, Pisitkun T, Hoffert JD, et al. Anal. Chem. 2006;78:5752–5761.
41. Eckel-Passow JE, Oberg AL, Therneau TM, Mason CJ, et al. Bioinformatics. 2006;22:2739–2745.
42. Mason CJ, Therneau TM, Eckel-Passow JE, Johnson KL, et al. Mol. Cell Proteomics. 2006.
43. Keller A, Eng J, Zhang N, Li XJ, Aebersold R. Mol. Syst. Biol. 2005;1:0017.
44. Craig R, Beavis RC. Bioinformatics. 2004;20:1466–1467.
45. Orchard S, Hermjakob H, Apweiler R, et al. Proteomics. 2003;3:1374–1376.
46. Keller A, Nesvizhskii A, Kolker E, Aebersold R. Anal. Chem. 2002;74:5383–5392.
47. Nesvizhskii AI, Keller A, Kolker E, Aebersold R. Anal. Chem. 2003;75:4646–4658.
48. Carr S, Aebersold A, Baldwin M, Burlingame A, et al. Mol. Cell Proteomics. 2004;3:531–533.
49. Pedrioli PG, Eng JK, Hubley R, Vogelzang M, et al. Nat. Biotechnol. 2004;22:1459–1466.
50. Falkner JA, Falkner JW, Andrews PC. Bioinformatics. 2007;23:262–263.
51. Jones P, Cote RG, Martens L, Quinn AF, et al. Nucl. Acids Res. 2006;34:D659–D663.
52. Siepen JA, Swainston N, Jones AR, Hart SR, et al. Proteome Sci. 2007;5:4.
53. Jones AR, Miller M, Aebersold R, Apweiler R, et al. Nat. Biotechnol. 2007 in community consultation.