Bioinformatics. 2014 Feb 15;30(4):549-58. doi: 10.1093/bioinformatics/btt722. Epub 2013 Dec 15.

A hierarchical statistical modeling approach to analyze proteomic isobaric tag for relative and absolute quantitation data.

Author information

Clinical and Experimental Pharmacology Group, CRUK Manchester Institute, University of Manchester, Manchester M20 4BX, UK; Stem Cell and Leukaemia Proteomics Laboratory, Institute of Cancer Sciences, Manchester Academic Health Science Centre, Wolfson Molecular Imaging Centre, University of Manchester, Manchester M20 3LJ, UK; and Centre for Biostatistics, Institute of Population Health, University of Manchester, Oxford Road, Manchester M13 9PL, UK.



Isobaric tag for relative and absolute quantitation (iTRAQ) is a widely used method in quantitative proteomics. A robust data analysis strategy is required to assess the reliability of protein quantification, i.e. to distinguish changes caused by biological regulation from technical variation, so that differentially expressed proteins can be identified.


Samples were created by mixing 5, 10, 15 and 20 μg of Escherichia coli cell lysate with 100 μg of mouse cell lysate, corresponding to expected relative fold changes of one for mouse proteins and of 0.25 to 4 for E. coli proteins. Relative quantification was carried out using eight-channel isobaric tagging with iTRAQ reagents, and proteins were identified using a TripleTOF 5600 mass spectrometer. The technical variation inherent in this iTRAQ dataset was systematically investigated.
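The expected fold changes quoted above follow directly from the mixing amounts. As a minimal sketch (the function name is illustrative and not taken from the paper), the ratio for every ordered pair of samples can be enumerated as follows:

```python
from itertools import permutations


def expected_fold_changes(amounts_ug):
    """Return the expected ratio for every ordered pair of distinct samples."""
    return [num / den for num, den in permutations(amounts_ug, 2)]


# E. coli lysate amounts (in micrograms) spiked into each sample.
ecoli_ratios = expected_fold_changes([5, 10, 15, 20])

# The mouse background is a constant 100 micrograms in every sample.
mouse_ratios = expected_fold_changes([100, 100, 100, 100])

print(min(ecoli_ratios), max(ecoli_ratios))  # 0.25 4.0
print(set(mouse_ratios))                     # {1.0}
```

The extremes come from the 5 μg versus 20 μg pair (5/20 = 0.25 and 20/5 = 4), while every mouse ratio is 100/100 = 1.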


A hierarchical statistical model was developed that uses quantitative information at the peptide level and the protein level simultaneously to estimate the variation present in each individual peptide and protein. A novel data analysis strategy for iTRAQ, denoted WHATraq, was subsequently proposed, and its performance was evaluated by the proportion of E. coli proteins successfully identified as differentially expressed. Compared with two benchmark data analysis strategies, WHATraq identified at least 62.8% more true positive differentially expressed proteins. When further validated on a biological iTRAQ dataset comprising multiple biological replicates from varied murine cell lines, WHATraq performed consistently and identified 375% more proteins as differentially expressed among the cell lines than the other data analysis strategies.
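To make the evaluation metric concrete, the sketch below shows one way the proportion of spiked-in proteins recovered, and the relative gain of one strategy over another, could be computed. It is an illustration only: the protein identifiers, the flagged sets and both function names are hypothetical, and it does not reproduce the WHATraq model or the paper's data.

```python
def true_positive_count(flagged, spiked_in):
    """Count flagged proteins that are genuinely differentially expressed."""
    return len(set(flagged) & set(spiked_in))


def relative_gain_percent(new_count, baseline_count):
    """Percentage by which new_count exceeds baseline_count."""
    return 100.0 * (new_count - baseline_count) / baseline_count


# Hypothetical identifiers: E1..E5 are spiked-in E. coli proteins (true
# changes); M1 is a mouse protein (expected fold change of one).
spiked_in = {"E1", "E2", "E3", "E4", "E5"}
flagged_by_strategy_a = {"E1", "E2", "E3", "E4", "M1"}  # hypothetical output
flagged_by_strategy_b = {"E1", "E2"}                    # hypothetical output

tp_a = true_positive_count(flagged_by_strategy_a, spiked_in)
tp_b = true_positive_count(flagged_by_strategy_b, spiked_in)

proportion_a = tp_a / len(spiked_in)  # share of E. coli proteins recovered
gain = relative_gain_percent(tp_a, tp_b)
```

Under this reading, a statement such as "62.8% more true positive proteins" corresponds to a relative gain of 62.8 between the true positive counts of two strategies.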

[Indexed for MEDLINE]
