PMC Full-Text Search Results

Items: 4

1.
Figure 2

Figure 2. From: Flexible Data Analysis Pipeline for High-Confidence Proteogenomics.

Proteogenomics pipeline, as displayed in the TOPPAS workflow editor. The different stages of the pipeline are indicated using colored boxes. Additional output nodes, which would be used in practice to capture intermediate results at different stages, have been omitted for simplicity. The input file nodes 1–5 contain the following data: 1, MS2 spectra (mzML files); 2, combined target–decoy sequences (FASTA); 3, contaminant sequences (FASTA); 4, known protein sequences (FASTA); and 5, presumed noncoding sequences (FASTA).

Hendrik Weisser, et al. J Proteome Res. 2016 Dec 2;15(12):4686-4695.
2.
Figure 3

Figure 3. From: Flexible Data Analysis Pipeline for High-Confidence Proteogenomics.

Data retention throughout the pipeline. The bars show the numbers of “data elements” (spectra, PSMs, and peptides) under consideration as these numbers decrease from the start (left) to the end (right) of the proteogenomics pipeline. In detail, the bars represent the following (node numbers refer to the TOPPAS workflow in Figure 2): “MS2 spectra”, input MS2 spectra in the C-HPP testis data set; “Mascot/MS-GF+ PSMs (all)”, spectra that generated PSMs using either search engine; “Mascot/MS-GF+ (1% PEP)”, PSMs after PSM-level filtering (node 15); “Consensus”, PSMs after ConsensusID (node 16); “Filter: contaminants”, PSMs after filtering for contaminants (node 18); “Filter: known proteins (exact)”, PSMs after filtering for exact matches to known proteins (node 20); “Filter: known proteins (approx.)”, PSMs after filtering for approximate matches to known proteins (final set; node 22); and “Novel peptides”, distinct novel peptides identified by the final set of PSMs.

Hendrik Weisser, et al. J Proteome Res. 2016 Dec 2;15(12):4686-4695.
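
The staged filtering and retention counts described in the Figure 3 caption can be illustrated with a minimal Python sketch. This is not the OpenMS/TOPPAS implementation: the `PSM` class, the `run_stages` helper, the toy peptide strings, and the reading of "1% PEP" as a 0.01 cutoff on posterior error probability are all assumptions made for illustration. The ConsensusID and approximate-matching steps are omitted, and peptides are compared by simple set membership rather than against protein sequences.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PSM:
    """Hypothetical peptide-spectrum match record (not an OpenMS type)."""
    spectrum_id: str
    peptide: str
    pep: float  # posterior error probability


def run_stages(psms, stages):
    """Apply each (name, keep) filter in order and record how many PSMs survive."""
    counts = [("input", len(psms))]
    for name, keep in stages:
        psms = [p for p in psms if keep(p)]
        counts.append((name, len(psms)))
    return psms, counts


# Toy data; all sequences below are invented placeholders.
contaminants = {"KERATINPEP"}
known = {"KNOWNPEPTIDE"}

psms = [
    PSM("s1", "KNOWNPEPTIDE", 0.001),
    PSM("s2", "KERATINPEP", 0.002),
    PSM("s3", "NOVELPEPA", 0.005),
    PSM("s4", "NOVELPEPA", 0.008),
    PSM("s5", "NOVELPEPB", 0.20),
]

stages = [
    ("1% PEP", lambda p: p.pep <= 0.01),  # assumed cutoff
    ("Filter: contaminants", lambda p: p.peptide not in contaminants),
    ("Filter: known proteins (exact)", lambda p: p.peptide not in known),
]

final, counts = run_stages(psms, stages)

# Several PSMs can support the same peptide, so count distinct sequences last.
novel_peptides = sorted({p.peptide for p in final})

for name, n in counts:
    print(f"{name}: {n}")
print("Novel peptides:", novel_peptides)
```

Each entry in `counts` corresponds to one bar in a retention plot of this kind, and the final deduplication mirrors the distinction the caption draws between the final set of PSMs and the distinct novel peptides they identify.
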
3.
Figure 1

Figure 1. From: Flexible Data Analysis Pipeline for High-Confidence Proteogenomics.

Schematic overview of the OpenMS proteogenomics workflow. Based on a comprehensive sequence database, tandem mass spectra from large proteomic data sets are searched in a competitive target–decoy approach using two search engines, Mascot and MS-GF+. The search results are rescored using Percolator and filtered in multiple stages according to stringent quality criteria. During this process, starting from a large number of spectra and initial PSMs, the set of retained PSMs is refined further and further until in the end, only high-confidence PSMs from novel peptides remain. These are exported and passed on to genome annotators. In a manual review process, novel peptides and other sources of evidence are integrated, in some cases yielding new insights in the form of novel genome annotations.

Hendrik Weisser, et al. J Proteome Res. 2016 Dec 2;15(12):4686-4695.
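
As background for the target–decoy approach named in the Figure 1 caption, the following Python sketch shows the basic idea of estimating a false discovery rate from decoy hits. It is a generic textbook-style estimate, not the Percolator rescoring used in the workflow; the function name `decoy_fdr`, the score values, and the convention that higher scores are better are assumptions for illustration.

```python
def decoy_fdr(psms, threshold):
    """Estimate FDR as (#decoy hits) / (#target hits) at or above a score threshold.

    psms: iterable of (score, is_decoy) pairs; higher score is assumed better.
    """
    targets = sum(1 for score, is_decoy in psms if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in psms if score >= threshold and is_decoy)
    return decoys / targets if targets else 0.0


# Invented toy search results: (score, is_decoy)
hits = [
    (9.1, False),
    (8.7, False),
    (8.2, False),
    (7.9, True),
    (7.5, False),
    (6.0, True),
]

fdr_at_7 = decoy_fdr(hits, 7.0)  # 1 decoy vs. 4 targets
fdr_at_8 = decoy_fdr(hits, 8.0)  # 0 decoys vs. 3 targets
print(fdr_at_7, fdr_at_8)
```

Raising the threshold trades sensitivity for a lower estimated error rate, which is the same trade-off that the multi-stage filtering in the caption applies more stringently.
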
4.
Figure 4

Figure 4. From: Flexible Data Analysis Pipeline for High-Confidence Proteogenomics.

Reannotation of OTTHUMG00000019887 based on proteogenomic analysis. (A) This locus was present in GENCODE v20 as a lincRNA model, and it is currently categorized in this way by RefSeq (orange model) based on mRNA AK056723.1 (brown model) and given the official HGNC gene symbol LINC00961. Furthermore, an equivalent model was generated and classified as a lncRNA by the RNA-Seq-based PLAR pipeline developed by Hezroni et al. (purple-outlined model). GENCODE have now converted this model to protein coding (UTRs in red; CDS in green) based on proteogenomic evidence in combination with evolutionary conservation. The conserved region is well resolved by PhyloCSF, with this track being taken from genome.ucsc.edu. Peptide [QEASLFTGPVR] is marked (red triangle). (B) The 75 aa human CDS shows conservation in eutherian mammals, although not outside this group based on available genome alignments. “T. Devil” is Tasmanian devil, and “flying fox” is specifically the black flying fox Pteropus alecto.

Hendrik Weisser, et al. J Proteome Res. 2016 Dec 2;15(12):4686-4695.
