Search MS2 Data Via the Pipeline

2024-03-28

Premium Feature — Available with all Premium Editions of LabKey Server. Learn more or contact LabKey.

The data pipeline searches and processes LC-MS/MS data and displays the results for analysis. In environments where multiple users may be processing large runs, it also handles job queueing and workflow.

The pipeline is used for file upload and processing throughout LabKey Server, not just the MS2 tools. For general information on the LabKey Pipeline and links to how it is used by other features, see Data Processing Pipeline. This topic covers additional MS2-specific information on the pipeline. Please contact LabKey for information about support.

Pipeline Searches

You can use the LabKey Server data pipeline to search and process MS/MS run data stored in an mzXML file. You can also process pepXML files, which store the results of searching an mzXML file for peptides against a protein database. The LabKey Server data pipeline incorporates a number of tools developed as part of the Trans-Proteomic Pipeline (TPP) by the Institute for Systems Biology. The data pipeline includes the following tools:

  • The X! Tandem search engine, which searches tandem mass spectra for peptide sequences. You can configure X! Tandem search parameters from within LabKey Server to specify how the search is run.
  • PeptideProphet, which validates peptide assignments made by the search engine, assigning a probability that each result is correct.
  • ProteinProphet, which validates protein identifications made by the search engine on the basis of peptide assignments.
  • XPRESS, which performs protein quantification.

Using the Pipeline

To experiment with a sample data set, see the Discovery Proteomics Tutorial guide and the proteomics demo project.

Import Existing Analysis Results

When you already have analysis results available, such as those produced by an external analysis or generated on a different server, you can use the pipeline tools to import them for analysis in LabKey Server.

  • Set up the pipeline root to point to your data file location.
  • Select (Admin) > Go To Module > Pipeline.
  • Click Process and Import Data.
  • Navigate to the files you want to import. Only recognized file types (see below) will be listed.
  • Select the desired file(s) using the checkboxes and click Import Data (or use the desired search protocol).
  • In the popup, select the desired protocol and click Import.

LabKey Server supports importing the following MS2 file types:
  • *.pep.xml (MS2 search results, PeptideProphet results)
  • *.prot.xml (ProteinProphet results)
  • *.dat (Mascot search results)
  • *.xar.xml (Experiment archive metadata)
  • *.xar (Compressed experiment archive with metadata)

Note that some result files include links to other files. LabKey Server shows an import action attached to the most general of the files. For example, if a directory contains both results.pep.xml and results.prot.xml, the server offers to import only results.prot.xml, which references results.pep.xml and causes it to be loaded as well.
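
To preview which files in a directory are likely to show an import action, the file-type list and the "most general file" rule can be approximated with a short script. This is only an illustrative sketch, not LabKey Server's implementation: the function names are hypothetical, and it assumes that a .prot.xml file references the .pep.xml file sharing its base name, whereas the server follows the links actually recorded inside the result files.

```python
# Suffixes and descriptions copied from the supported MS2 file type list.
MS2_IMPORT_TYPES = {
    ".pep.xml": "MS2 search results, PeptideProphet results",
    ".prot.xml": "ProteinProphet results",
    ".dat": "Mascot search results",
    ".xar.xml": "Experiment archive metadata",
    ".xar": "Compressed experiment archive with metadata",
}


def recognized_suffix(filename):
    """Return the supported import suffix that filename ends with, or None."""
    name = filename.lower()
    for suffix in MS2_IMPORT_TYPES:
        if name.endswith(suffix):
            return suffix
    return None


def importable_files(filenames):
    """Hypothetical filter: keep recognized files, but skip a .pep.xml file
    when a .prot.xml file with the same base name is present.

    ASSUMPTION: matching base names stand in for the reference that the
    .prot.xml file holds to its .pep.xml file.
    """
    recognized = [f for f in filenames if recognized_suffix(f)]
    prot_bases = {
        f[: -len(".prot.xml")]
        for f in recognized
        if recognized_suffix(f) == ".prot.xml"
    }
    result = []
    for f in recognized:
        is_pep = recognized_suffix(f) == ".pep.xml"
        if is_pep and f[: -len(".pep.xml")] in prot_bases:
            continue  # loaded indirectly through the .prot.xml file
        result.append(f)
    return result


if __name__ == "__main__":
    listing = ["results.pep.xml", "results.prot.xml", "run1.dat", "notes.txt"]
    for name in importable_files(listing):
        print(name, "-", MS2_IMPORT_TYPES[recognized_suffix(name)])
```

With the example listing, results.pep.xml is skipped because results.prot.xml is present, and notes.txt is ignored because its type is not recognized.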