Example: Use the Enterprise Pipeline for MS2 Searches

This topic has been deprecated. For the previous documentation, click here.

Using the data pipeline with ActiveMQ is not a typical configuration and may require significant customization. If you are interested in using this feature, please contact LabKey to inquire about support options.

This topic covers one example of how a project might use the data pipeline with ActiveMQ for MS2 searches. In this example, we will create a new project and configure a pipeline to use it for X!Tandem peptide searches. The enterprise pipeline is not specific to this use case.

Prerequisites and Assumptions

Assumptions

For this example, we assume:

  • All files (both sample files and result files from searches) will be stored on a Shared File System
  • LabKey Server will mount the Shared File System.
    • Some third party tools may require a Windows installation of LabKey Server, but a LabKey remote server can be deployed on any platform that is supported by LabKey Server itself.
  • Conversion of RAW files to mzXML format will be included in the pipeline processing
    • The remote server running the conversion will mount the Shared File System
  • MS2 pipeline analysis tools (X!Tandem, TPP, etc) can be executed on a remote server
    • Remote servers will mount the Shared File System
  • Use of a Network File System: The LabKey web server and remote servers must be able to mount the following resources:
    • Pipeline directory (location where mzXML, pepXML, etc files are located)
    • Pipeline bin directory (location where third-party tools such as TPP and X!Tandem are located)
  • MS2 analysis tools will be run on a separate server.
  • A version of Java supported by your version of LabKey Server is installed at each location that will be running tasks.
  • You have downloaded or built from source the following files:
    • LabKey Server
    • LabKey Server Enterprise Pipeline Configuration files
Before continuing, complete the steps listed here. If necessary, you will also create the LabKey Tools directory where programs such as the MS2 analysis tools will be installed.
  1. Install LabKey
  2. Set up ActiveMQ
  3. Follow the general set up steps in this topic: Configure the Pipeline with ActiveMQ
  4. (Optional) Conversion Service (convert MS2 output to mzXML). Only required if you plan to convert files to mzXML format in your pipeline
  5. Provide the appropriate configuration files

Example ms2Config.xml

  • Unzip the Enterprise Pipeline Configuration distribution and copy the webserver configuration file to the Pipeline Configuration directory specified in the last step (i.e. <LABKEY_HOME>/config).
  • The configuration file is ms2Config.xml, which specifies:
    • Where MS2 searches will be performed (on a remote server or locally on the web server)
    • Where the Conversion of raw files to mzXML will occur (if required)
    • Which analysis tools will be executed during an MS2 search

Create the LABKEY_TOOLS Directory

Create the <LABKEY_TOOLS> directory on the remote server. It will contain all the files necessary to perform MS2 searches on the remote server. The directory will contain:

  • Required LabKey software and configuration files
  • TPP tools
  • X!Tandem search engine
  • Additional MS2 analysis tools
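
As a minimal sketch, the directory layout referred to throughout this topic could be created on the remote server with a small shell function. The function name make_tools_layout is hypothetical and not part of LabKey; the subdirectories are the ones this topic mentions (labkey/dist, labkey/dist/conf, bin, and external).

```shell
#!/bin/sh
# Hypothetical helper (not part of LabKey): create the <LABKEY_TOOLS> layout
# described in this topic.
# Usage: make_tools_layout /path/to/labkey_tools
make_tools_layout() {
    tools_dir=$1
    if [ -z "$tools_dir" ]; then
        echo "usage: make_tools_layout <LABKEY_TOOLS>" >&2
        return 1
    fi
    # labkey/dist receives the unzipped distributions; dist/conf receives the
    # pipeline configuration distribution.
    mkdir -p "$tools_dir/labkey/dist/conf" || return 1
    # bin holds the MS2 analysis tools; external is the second path this
    # topic asks you to note.
    mkdir -p "$tools_dir/bin" "$tools_dir/external" || return 1
}
```

Calling make_tools_layout with the path you chose for <LABKEY_TOOLS> is safe to repeat, since mkdir -p leaves existing directories untouched.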

Download the Required LabKey Software

  1. Unzip the LabKey Server distribution into the directory <LABKEY_TOOLS>/labkey/dist
  2. Unzip the LabKey Server Pipeline Configuration distribution into the directory <LABKEY_TOOLS>/labkey/dist/conf
NOTE: For the next section you will need to know the paths to the <LABKEY_TOOLS>/labkey directory and the <LABKEY_TOOLS>/external directory on the remote server.

Install the LabKey Software into the <LABKEY_TOOLS> Directory

Copy the following to the <LABKEY_TOOLS>/labkey directory:

  • The directory <LABKEY_TOOLS>/labkey/dist/LabKey<VERSION>/labkeywebapp
  • The directory <LABKEY_TOOLS>/labkey/dist/LabKey<VERSION>/modules
  • The directory <LABKEY_TOOLS>/labkey/dist/LabKey<VERSION>/pipeline-lib
  • The file <LABKEY_TOOLS>/labkey/dist/LabKey<VERSION>/tomcat-lib/labkeyBootstrap-X.Y.jar
Expand all modules in the <LABKEY_TOOLS>/labkey/modules directory by running:
cd <LABKEY_TOOLS>/labkey/
java -jar labkeyBootstrap-X.Y.jar
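
The copy steps above might be scripted as follows. This is a sketch, not a LabKey-provided script: the function name install_labkey_files is hypothetical, and the caller passes the unzipped LabKey<VERSION> directory because the version string is not fixed here. The bootstrap jar is matched with a glob so that the X.Y part of its name need not be hard-coded.

```shell
#!/bin/sh
# Hypothetical helper (not part of LabKey): copy the items listed above from
# the unzipped distribution into <LABKEY_TOOLS>/labkey.
# Usage: install_labkey_files <unzipped LabKey<VERSION> dir> <LABKEY_TOOLS>/labkey
install_labkey_files() {
    dist_dir=$1
    target_dir=$2
    mkdir -p "$target_dir" || return 1
    # Copy the three directories, stopping at the first one that is missing.
    for name in labkeywebapp modules pipeline-lib; do
        if [ ! -d "$dist_dir/$name" ]; then
            echo "missing directory: $dist_dir/$name" >&2
            return 1
        fi
        cp -R "$dist_dir/$name" "$target_dir/" || return 1
    done
    # Copy the bootstrap jar; the glob stands in for the version in its name.
    copied=0
    for jar in "$dist_dir"/tomcat-lib/labkeyBootstrap*.jar; do
        [ -f "$jar" ] || continue
        cp "$jar" "$target_dir/" || return 1
        copied=1
    done
    if [ "$copied" -ne 1 ]; then
        echo "no labkeyBootstrap jar found in $dist_dir/tomcat-lib" >&2
        return 1
    fi
}
```

After the function succeeds, run the java -jar command shown above from the target directory to expand the modules.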

Create the Enterprise Pipeline Configuration Files

There are two configuration files used on the Remote Server:

  • pipelineConfig.xml
  • ms2Config.xml

Install the MS2 Analysis Tools

These tools will be installed in the <LABKEY_TOOLS>/bin directory on the Remote Server.

Test the Configuration

There are a few simple tests that can be performed at this stage to verify that the configuration is correct. These tests focus on ensuring that a remote server can perform an MS2 search.

  1. Can the remote server see the Pipeline Directory and the <LABKEY_TOOLS> directory?
  2. Can the remote server execute X!Tandem?
  3. Can the remote server execute the Java binary?
  4. Can the remote server execute an X!Tandem search against an mzXML file located in the Pipeline Directory?
  5. Can the remote server execute a PeptideProphet search against the resultant pepXML file?
  6. Can the remote server execute the X!Tandem search again, but this time using the LabKey Java code located on the remote server?
Once all these tests are successful, you will have a working Enterprise Pipeline. The next step is to configure a new Project on your LabKey Server and configure the Project's pipeline to use the Enterprise Pipeline.
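
The first three checks could be partly automated with a short script run on the remote server. This is a sketch under assumptions: the function name check_remote_server is hypothetical, and the path to the X!Tandem executable is passed in by the caller because its file name and location depend on your installation. The script only confirms that the directories are readable and that the executables are present; it does not run a search.

```shell
#!/bin/sh
# Hypothetical pre-flight check (not part of LabKey) for a remote server.
# Usage: check_remote_server <pipeline dir> <LABKEY_TOOLS dir> <X!Tandem executable>
check_remote_server() {
    pipeline_dir=$1
    tools_dir=$2
    tandem_bin=$3
    status=0
    # Test 1: both shared directories must be visible and readable.
    for dir in "$pipeline_dir" "$tools_dir"; do
        if [ -d "$dir" ] && [ -r "$dir" ]; then
            echo "OK: can read $dir"
        else
            echo "FAIL: cannot read $dir"
            status=1
        fi
    done
    # Test 2: the X!Tandem binary must exist and be executable.
    if [ -f "$tandem_bin" ] && [ -x "$tandem_bin" ]; then
        echo "OK: $tandem_bin is executable"
    else
        echo "FAIL: $tandem_bin is not executable"
        status=1
    fi
    # Test 3: a java binary must be on the PATH.
    if command -v java >/dev/null 2>&1; then
        echo "OK: java found"
    else
        echo "FAIL: java not found on PATH"
        status=1
    fi
    return $status
}
```

The function returns a non-zero status if any check fails. Tests 4 through 6 still need to be run by hand against a real mzXML file.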

Create a New Project to Test the Enterprise Pipeline

You can skip this step if a project already exists that you would rather use.

  • Log on to your LabKey Server using a Site Admin account
  • Create a new project with the following options:
    • Project Name: PipelineTest
    • Folder Type: MS2
  • Accept the default settings during the remaining wizard panels

For more information see Create a Project or Folder.

Configure the Project to use the Enterprise Pipeline

The following information will be required in order to configure this project to use the Enterprise Pipeline:

  • Pipeline Root Directory

Set Up the Pipeline

  • In the Data Pipeline web part, click Setup.
  • Enter the following information:
    • Path to the desired pipeline root directory on the web server
    • Specific settings and parameters for the relevant sections
  • Click Save.
  • Return to the MS2 Dashboard by clicking the PipelineTest link near the top of the page.

Run the Enterprise Pipeline

To test the Enterprise Pipeline:

  • In the Data Pipeline web part, click Process and Upload Data.
  • Navigate to and select an mzXML file, then click X!Tandem Peptide Search.

Most jobs are configured to run single-threaded. The pipeline assigns work to different thread pools; the two main pools for work that runs on the web server each contain one thread. The pipeline can be configured to run with more threads or additional thread pools if necessary. In many scenarios, however, running multiple imports or some third-party tools in parallel degrades performance compared with running them sequentially.
