Examples 4, 5 & 6: Describe LCMS2 Experiments
2024-05-16
Examples 4 and 5 are closer to real-world use. They describe an MS2 analysis that will be loaded into LabKey Server. Both use the file Example4.mzXML in the XarTutorial directory. This file is the output of an LCMS2 run, which started with a physical sample and involved some sample preparation steps. The mzXML file is also the starting input to a peptide search using X!Tandem. The Data Pipeline initiates the search, which produces a file named Example4.pep.xml. When loaded into the database, the pep.xml file becomes an MS2 Run with its associated pages for displaying and filtering the list of peptides and proteins found in the sample. It is sometimes useful to treat the steps leading up to the mzXML file as a separate experiment run from the peptide search analysis of that file, especially if multiple searches are run on the same mzXML file. The Data Pipeline follows this approach.
To load both experiment runs, follow these steps.
The peptide search process may take a minute or so. When it completes, there should be a new experiment named “Default experiment for folder”. Clicking the experiment name should show two runs belonging to it. When graphed, these two runs look like the following:
Example 4 Run (MS2) Summary View
XarTutorial/Example4 (Default) Summary View
The connection between the two runs is the Example4.mzXML file. It is the output of the run described by Example4.xar.xml, and it is the input to a search run whose xar.xml is generated by the Data Pipeline and named XarTutorial\xtandem\Default\Example4.search.xar.xml. LabKey Server knows these two experiment runs are linked because the marked output of the first run is identified as a starting input to the second run.
The file Example4.mzXML is represented in the xar object model as a Data object with a DataFileUrl property containing the path to the file. Since both runs refer to the same physical file, only one Data object should be created. The ${AutoFileLSID} substitution template serves this purpose. ${AutoFileLSID} must be used in conjunction with a DataFileUrl value that gives a path to a file relative to the xar.xml file’s directory. At load time, the LabKey Server loader checks whether an existing Data object points to that same file. If one exists, that object’s LSID is substituted for the template. If none exists, the loader creates a new Data object with a unique LSID. Sharing the same LSID between the two runs allows LabKey Server to show the linkage between them, as in Figure 4.
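The reuse-or-create behavior described for ${AutoFileLSID} can be illustrated with a minimal Python sketch. This is not LabKey Server code: the DataRegistry class, its method name, and the LSID format are hypothetical, and the file paths are assumed to be already resolved to a common form.

```python
import itertools


class DataRegistry:
    """Hypothetical stand-in for the server's store of Data objects."""

    def __init__(self):
        self._lsid_by_file = {}
        self._counter = itertools.count(1)

    def resolve_auto_file_lsid(self, file_path):
        # An existing Data object already points to this file: reuse its LSID.
        if file_path in self._lsid_by_file:
            return self._lsid_by_file[file_path]
        # No match: create a new Data object with a unique LSID.
        # The LSID format below is illustrative only.
        lsid = f"urn:lsid:example.invalid:Data:{next(self._counter)}"
        self._lsid_by_file[file_path] = lsid
        return lsid


registry = DataRegistry()
# Loading Example4.xar.xml: the mzXML file is the marked output of the run.
output_lsid = registry.resolve_auto_file_lsid("XarTutorial/Example4.mzXML")
# Loading Example4.search.xar.xml: the same file is a starting input.
input_lsid = registry.resolve_auto_file_lsid("XarTutorial/Example4.mzXML")
# A different file gets its own Data object.
fasta_lsid = registry.resolve_auto_file_lsid("XarTutorial/databases/Bovine_mini.fasta")

print("runs share one Data object:", output_lsid == input_lsid)
print("fasta file is a separate object:", output_lsid != fasta_lsid)
```

Keying the lookup on the file rather than on the run is what lets two independently loaded xar.xml files end up pointing at the same Data object.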
Example4.xar.xml
The OutputDataLSIDTemplate parameter of the step that produces the mzXML file uses the ${AutoFileLSID} template. A second parameter, OutputDataFileTemplate, gives the path to the file relative to the xar.xml file’s directory (in this case the file is in the same directory).
<exp:Protocol rdf:about="${FolderLSIDBase}:ConvertToMzXML">
  <exp:Name>Convert to mzXML</exp:Name>
  <exp:ApplicationType>ProtocolApplication</exp:ApplicationType>
  <exp:MaxInputMaterialPerInstance>0</exp:MaxInputMaterialPerInstance>
  <exp:MaxInputDataPerInstance>1</exp:MaxInputDataPerInstance>
  <exp:OutputMaterialPerInstance>0</exp:OutputMaterialPerInstance>
  <exp:OutputDataPerInstance>1</exp:OutputDataPerInstance>
  <exp:OutputDataType>Data</exp:OutputDataType>
  <exp:ParameterDeclarations>
    <exp:SimpleVal Name="ApplicationLSIDTemplate" OntologyEntryURI="terms.fhcrc.org#XarTemplate.ApplicationLSID" ValueType="String">${RunLSIDBase}:${InputLSID.objectid}.DoConvertToMzXML</exp:SimpleVal>
    <exp:SimpleVal Name="ApplicationNameTemplate" OntologyEntryURI="terms.fhcrc.org#XarTemplate.ApplicationName" ValueType="String">Do conversion to MzXML</exp:SimpleVal>
    <exp:SimpleVal Name="OutputDataLSIDTemplate" OntologyEntryURI="terms.fhcrc.org#XarTemplate.OutputDataLSID" ValueType="String">${AutoFileLSID}</exp:SimpleVal>
    <exp:SimpleVal Name="OutputDataFileTemplate" OntologyEntryURI="terms.fhcrc.org#XarTemplate.OutputDataFile" ValueType="String">Example4.mzXML</exp:SimpleVal>
    <exp:SimpleVal Name="OutputDataNameTemplate" OntologyEntryURI="terms.fhcrc.org#XarTemplate.OutputDataName" ValueType="String">MzXML file</exp:SimpleVal>
  </exp:ParameterDeclarations>
</exp:Protocol>
Example4.search.xar.xml
Two of the protocols in the generated xar.xml use the ${AutoFileLSID} template, including the Convert To PepXml step shown. Note that here the OutputDataFileTemplate parameter is declared but has no default value.
<exp:Protocol rdf:about="${FolderLSIDBase}:MS2.ConvertToPepXml">
  <exp:Name>Convert To PepXml</exp:Name>
  <exp:ApplicationType>ProtocolApplication</exp:ApplicationType>
  <exp:MaxInputMaterialPerInstance>0</exp:MaxInputMaterialPerInstance>
  <exp:MaxInputDataPerInstance>1</exp:MaxInputDataPerInstance>
  <exp:OutputMaterialPerInstance>0</exp:OutputMaterialPerInstance>
  <exp:OutputDataPerInstance>1</exp:OutputDataPerInstance>
  <exp:ParameterDeclarations>
    <exp:SimpleVal Name="ApplicationLSIDTemplate" OntologyEntryURI="terms.fhcrc.org#XarTemplate.ApplicationLSID" ValueType="String">${RunLSIDBase}::MS2.ConvertToPepXml</exp:SimpleVal>
    <exp:SimpleVal Name="ApplicationNameTemplate" OntologyEntryURI="terms.fhcrc.org#XarTemplate.ApplicationName" ValueType="String">PepXml/XTandem Search Results</exp:SimpleVal>
    <exp:SimpleVal Name="OutputDataLSIDTemplate" OntologyEntryURI="terms.fhcrc.org#XarTemplate.OutputDataLSID" ValueType="String">${AutoFileLSID}</exp:SimpleVal>
    <exp:SimpleVal Name="OutputDataFileTemplate" OntologyEntryURI="terms.fhcrc.org#XarTemplate.OutputDataFile" ValueType="String"/>
    <exp:SimpleVal Name="OutputDataNameTemplate" OntologyEntryURI="terms.fhcrc.org#XarTemplate.OutputDataName" ValueType="String">PepXml/XTandem Search Results</exp:SimpleVal>
  </exp:ParameterDeclarations>
  <exp:Properties/>
</exp:Protocol>
The StartingInputDefinitions use the ${AutoFileLSID} template. This time, some of the files referred to are in different directories from the xar.xml file. The xar load process turns these relative paths into paths relative to the Pipeline root when checking whether Data objects already point to them.
<exp:StartingInputDefinitions>
  <exp:Data rdf:about="${AutoFileLSID}">
    <exp:Name>Example4.mzXML</exp:Name>
    <exp:CpasType>Data</exp:CpasType>
    <exp:DataFileUrl>../../Example4.mzXML</exp:DataFileUrl>
  </exp:Data>
  <exp:Data rdf:about="${AutoFileLSID}">
    <exp:Name>Tandem Settings</exp:Name>
    <exp:CpasType>Data</exp:CpasType>
    <exp:DataFileUrl>tandem.xml</exp:DataFileUrl>
  </exp:Data>
  <exp:Data rdf:about="${AutoFileLSID}">
    <exp:Name>Bovine_mini.fasta</exp:Name>
    <exp:CpasType>Data</exp:CpasType>
    <exp:DataFileUrl>..\..\databases\Bovine_mini.fasta</exp:DataFileUrl>
  </exp:Data>
</exp:StartingInputDefinitions>
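As an illustration of the path handling described for these starting inputs, the following Python sketch resolves each DataFileUrl against the xar.xml directory and then expresses it relative to a pipeline root. The function name and both directory locations are assumptions made for the example; the actual loader may normalize paths differently.

```python
import posixpath


def to_pipeline_relative(pipeline_root, xar_dir, data_file_url):
    """Resolve a DataFileUrl against the xar.xml directory, then express
    the result relative to the pipeline root (hypothetical helper)."""
    # The sample mixes "/" and "\" separators, so normalize to "/" first.
    url = data_file_url.replace("\\", "/")
    absolute = posixpath.normpath(posixpath.join(xar_dir, url))
    return posixpath.relpath(absolute, pipeline_root)


# Assumed locations, not taken from the tutorial.
PIPELINE_ROOT = "/data/pipeline"
XAR_DIR = "/data/pipeline/XarTutorial/xtandem/Default"

urls = (
    "../../Example4.mzXML",
    "tandem.xml",
    "..\\..\\databases\\Bovine_mini.fasta",
)
resolved = {url: to_pipeline_relative(PIPELINE_ROOT, XAR_DIR, url) for url in urls}

for url, path in resolved.items():
    print(f"{url} -> {path}")
```

Under these assumptions, the mzXML path resolves to the same location that Example4.xar.xml names directly, which is what allows a lookup by file to find the existing Data object.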
The ExperimentLog section of this xar.xml uses the optional CommonParametersApplied element to supply the values for the OutputDataFileTemplate parameters. This element applies the same parameter values to all ProtocolApplications generated for the current action.
<exp:ExperimentLog>
  <exp:ExperimentLogEntry ActionSequenceRef="1"/>
  <exp:ExperimentLogEntry ActionSequenceRef="30">
    <exp:CommonParametersApplied>
      <exp:SimpleVal Name="OutputDataFileTemplate" OntologyEntryURI="terms.fhcrc.org#XarTemplate.OutputDataFile" ValueType="String">Example4.xtan.xml</exp:SimpleVal>
    </exp:CommonParametersApplied>
  </exp:ExperimentLogEntry>
  <exp:ExperimentLogEntry ActionSequenceRef="40">
    <exp:CommonParametersApplied>
      <exp:SimpleVal Name="OutputDataFileTemplate" OntologyEntryURI="terms.fhcrc.org#XarTemplate.OutputDataFile" ValueType="String">Example4.pep.xml</exp:SimpleVal>
    </exp:CommonParametersApplied>
  </exp:ExperimentLogEntry>
  <exp:ExperimentLogEntry ActionSequenceRef="50"/>
</exp:ExperimentLog>
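One way to picture the effect of CommonParametersApplied is as an overlay on the protocol's declared parameters. The Python sketch below models that with plain dictionaries. It is an illustration only: the function name is hypothetical, and it assumes that action 40 is the Convert To PepXml step, as the Example4.pep.xml file name suggests.

```python
# Parameters declared on the Convert To PepXml protocol.
# OutputDataFileTemplate is declared without a default value.
protocol_defaults = {
    "OutputDataLSIDTemplate": "${AutoFileLSID}",
    "OutputDataFileTemplate": None,
    "OutputDataNameTemplate": "PepXml/XTandem Search Results",
}

# Values supplied per action by CommonParametersApplied in the ExperimentLog.
common_parameters_by_action = {
    30: {"OutputDataFileTemplate": "Example4.xtan.xml"},
    40: {"OutputDataFileTemplate": "Example4.pep.xml"},
}


def effective_parameters(defaults, action_sequence):
    """Overlay the action's common parameters on the protocol defaults."""
    merged = dict(defaults)
    merged.update(common_parameters_by_action.get(action_sequence, {}))
    return merged


pep_xml_params = effective_parameters(protocol_defaults, 40)
print(pep_xml_params["OutputDataFileTemplate"])
```

Because the overlay is applied per action, every ProtocolApplication generated for that action sees the same file name, while the protocol definition itself stays free of run-specific values.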
After using the Data Pipeline to generate a pep.xml peptide search result, some users may want to integrate the two separate connected runs of Example 4 into a single run that starts with a sample and ends with the peptide search results. Example 5 is the result of this combination.
Summary View
Details View
The protocols of Example 5 are the union of the two sets of protocols in Example4.xar.xml and Example4.search.xar.xml. A new run protocol becomes the parent of all of the steps.
Note that the ProtocolActionDefinitions section has one unusual addition: the XTandemAnalyze step has both the MS2EndToEndProtocol (first) step and the ConvertToMzXML step as predecessors. This is because it takes three files as inputs: the mzXML file output by step 30, and the tandem.xml and Bovine_mini.fasta files. The latter two files are not produced by any step in the protocol and so must be included in the StartingInputs section. Adding step 1 as a predecessor signals that the XTandemAnalyze step uses StartingInputs.
<exp:ProtocolActionDefinitions>
  <exp:ProtocolActionSet ParentProtocolLSID="${FolderLSIDBase}:MS2EndToEndProtocol">
    <exp:ProtocolAction ChildProtocolLSID="${FolderLSIDBase}:MS2EndToEndProtocol" ActionSequence="1">
      <exp:PredecessorAction ActionSequenceRef="1"/>
    </exp:ProtocolAction>
    <exp:ProtocolAction ChildProtocolLSID="${FolderLSIDBase}:SamplePrep" ActionSequence="10">
      <exp:PredecessorAction ActionSequenceRef="1"/>
    </exp:ProtocolAction>
    <exp:ProtocolAction ChildProtocolLSID="${FolderLSIDBase}:LCMS2" ActionSequence="20">
      <exp:PredecessorAction ActionSequenceRef="10"/>
    </exp:ProtocolAction>
    <exp:ProtocolAction ChildProtocolLSID="${FolderLSIDBase}:ConvertToMzXML" ActionSequence="30">
      <exp:PredecessorAction ActionSequenceRef="20"/>
    </exp:ProtocolAction>
    <exp:ProtocolAction ChildProtocolLSID="${FolderLSIDBase}:XTandemAnalyze" ActionSequence="60">
      <exp:PredecessorAction ActionSequenceRef="1"/>
      <exp:PredecessorAction ActionSequenceRef="30"/>
    </exp:ProtocolAction>
    <exp:ProtocolAction ChildProtocolLSID="${FolderLSIDBase}:ConvertToPepXml" ActionSequence="70">
      <exp:PredecessorAction ActionSequenceRef="60"/>
    </exp:ProtocolAction>
    <exp:ProtocolAction ChildProtocolLSID="${FolderLSIDBase}:MarkRunOutput" ActionSequence="1000">
      <exp:PredecessorAction ActionSequenceRef="70"/>
    </exp:ProtocolAction>
  </exp:ProtocolActionSet>
</exp:ProtocolActionDefinitions>
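To make the predecessor rule concrete, the Python sketch below transcribes the action definitions of Example 5 into a dictionary and reports where each step's inputs come from. The data structure and the input_sources helper are hypothetical; they only model the convention that listing step 1 as a predecessor means "takes StartingInputs".

```python
# ActionSequence -> (child protocol name, predecessor ActionSequenceRefs),
# transcribed from the ProtocolActionDefinitions of Example 5.
actions = {
    1: ("MS2EndToEndProtocol", [1]),
    10: ("SamplePrep", [1]),
    20: ("LCMS2", [10]),
    30: ("ConvertToMzXML", [20]),
    60: ("XTandemAnalyze", [1, 30]),
    70: ("ConvertToPepXml", [60]),
    1000: ("MarkRunOutput", [70]),
}

START = 1  # the run protocol itself, which lists itself as its predecessor


def input_sources(sequence):
    """Describe where a step's inputs come from, based on its predecessors."""
    _, predecessors = actions[sequence]
    sources = []
    for ref in predecessors:
        if ref == START:
            sources.append("StartingInputs")
        else:
            sources.append(f"outputs of {actions[ref][0]}")
    return sources


for seq in sorted(actions):
    if seq != START:
        print(actions[seq][0], "<-", ", ".join(input_sources(seq)))
```

In this model, XTandemAnalyze is the only step that draws on both StartingInputs and the output of an earlier step.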
Some types of MS2 experiments involve combining two related samples into one prior to running LCMS2. The original samples are dyed with different markers so that they can be distinguished. Example 6 demonstrates how to do this in a xar.xml.
Details View
There are two different tagging protocols for the two different dye types.
The PoolingTreatment protocol has a MaxInputMaterialPerInstance of 2 and an OutputMaterialPerInstance of 1.
<exp:Protocol rdf:about="${FolderLSIDBase}:TaggingTreatment.Cy5">
  <exp:Name>Label with Cy5</exp:Name>
  <exp:ProtocolDescription>Tag sample with Amersham CY5 dye</exp:ProtocolDescription>
  …
</exp:Protocol>
<exp:Protocol rdf:about="${FolderLSIDBase}:TaggingTreatment.Cy3">
  <exp:Name>Label with Cy3</exp:Name>
  …
</exp:Protocol>
<exp:Protocol rdf:about="${FolderLSIDBase}:PoolingTreatment">
  <exp:Name>Combine tagged samples</exp:Name>
  <exp:ProtocolDescription/>
  <exp:ApplicationType/>
  <exp:MaxInputMaterialPerInstance>2</exp:MaxInputMaterialPerInstance>
  <exp:MaxInputDataPerInstance>0</exp:MaxInputDataPerInstance>
  <exp:OutputMaterialPerInstance>1</exp:OutputMaterialPerInstance>
  <exp:OutputDataPerInstance>0</exp:OutputDataPerInstance>
  …
</exp:Protocol>
Both tagging steps list the start protocol (ActionSequence 1) as their predecessor, meaning that they take StartingInputs.
The pooling step lists both tagging steps as predecessors.
<exp:ProtocolActionDefinitions>
  <exp:ProtocolActionSet ParentProtocolLSID="${FolderLSIDBase}:Example_6_Protocol">
    <exp:ProtocolAction ChildProtocolLSID="${FolderLSIDBase}:Example_6_Protocol" ActionSequence="1">
      <exp:PredecessorAction ActionSequenceRef="1"/>
    </exp:ProtocolAction>
    <exp:ProtocolAction ChildProtocolLSID="${FolderLSIDBase}:TaggingTreatment.Cy5" ActionSequence="10">
      <exp:PredecessorAction ActionSequenceRef="1"/>
    </exp:ProtocolAction>
    <exp:ProtocolAction ChildProtocolLSID="${FolderLSIDBase}:TaggingTreatment.Cy3" ActionSequence="11">
      <exp:PredecessorAction ActionSequenceRef="1"/>
    </exp:ProtocolAction>
    <exp:ProtocolAction ChildProtocolLSID="${FolderLSIDBase}:PoolingTreatment" ActionSequence="15">
      <exp:PredecessorAction ActionSequenceRef="10"/>
      <exp:PredecessorAction ActionSequenceRef="11"/>
    </exp:ProtocolAction>
    …
The two starting inputs need to be assigned to specific steps so that the xar records which dye was applied to which sample. This xar.xml therefore uses the ApplicationInstanceCollection element of the ExperimentLogEntry to specify which input a step takes. Since there is only one instance of step 10 (or 11), there is one InstanceDetails block in each collection. The InstanceInputs refer to an LSID in the StartingInputDefinitions block. Instance-specific parameters could also be specified in this section.
<exp:StartingInputDefinitions>
  <exp:Material rdf:about="${FolderLSIDBase}:Case">
    <exp:Name>Case</exp:Name>
  </exp:Material>
  <exp:Material rdf:about="${FolderLSIDBase}:Control">
    <exp:Name>Control</exp:Name>
  </exp:Material>
</exp:StartingInputDefinitions>
<exp:ExperimentLog>
  <exp:ExperimentLogEntry ActionSequenceRef="1"/>
  <exp:ExperimentLogEntry ActionSequenceRef="10">
    <exp:ApplicationInstanceCollection>
      <exp:InstanceDetails>
        <exp:InstanceInputs>
          <exp:MaterialLSID>${FolderLSIDBase}:Case</exp:MaterialLSID>
        </exp:InstanceInputs>
      </exp:InstanceDetails>
    </exp:ApplicationInstanceCollection>
  </exp:ExperimentLogEntry>
  <exp:ExperimentLogEntry ActionSequenceRef="11">
    <exp:ApplicationInstanceCollection>
      <exp:InstanceDetails>
        <exp:InstanceInputs>
          <exp:MaterialLSID>${FolderLSIDBase}:Control</exp:MaterialLSID>
        </exp:InstanceInputs>
      </exp:InstanceDetails>
    </exp:ApplicationInstanceCollection>
  </exp:ExperimentLogEntry>
  <exp:ExperimentLogEntry ActionSequenceRef="15"/>
  …
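When writing InstanceInputs by hand, it can help to check that every referenced LSID is declared and that no starting input is left unassigned. The Python sketch below does this for the Example 6 assignments. It is an authoring aid under stated assumptions, not a description of any validation that LabKey Server performs; the check_assignments helper is hypothetical and the LSID templates are left unexpanded.

```python
# Material LSIDs declared in StartingInputDefinitions (templates unexpanded).
starting_inputs = {
    "${FolderLSIDBase}:Case",
    "${FolderLSIDBase}:Control",
}

# ActionSequenceRef -> MaterialLSIDs listed under InstanceInputs.
instance_inputs = {
    10: ["${FolderLSIDBase}:Case"],     # TaggingTreatment.Cy5
    11: ["${FolderLSIDBase}:Control"],  # TaggingTreatment.Cy3
}


def check_assignments(declared, assigned_by_step):
    """Return (undeclared references, declared inputs that no step consumes)."""
    referenced = [lsid for lsids in assigned_by_step.values() for lsid in lsids]
    undeclared = sorted(set(referenced) - declared)
    unused = sorted(declared - set(referenced))
    return undeclared, unused


undeclared, unused = check_assignments(starting_inputs, instance_inputs)
print("undeclared:", undeclared)
print("unused:", unused)
```

Both lists are empty for the assignments shown here: Case goes to the Cy5 step and Control to the Cy3 step, and nothing is left over.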
The file LungAdenocarcinoma.xar.xml is a fully annotated description of an actual study. It uses export format because it includes custom properties attached to run outputs. Properties of generated outputs cannot currently be described using log format.