run X!Tandem without TPP

CPAS Forum (Inactive)
run X!Tandem without TPP tvaisar  2007-10-03 09:35
Status: Closed
 
Guys,

is it possible to run an X!Tandem search (with k-score) but without the TPP processing of the search results? I have a very specific sample which does not seem to like the TPP. I am seeing big differences between running standalone X!Tandem and running the same mzXML file through the CPAS pipeline. Some of it could be parameters, but I'd like to eliminate the TPP first.

Thanks,

Tomas
 
 
brendanx responded:  2007-10-03 10:15
CPAS does leave the original X!Tandem output in a file named <basename>.xtan.xml. Perhaps this would help with your analysis. I would check your parameters, if you are seeing radically different "expect" values for sequences across 2 runs (1 from CPAS, and 1 manual). Note that raw X!Tandem output files contain the full set of parameters used (default + input). Looking at these is where I would start.
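The parameter comparison suggested above can be sketched in Python. This is only a rough illustration: it assumes that both output files record their parameters as <note type="input" label="..."> elements, as in the snippets quoted in this thread, and the function names are made up for the example.

```python
import xml.etree.ElementTree as ET


def read_params(xml_text):
    """Collect every <note type="input" label="..."> value from X!Tandem XML text."""
    root = ET.fromstring(xml_text)
    params = {}
    for elem in root.iter():
        # Ignore any namespace prefix so the lookup also works on namespaced files.
        if elem.tag.split("}")[-1] != "note":
            continue
        if elem.get("type") == "input" and elem.get("label"):
            params[elem.get("label")] = (elem.text or "").strip()
    return params


def diff_params(a, b):
    """Return {label: (value_in_a, value_in_b)} for every parameter that differs."""
    diffs = {}
    for label in sorted(set(a) | set(b)):
        if a.get(label) != b.get(label):
            diffs[label] = (a.get(label), b.get(label))
    return diffs
```

Reading the standalone output and the CPAS <basename>.xtan.xml into read_params and passing both dictionaries to diff_params lists only the labels whose values differ; a value of None means the parameter is absent from that run.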

It is not possible to load raw X! Tandem results into CPAS. Loading raw search output formats for each search engine would be a lot of work. We chose to standardize on loading pepXML, and its successor when it arrives. So, you must at least run X! Tandem and then Tandem2XML to get data into CPAS. At this time there is no parameter to allow turning off PeptideProphet and ProteinProphet. If you use the automated pipeline in CPAS, the Prophet programs will run on the data, but you need not run them manually. You can still load a pepXML file into CPAS that has no Prophet data.

If this is important for UW, we can certainly add a parameter to turn off running the Prophet programs. Although, I must admit I don't see this helping a great deal in the analysis you describe, since I believe these programs are simply additive, and do not change values already calculated by X!Tandem-Tandem2XML.

--Brendan
 
tvaisar responded:  2007-10-03 10:34
Hi Brendan,

thanks for the response - what I am actually seeing is a radically different number of peptides identified. My thinking was that having a low number of good matches, which is the case with my sample, results in bogus probabilities and then in filtering out any good matches. But I could be wrong. I guess if I remove the p=0.05 cut-off then I should see everything. I will give it a shot. Will also double check the parameters. These are the samples with an N-terminal modification on every peptide and on every lysine residue, so I needed to be able to add it; XX@[ is the native X!Tandem notation for it.

Will let you know how it turns out.

Tomas
 
brendanx responded:  2007-10-03 10:42
To remove Prophet filtering you would add the parameter:

<note type="input" label="pipeline prophet, min probability">0</note>

https://www.labkey.org/Wiki/home/Documentation/page.view?name=pipelineParams

You do need an updated Tandem2XML to handle the X!Tandem to pepXML conversion correctly for this, as we have discussed before. Tandem2XML versions including and after 3.2.2 have all known terminal modification bugs fixed.

Good luck.

--Brendan
 
tvaisar responded:  2007-10-03 12:10
Brendan,

where can I get the Tandem2XML.exe of the version you refer to? The one with CPAS 2.2 is older, and the one you sent me before is older as well. Short of downloading the new TPP (which I heard from Brian has some significant issues), I am not sure how to get it.

Thanks,

Tomas
 
brendanx responded:  2007-10-03 12:53
 
tvaisar responded:  2007-10-03 15:26
Hi Brendan,

so I tried to mimic the standalone Xtandem params as closely as I could. But in the end I got an error when processing the xinteract step:

03 Oct 2007 15:02:08,984 INFO : Tandem2XML output
03 Oct 2007 15:02:08,984 INFO : =======================================
03 Oct 2007 15:02:08,984 INFO : running: Tandem2XML 20070928_ApoA1_Cathepsin_FTCID_01.xtan.xml 20070928_ApoA1_Cathepsin_FTCID_01-raw.pep.xml
03 Oct 2007 15:02:09,140 INFO : xinteract output
03 Oct 2007 15:02:09,156 INFO : =======================================
03 Oct 2007 15:02:09,156 INFO : running: xinteract -Opt -nR -x20 -p0 -X-n[,4 -nK,4 -F100 "-dC:\Documents and Settings\Tomas\My Documents\NTerm\NTermHDL\CathepsinS\SLU\ApoAI070928" -N20070928_ApoA1_Cathepsin_FTCID_01.pep.xml 20070928_ApoA1_Cathepsin_FTCID_01-raw.pep.xml
03 Oct 2007 15:02:09,203 INFO :
03 Oct 2007 15:02:09,203 INFO : xinteract (TPP v3.0 SQUALL rev.2, Build 200705070941(Win32))
03 Oct 2007 15:02:09,203 INFO : error, cannot parse input file and
03 Oct 2007 15:02:09,203 ERROR: Failed running xinteract.

First - I suspect that the two modifications for the Xpress might have caused it, but then how can you run the Xpress if you have a label at two sites?

Is there a way to recover the search results and rerun just the xinteract? Or do I have to run the whole search again?

Thanks Tomas

P.S. I attach the Xtandem.xml for your reference.
 
brendanx responded:  2007-10-03 16:07
Phew! Your tandem.xml sure has a lot to look at. One issue certainly might be the line:

  <note label="pipeline quantitation, residue label mass" type="input">4@[,4@K</note>

First you would need to have something matching this in the line:

  <note label="residue, potential modification mass" type="input">15.99@M,(109.048119-105.021464)@[,(109.048119-105.021464)@K</note>

I am guessing you want to change these two lines to:

  <note label="residue, potential modification mass" type="input">15.99@M,4.026655@[,4.026655@K</note>
  <note label="pipeline quantitation, residue label mass" type="input">4.026655@n,4.026655@K</note>

The CPAS pipeline, unfortunately, does not yet convert 4@[ to anything XPress will recognize. You have to use 4@n as a work-around.

X!Tandem may support the expression format you are using, but Tandem2XML certainly does not. X!Tandem sure does support a lot of different ways for specifying modifications. Not fun to try to cover them all in Tandem2XML.
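The rewrite described above (replace each mass expression with a literal number, and use n instead of [ in the quantitation label) can be sketched in Python. This is a minimal illustration, not part of CPAS or Tandem2XML: it assumes the only expression form is a parenthesized difference of two numbers, as in this tandem.xml, and all function names are hypothetical.

```python
import re
from decimal import Decimal

# Matches a parenthesized difference such as "(109.048119-105.021464)".
_DIFF = re.compile(r"^\(\s*([0-9.]+)\s*-\s*([0-9.]+)\s*\)$")


def literal_mass(mass):
    """Evaluate "(a-b)" exactly with Decimal; return plain numbers unchanged."""
    match = _DIFF.match(mass.strip())
    if match is None:
        return mass.strip()
    return str(Decimal(match.group(1)) - Decimal(match.group(2)))


def normalize_mods(spec):
    """Turn "15.99@M,(a-b)@[" into a list that uses only literal masses."""
    items = []
    for item in spec.split(","):
        mass, site = item.rsplit("@", 1)
        items.append(f"{literal_mass(mass)}@{site.strip()}")
    return ",".join(items)


def to_quantitation_labels(spec):
    """Apply the work-around from this post: write the N-terminus as n, not [."""
    items = []
    for item in normalize_mods(spec).split(","):
        mass, site = item.rsplit("@", 1)
        items.append(f"{mass}@{'n' if site == '[' else site}")
    return ",".join(items)
```

Decimal is used rather than float so that 109.048119 - 105.021464 comes out as exactly 4.026655, with no binary rounding noise in the parameter file.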

Those look to be your biggest issues in this case, but there are some other strange parameters:
- Cut default parameters
- Cut taxonomy
- Cut protein, cleavage [N|C]-terminal mass change (defaults are better)
- Really want a ions?
- Lots of conditioning parameters, if you are going to specify no conditioning

Hope the XPress thoughts help.

--Brendan
 
tvaisar responded:  2007-10-03 16:19
Thanks Brendan,

I'll give it a shot. I am still getting oriented in the tandem.xml and all the parameters and their meaning. That's why I pretty much used full tandem.xml parameters to override defaults.
Should I use the 4@n (or 105.021@n) throughout for the modification at the N-terminus of a peptide?

The conditioning is another terra incognita for me so sorry for having them there.

Will post here how it worked out.

Tomas
 
jeng responded:  2007-10-03 18:02
Tomas,

I just finally read through this thread and I'm pretty sure what Brendan wrote above is correct. For Xpress quantitation, you want to use

  <note label="pipeline quantitation, residue label mass" type="input">4.026655@n,4.026655@K</note>

assuming the mass difference between your 'light' and 'heavy' isotopes is 4.027. This is assuming that you have run the Tandem search with a static modification ("residue, modification mass") of 105.021464 on both the n-terminus and lysine, as well as a variable modification ("residue, potential modification mass") of 4.026655 on both the n-terminus and lysine.

Hope that makes sense (and I hope I got it right).

- Jimmy
 
tvaisar responded:  2007-10-04 12:31
OK,

I made some progress here - at least I can get xinteract to run. I need to run it from the command line in the work directory (../xtandem/../xxxxx.work), and it actually runs if I correct the command line problems. Specifically, in the interpretation of the Xpress parameters the modification at two residues was not put into the command line correctly: a space was added between -nn,4.02 and -nK,4.02 and then also before -F100. You can see it in the log file lines I put into the previous post.
Then I looked into the xxx.xtan.xml, raw-pep.xml and the processed pep.xml files, and in all cases the modifications seem to be inserted there properly (at least as far as I can tell). However, when I upload the result into CPAS, the modification at the N-terminus does not show up and is not considered when you look at the MSMS spectrum view (and matching of peaks). Besides, all PeptideProphet probabilities are bogus (negative: -2, -3), which I suspect is because there are very few positive identifications (which is expected).
I attach the 3 "pep" xml files and a screenshot of the xinteract run.

Also the list of peptides differs significantly from a run on standalone XTandem (as I mentioned earlier).

Not sure where to go from here. I am not concerned about the probabilities and would be happy with k-score or E-value, what I am after is easy comparison of multiple samples and relative quantitation (Xpress).

Any suggestions would be welcome.

Thanks,

Tomas
 
jeng responded:  2007-10-04 13:55
Not quite sure what kind of suggestions you're looking for. Given that the PeptideProphet probabilities are bogus, I would filter by e-value. Just keep in mind that there's no one magical score, and corresponding score cutoff, that gives a black and white delineation of correct and incorrect identifications. So keep that in mind when you're doing your downstream analysis (and hopefully there's enough replicate data so that you're not relying on a small sample size to come to whatever conclusions you want these data to lead to).
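Filtering by e-value outside of CPAS could look roughly like the Python sketch below. It assumes the pepXML stores the X!Tandem expectation value as a search_score element named "expect" inside each search_hit; the function names are made up for the illustration, and the cutoff is whatever you decide to use, not a recommended value.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip an XML namespace prefix such as "{http://...}search_hit"."""
    return tag.rsplit("}", 1)[-1]


def hits_below_expect(pepxml_text, max_expect):
    """Return (peptide, expect) for every search_hit with expect <= max_expect."""
    root = ET.fromstring(pepxml_text)
    kept = []
    for hit in root.iter():
        if local_name(hit.tag) != "search_hit":
            continue
        for score in hit:
            is_expect = (
                local_name(score.tag) == "search_score"
                and score.get("name") == "expect"
            )
            if is_expect:
                expect = float(score.get("value"))
                if expect <= max_expect:
                    kept.append((hit.get("peptide"), expect))
    return kept
```

As noted above, any single cutoff trades false positives against false negatives, so it is worth checking how the retained list changes as max_expect is varied.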
 
tvaisar responded:  2007-10-04 22:31
A couple of comments - I can get the TPP to process without problems and get reasonable probabilities if I do not use k-score. I can run through the pipeline without error if I include only a single-site modification for the Xpress - once the two are there, it seems that the parameter from the tandem.xml file is not properly interpreted - a space is inserted between the two parameters (-X-n,4.02 -K,4.02). If I run xinteract with Xpress and all the parameters from the command line, it runs fine.

Finally, when I copied all the input parameters from the standalone Xtandem into the CPAS pipeline tandem.xml, I got a very similar result in the CPAS run, with apparently reasonable PeptideProphet probabilities. I will have to compare in detail. However, adding k-scoring changes the results dramatically and the PeptideProphet probabilities are again -2, -3 etc.
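The symptom described here can be illustrated with a short Python sketch. It assumes, going only by what is reported in this thread, that the XPress label options must follow -X as one token with no spaces between them; the function name is hypothetical and this is not the CPAS pipeline code.

```python
def xpress_option(label_spec):
    """Build one -X token from a label list such as "4.026655@n,4.026655@K".

    Each "mass@site" item becomes "-n<site>,<mass>", and the pieces are joined
    without spaces so the shell passes them to xinteract as a single argument.
    """
    parts = []
    for item in label_spec.split(","):
        mass, site = item.strip().split("@")
        parts.append(f"-n{site},{mass}")
    return "-X" + "".join(parts)
```

Passing the result as a single element of an argument list (rather than splicing it into a command string) avoids any later step re-splitting it on whitespace.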

Tomas
 
brendanx responded:  2007-10-05 08:51
Okay, sounds like there is a bug in the mini-pipeline generating the command line for xinteract with complex XPress parameters. Sorry about that. I'll have a look, and maybe try to fix the handling of [ for terminal modifications as well.

I have opened a bug in our issues list to track this:

https://www.labkey.org/issues/home/developer/issues/details.view?issueId=4258

Hopefully I can give you a patch to try early next week.

--Brendan
 
tvaisar responded:  2007-10-05 09:27
Thanks Brendan,

for the other issue the k-score - I have a feeling I must be putting in some weird parameters, which confuse it and make the TPP generate those weird peptide probabilities. Could you refresh my memory how to specify use of the k-score?

What I put in there was:

   <note label="scoring, algorithm" type="input">k-score</note>
   <note label="spectrum, use conditioning" type="input">no</note>
   <note label="scoring, minimum ion count" type="input">1</note>

Could it interfere with some other parameters?

Thanks,

Tomas
 
brendanx responded:  2007-10-05 09:49
The -1, -2, -3 probabilities just indicate that PeptideProphet was not able to find an acceptable model for the corresponding charge (1, 2, 3). It has been our experience that the X! Tandem native scoring does better at creating a distribution that can be modeled with small samples. We use it for our demo example of CPAS, and find that it demos better than k-score for the miniature sample we use.
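That convention can be checked mechanically. The Python sketch below takes a list of PeptideProphet probabilities and reports which charge states had no model, based purely on the rule just described (a negative whole number -N means no model for charge N); the function names are made up for the illustration.

```python
def unmodeled_charge(probability):
    """Return the charge state flagged by a negative whole-number probability."""
    if probability < 0 and float(probability).is_integer():
        return int(-probability)
    return None


def charges_without_model(probabilities):
    """Return the sorted charge states for which no model was found."""
    flagged = {unmodeled_charge(p) for p in probabilities}
    flagged.discard(None)
    return sorted(flagged)
```

If the returned list covers every charge state present in the run, none of the probabilities are usable and another score has to be used for filtering.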

Using larger test data sets, however, it really looks like k-score does significantly better... but if you can't get PeptideProphet to model your data with it, then maybe X! Tandem native is a better choice?

--Brendan
 
tvaisar responded:  2007-10-05 09:58
Thanks Brendan,

I will stick to native Xtandem. I will have to do quite a bit of manual curation anyhow. What about Mascot results? I know I would have to run those manually, but then I should be able to upload them into CPAS as well. Correct?

Tomas
 
brendanx responded:  2007-10-05 10:11
If you have a Mascot server set up, you can connect CPAS to it, and use our automated pipeline.

https://www.labkey.org/Wiki/home/Documentation/page.view?name=configMascot

But, if you run these manually, and end up with a Mascot pepXML file, you should be able to load it. You should also be able to load a Mascot .dat file through the pipeline. I think it will convert to pepXML for you, but not run any other tools.

Keep us posted on your progress.

--Brendan