Mascot Parameters

CPAS Forum (Inactive)
Mascot Parameters slottad  2007-11-08 08:53
Status: Closed
 
We are having trouble with Mascot searches and setting the parameters using CPAS.

First, the setting for the peptide charge seems to be ignored, whether the setting is:

    <note type="input" label="mascot, peptide_charge">1+, 2+ and 3+</note>

or:

    <note type="input" label="mascot, peptide_charge">3+</note>

The setting returned in the .dat file is always:

    CHARGE=2+ and 3+

Also, for searches with taxonomic restrictions, CPAS ignores this setting:

     <note label="protein, taxon" type="input">some taxon</note>

It does not even place it in the mascot.xml associated with the search.
Changing the label to "mascot, taxon" (which seems to be more consistent) does make it appear in the mascot.xml, but has no effect on the search (does not appear in the resulting .dat file).
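
For readers who want to reproduce this check, one way to see which labels are dropped is to compare the submitted parameter XML with the saved mascot.xml. The following is a minimal sketch, assuming both files use `<note type="input" label="...">` elements under a single root (the `<bioml>` root and the sample contents are assumptions for illustration, not copies of real files).

```python
import xml.etree.ElementTree as ET

def input_labels(xml_text):
    """Return {label: value} for every <note type="input" label="..."> element."""
    root = ET.fromstring(xml_text)
    params = {}
    for note in root.iter("note"):
        if note.get("type") == "input" and note.get("label"):
            params[note.get("label")] = (note.text or "").strip()
    return params

def dropped_labels(submitted_xml, saved_xml):
    """Labels present in the submitted parameters but absent from the saved file."""
    return sorted(set(input_labels(submitted_xml)) - set(input_labels(saved_xml)))

# Hypothetical sample data; the <bioml> root element is an assumption.
SUBMITTED = """<bioml>
  <note type="input" label="mascot, peptide_charge">3+</note>
  <note type="input" label="protein, taxon">some taxon</note>
</bioml>"""

SAVED = """<bioml>
  <note type="input" label="mascot, peptide_charge">3+</note>
</bioml>"""

print(dropped_labels(SUBMITTED, SAVED))
```

A label that shows up in this list never reached the saved file, which separates "the value string is wrong" from "the parameter was removed before the search".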

Douglas
 
 
Peter responded:  2007-11-16 17:46
I've had trouble matching mascot parameters in the xml to dropdown (SELECT list) values I see in the mascot dialog box. In my case it was modification names that I couldn't seem to get right.

The way I finally got it to work was to open the Mascot search parameters dialog in my browser and use the browser's view-source option. I locate the *exact* string of the parameter value I'm trying to set and then copy and paste it into my XML file. In my case I was missing a space, and this procedure fixed it.
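
The view-source procedure can also be scripted. The sketch below assumes the search form has been saved as HTML and that each dropdown is a `<select>` with `<option>` children. The `TAXONOMY` name and the option strings are made up for illustration and are not taken from a real Mascot form.

```python
from html.parser import HTMLParser

class OptionCollector(HTMLParser):
    """Collect the text of every <option>, grouped by the enclosing <select> name."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.options = {}
        self._select = None
        self._in_option = False
        self._buf = []

    def handle_starttag(self, tag, attrs):
        if tag == "select":
            self._select = dict(attrs).get("name", "")
            self.options.setdefault(self._select, [])
        elif tag == "option" and self._select is not None:
            self._flush()  # older forms often omit </option>
            self._in_option = True

    def handle_endtag(self, tag):
        if tag == "option":
            self._flush()
        elif tag == "select":
            self._flush()
            self._select = None

    def handle_data(self, data):
        if self._in_option:
            self._buf.append(data)

    def _flush(self):
        if self._in_option:
            # Strip surrounding whitespace only; dots and inner spaces are kept.
            self.options[self._select].append("".join(self._buf).strip())
        self._in_option = False
        self._buf = []

def list_options(html_text):
    parser = OptionCollector()
    parser.feed(html_text)
    parser.close()
    return parser.options

# Hypothetical form fragment.
SAMPLE_FORM = """
<select name="TAXONOMY">
<option>All entries
<option>.........some taxon
</select>
"""

for name, values in list_options(SAMPLE_FORM).items():
    for value in values:
        print(name, repr(value))  # repr() makes spaces and dots visible
```

Printing with `repr()` exposes a missing space or a run of leading dots that is easy to overlook when reading the page source by eye.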
 
slottad responded:  2007-11-17 11:47
Assigned To: Peter
Yes, we had much the same trouble with modifications as well and used a similar solution.

That is not the case here. We are copying the exact same input values, but these parameters are being ignored altogether.


Douglas
 
wnels2 responded:  2007-11-19 06:20
Generally, what Peter is suggesting will work. For example, a taxon value must include the leading dots (.........taxon) exactly as they appear in the page source. If you can wait until after the Thanksgiving holiday, post your Mascot XML and I can give it a try on our server.

Bill
 
slottad responded:  2007-11-21 22:31
We are using cut and paste to set the values.

Grep'ing through the source code, I am fairly certain that the taxon setting is broken:

  ---
  server/modules/ms2/src/org/labkey/ms2$ grep -rn "protein, taxon" *
  pipeline/MascotClientImpl.java:949: {"taxonomy", "protein, taxon", "search, taxonomy"},
  pipeline/MascotDefaults.xml:79: <note type="input" label="protein, taxon">All entries</note>
  pipeline/SequestDefaults.xml:70: <note type="input" label="protein, taxon">no default</note>
  pipeline/XTandemDefaults.xml:60: <note type="input" label="protein, taxon">no default</note>
  pipeline/XTandemPipelineJob.java:270: parser.setInputParameter("protein, taxon", "sequences");
  protocol/MascotSearchProtocolFactory.java:113: parser.removeInputParameter("protein, taxon");
  protocol/SequestSearchProtocolFactory.java:95: parser.removeInputParameter("protein, taxon");
  protocol/XTandemSearchProtocolFactory.java:110: parser.removeInputParameter("protein, taxon");
  ---

If you look at the files, they note that the Factory removes the taxon parameter because the PipelineJob is responsible for it. For XTandem that parameter is indeed set (XTandemPipelineJob.java:270), but there is no corresponding line in the MascotPipelineJob.java file. This also explains why the setting is not saved in the mascot.xml file.

I am not sure what the problem is with the peptide_charge setting. I think it is broken for an entirely different reason, but it is Thanksgiving now and I am heading to bed.

Douglas
 
slottad responded:  2007-11-23 14:18
It turns out that the problem with the peptide_charge setting comes from MzXML2Search. It places a global parameter in the MGF header (CHARGE=2+ and 3+). Note that if this parameter is global in the MGF, it overrides any search settings. There is a command line option to change this, and this is what CPAS should use.
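
As a stopgap illustration of the header behaviour described above, the following minimal sketch removes any CHARGE line that sits outside a BEGIN IONS / END IONS block in an MGF file, leaving per-spectrum CHARGE lines alone. It is a hypothetical post-processing step, not something CPAS or MzXML2Search does, and the sample spectrum is made up.

```python
def strip_global_charge(lines):
    """Drop CHARGE= lines that appear outside any BEGIN IONS / END IONS block."""
    kept = []
    in_block = False
    for line in lines:
        token = line.strip()
        if token == "BEGIN IONS":
            in_block = True
        if not in_block and token.upper().startswith("CHARGE="):
            continue  # global default in the header: skip it
        kept.append(line)
        if token == "END IONS":
            in_block = False
    return kept

# Hypothetical MGF content for illustration only.
SAMPLE_MGF = [
    "COM=Conversion of example.mzXML to mascot generic",
    "CHARGE=2+ and 3+",
    "BEGIN IONS",
    "TITLE=hypothetical spectrum 1",
    "PEPMASS=500.25",
    "CHARGE=1+",
    "100.1 10.0",
    "END IONS",
]

for line in strip_global_charge(SAMPLE_MGF):
    print(line)
```

With the header default gone, a spectrum that carries no CHARGE line of its own is left for the search settings to decide.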

In addition, MzXML2Search has default settings for top and bottom precursor mass cutoffs. CPAS should also have a way of specifying these.


Douglas
 
Peter responded:  2007-12-04 17:22
Douglas,

I apologize for sitting on this without a response. I've entered bugs on the two issues you've diagnosed, and we will look into them. Thank you very much for digging into the causes.

Peter
 
brendanx responded:  2008-03-03 21:38
Douglas,

I am beginning work on the bug Peter entered. I think I have the Mascot and Sequest pipelines fixed to correctly use the parameters:

spectrum, minimum parent m+h
spectrum, maximum parent m+h

For the .mgf file, however, I would really like to make sure I know exactly what you want me to achieve. I think you are saying that the "CHARGE=2+ and 3+" should be removed from the file header, and instead repeated for each spectrum. Is this correct?

Once I think I understand, I would like to post a small MGF here for you to verify that it would work for your purposes. I do not believe MzXML2Search has a command-line argument for this, but I created that module, though in my defense I took the logic in question from the now deprecated TPP MzXML2Other.

Thanks for the bug.

--Brendan
 
brendanx responded:  2008-03-04 06:43
Douglas,

From your information I would propose completely removing the following code from MzXML2Search:

if (strcmp(options.format, "mgf") == 0)
{
    fprintf(pfOut, "COM=Conversion of %s to mascot generic\n", szXMLFile);
------ snip ----
    if (options.iCharge == 0)
    {
        // any ion block without a CHARGE attribute will be treated as 2+ and 3+
        fprintf(pfOut, "CHARGE=2+ and 3+\n");
    }
    else if (options.iCharge == options.iChargeLast)
    {
        // all blocks will have the same charge
        fprintf(pfOut, "CHARGE=%d+\n", options.iCharge);
    }
------ snip ----
}

....

------ snip ----
// don't write the charge, if 2+ and 3+, since that is the
// default specified at the top of the file.
if (options.iCharge != options.iChargeLast ||
        (options.iCharge == 0 && (iCharge != 2 || iChargeLast != 3)))
------ snip ----
    fprintf(pfOut, "CHARGE=%d+\n", iCharge);


Does this sound like what you are after? Essentially we would get rid of all "cleverness" in writing a default charge specification at the top of the file as some sort of file-space optimization, which is clearly not great considering the spectra to which it is attached.

We could keep the "options.iCharge == options.iChargeLast" version, since in that case the user has already explicitly stated this is what is desired, but even so, you seem to indicate that this limits future overriding within the Mascot search user interface. I lean toward falling back to the simple output for everything.

Thoughts?

--Brendan
 
slottad responded:  2008-03-06 14:52
Yes, I think that is the right approach. Get rid of the defaults, keep it simple. This allows the settings to be overridden at search time.
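
The "keep it simple" output agreed on here can be sketched as follows: no CHARGE line in the file header, and each spectrum block carries its own CHARGE line when a charge is known. This is an illustrative Python sketch, not the MzXML2Search code. The function name, its arguments, and the choice to omit CHARGE when the charge is unknown are assumptions.

```python
import io

def write_spectrum(out, title, pepmass, peaks, charge=None):
    """Write one MGF ion block; CHARGE is per spectrum, never a file-level default."""
    out.write("BEGIN IONS\n")
    out.write(f"TITLE={title}\n")
    out.write(f"PEPMASS={pepmass}\n")
    if charge is not None:
        # Assumption: an unknown charge is simply left out of the block.
        out.write(f"CHARGE={charge}+\n")
    for mz, intensity in peaks:
        out.write(f"{mz} {intensity}\n")
    out.write("END IONS\n")

# Hypothetical spectra for illustration only.
buf = io.StringIO()
write_spectrum(buf, "hypothetical scan 1", 500.25, [(100.1, 10.0)], charge=2)
write_spectrum(buf, "hypothetical scan 2", 612.3, [(150.2, 5.0)])
MGF_TEXT = buf.getvalue()
print(MGF_TEXT)
```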

Sorry for the belated response, but I did not know about your post until someone else pointed it out to me. As an aside, it might be nice if you implemented a notification mechanism for when there are updates to our posts.

Douglas
 
jeckels responded:  2008-03-06 15:35
Douglas,

You should be able to set your email preferences for this discussion board from this link:

https://www.labkey.org/announcements/home/CPAS/support/emailPreferences.view?srcUrl=%2Fannouncements%2Fhome%2FCPAS%2Fsupport%2Fbegin.view%3F

Thanks,
Josh