Can the CPAS work with the TPP results?

CPAS Forum (Inactive)
ljy241  2008-01-09 17:49
Status: Closed
 
Hi all,
I obtained some huge interact-prot.xml files from TPP (v3.4, running under XP) that are difficult to open in a browser. Can I open them directly in CPAS, and if so, how? I have installed CPAS on a PC.
Thanks.

John
 
 
jeckels responded:  2008-01-11 15:23
Hi John,

Yes, you can load externally-generated TPP results in CPAS.

The sequence of steps below differs from the one in the CPAS tutorial, but the tutorial describes many of the individual steps in more detail: https://www.labkey.org/Project/home/cpas/tutorial/begin.view

First, create a new project of type MS2 (Manage Site->Create Project; you might need to click on the Show Admin link first). In that project, click on the Setup button under the Data Pipeline and point it at the directory on the server's hard drive that has the TPP results.

Then, click on the Process and Upload Data button in the Data Pipeline. Navigate the file system, if needed, to find the prot.xml file, and click on the Import ProteinProphet button. (Note: you might need to rename the file so that it has a .prot.xml extension, or CPAS will not recognize it.) In the background, CPAS will upload the FASTA file, pep.xml, and prot.xml into its database. You can click on the job in the Data Pipeline to view its log file and check on its progress.
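
As a rough illustration of the rename mentioned in the note above, here is a minimal Python sketch. It assumes the TPP output is named with a "-prot.xml" suffix (as in "interact-prot.xml" from the original question); the function name is hypothetical and is not part of CPAS or the TPP. It only computes the new name and leaves the actual rename to you.

```python
from pathlib import Path

def with_prot_xml_extension(path):
    """Return the path with a '.prot.xml' extension.

    Assumption: the only other naming pattern handled is a '-prot.xml'
    suffix; anything else is rejected rather than guessed at.
    """
    p = Path(path)
    if p.name.endswith(".prot.xml"):
        return p  # already in the form described in the note
    suffix = "-prot.xml"
    if p.name.endswith(suffix):
        return p.with_name(p.name[: -len(suffix)] + ".prot.xml")
    raise ValueError(f"unexpected file name: {p.name}")

# Compute the new name; Path.rename(new_path) would apply it on disk.
new_path = with_prot_xml_extension("interact-prot.xml")
print(new_path.name)
```

It is probably safest to work on a copy of the file, so the original TPP output stays untouched.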

I realize this is rather confusing, but add the "MS2 Runs" web part to your folder. The default list (which is labeled as "MS2 Runs") only shows runs that have associated experimental metadata, which is not present when importing TPP result files.

At that point, you should be able to click on your run to view it.

Thanks,
Josh
 
ljy241 responded:  2008-01-11 19:53
Hi Josh,

Thanks for your help. I followed your instructions but failed to upload the interact.prot.xml file. The error is in the attached file (ZS3_interact.prot.xml.log).
I created a project of type MS2 and transferred only the interact.prot.xml file to the folder, but the upload failed. I don't know what the problem is.

Thanks,

John
 
adam responded:  2008-01-12 08:48
It appears that your FASTA file has multiple protein sequences with the same name. The TPP and pepXML/protXML files all reference proteins by name, so duplicate names lead to ambiguous results. I'm not sure why the TPP components don't validate the FASTA file to detect this, but CPAS does, which is why it prevents the FASTA file and the run from loading. (We are looking at providing a more useful error message in the next version of CPAS, by the way.)

You will need to remove or disambiguate the duplicate protein names in your FASTA file. You may want to re-run your search, ProteinProphet, etc., since any results pointing to duplicate names could end up linking to the wrong sequence.
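
For anyone who wants to find the duplicates before re-running the search, here is a minimal Python sketch. It is not what CPAS does internally; it assumes the protein name is the first whitespace-delimited token after the ">" on each header line, which may not match how your search engine parses names. The function name is hypothetical.

```python
from collections import Counter
import io

def duplicate_fasta_names(handle):
    """Return the protein names that occur on more than one header line."""
    counts = Counter()
    for line in handle:
        if line.startswith(">"):
            # Assumption: the name is the first token after '>'.
            fields = line[1:].split()
            if fields:
                counts[fields[0]] += 1
    return sorted(name for name, n in counts.items() if n > 1)

# Small in-memory example; for a real file, pass open("your.fasta") instead.
sample = io.StringIO(">P1 first entry\nMKT\n>P2 second entry\nGGA\n>P1 copy\nMKV\n")
print(duplicate_fasta_names(sample))  # ['P1']
```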
 
ljy241 responded:  2008-01-13 01:38
Hi Adam,
That was the problem. I have now uploaded the protXML file successfully. But how do I view and export the results? Is it enough to upload just the protXML file? I can see nothing in the "MS2 Runs" web part. I navigated to "Experiments", clicked "Upload Experiment", and tried to upload the tandem.xml (from TPP), but it failed.
How can I view the PeptideProphet and ProteinProphet results, or compare the results of different runs?
Thanks.

John
 
jeckels responded:  2008-01-13 10:38
John,

Unfortunately, the default MS2 run list doesn't show runs that are imported this way. Here's how to get a list that will show them:

Go to your project. Near the bottom of the page, there will be a drop-down. Choose "MS2 Runs" from the list and click on the Add Web Part button.

This should add a different list that will include your run.

Thanks,
Josh
 
ljy241 responded:  2008-01-15 00:28
Hi all,
Thank you all. Now I can view and export the results. I have another question: how can I filter the results (for example, probability > 0.9) and compare the results from different interact.prot.xml files?
Thanks.
John
 
jeckels responded:  2008-01-15 09:42
John,

Glad to hear you're viewing your results.

There's a short video that will help show you how to filter and compare your data. Since you've already got your data loaded, you might want to skip to the "Analyzing MS2 Results" section.

https://www.labkey.org/Wiki/home/CPAS/tutorial/page.view?name=CPASTutorialVideo

There's also a text version of the same basic steps:

https://www.labkey.org/wiki/home/CPAS/tutorial/page.view?name=viewSingleRun

There's more extensive documentation here as well:

https://www.labkey.org/wiki/home/Documentation/page.view?name=ms2Runs

Thanks,
Josh
 
ljy241 responded:  2008-01-19 04:11
Hi Josh,
Thanks for your detailed instructions. I still have some simple questions. What is the difference between Group Probability and Protein Probability? I set the filter GroupProbability (Prob) >= 0.9 and compared the results of two different runs, but some values less than 0.9 still appeared in the Prob grid. The total number of proteins for one run also differed from the number I got when filtering that run alone. Taking the demo runs as an example, we get 53 proteins after filtering (Prob >= 0.9) MM_clICAT13.pep.xml, while 69 proteins appear when it is compared with MM_clICAT12.pep.xml under the same filter. What is the reason?
Thanks,
John
 
ljy241 responded:  2008-01-19 04:37
Hi,
I want to compare the results from different runs at high confidence, for example so that every protein probability is more than 0.9. I found that the total number of proteins changed depending on which runs were compared. It was also difficult to load the page when comparing more than 2 runs (I ran the comparison on a PC).
 
jeckels responded:  2008-01-21 16:56
John,

I understand your confusion - I've been considering changing the current behavior and would welcome your feedback.

Currently, we'll show a protein in the list if it meets the filter criteria in any of the runs. We'll also show its values for any other runs it appears in, even if they don't meet the criteria. The theory is that it's useful to see that it was detected in the other runs even though they might have had a lower confidence.

The unfortunate side effect is that the overlap summary doesn't distinguish between proteins that meet the criteria for a particular run and ones that don't, which is why the number of proteins associated with a run might change depending on what you're comparing it with.

I've been pondering switching it so that the grid only shows proteins in a run if they meet the filter criteria. This would also affect the summary at the top.
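
To make the difference concrete, here is a small Python sketch of the two behaviors described above. It is only an illustration with made-up run names and probabilities, not the actual CPAS implementation; the function names are hypothetical.

```python
def compare_current(runs, threshold):
    """Behavior described above: list a protein if ANY run meets the
    filter, then show its value in every run where it appears."""
    listed = {
        protein
        for probs in runs.values()
        for protein, prob in probs.items()
        if prob >= threshold
    }
    return {
        run: {p: prob for p, prob in probs.items() if p in listed}
        for run, probs in runs.items()
    }

def compare_strict(runs, threshold):
    """Alternative under consideration: show a protein in a run only
    if it meets the filter in that run."""
    return {
        run: {p: prob for p, prob in probs.items() if prob >= threshold}
        for run, probs in runs.items()
    }

# Made-up data: protein name -> probability, per run.
runs = {
    "run_a": {"P1": 0.95, "P2": 0.50, "P3": 0.20},
    "run_b": {"P1": 0.40, "P2": 0.99},
}
current = compare_current(runs, 0.9)
strict = compare_strict(runs, 0.9)
print({run: len(v) for run, v in current.items()})  # {'run_a': 2, 'run_b': 2}
print({run: len(v) for run, v in strict.items()})   # {'run_a': 1, 'run_b': 1}
```

In this example run_a has one protein at or above 0.9 when filtered alone, but two in the comparison, because P2 passes the filter in run_b and is therefore shown for run_a with its value of 0.50.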

I hope this clarifies the current behavior. Let me know if you have further questions.

Thanks,
Josh
 
ljy241 responded:  2008-01-23 21:53
Hi Josh,

I know it is a difficult choice, and I think it would be better to let users decide for themselves. Besides, when I used the Query to customize the grid view, the large number of items confused me. Is there any documentation for these items? In addition, ordinary users like me just want to compare the results from different runs and get the shared or differing proteins, with their corresponding information, at high confidence. There is no need to show so many items, and it is also difficult to run them on a PC. So I think it would be better to split it into two modes, one for simple users and one for advanced users.
Thanks,
John