IPI repeated in Search Engine Comparison

CPAS Forum (Inactive)
IPI repeated in Search Engine Comparison tvaisar  2007-11-16 12:50
Status: Closed
 
One more for today.
I ran a Sequest search/TPP/CPAS upload on 27 samples. When I do Compare > Search Engine Protein Assignment using a filter requiring one tryptic end and PeptideProbability 0.9+, I get one IPI showing twice (out of 275). For some unknown reason this protein is shown as identified on two separate lines (each time in different samples).
I am attaching an Excel export to document the problem.

This is really bizarre since all the samples were searched against the same version of the IPI database in one sequence and uploaded at the same time.

In contrast, if I use the Query (beta) comparison (filtered by Run Prob > 0.98), different IPIs (Best Name) are duplicated in the output. In this case the Best Name is duplicated, but the SequenceID and FirstIPI as well as the Description are different (Excel export also attached).
This one I can understand, although it still makes comparisons difficult. The first one I have no explanation for.

Any suggestions as to what the reasons may be?

Thanks,

Tomas
 
 
Peter responded:  2007-11-16 18:10
Tomas,

It would help to know whether there are two different sequence IDs for the protein that is duplicated. If you hover your mouse over one of the links in the grid, you will see the seqId in the link address. Are the seqIds different on the two lines? Can you see any differences when you go into the protein view behind the two links? This might give clues as to how you got into that state.
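
If there are many links to check, the seqId comparison can also be scripted. This is only a rough sketch: it assumes the links have been copied out of the grid by hand, and the example.org addresses below are made-up placeholders, not real CPAS URLs. The only thing taken from the description above is that the link address carries a seqId parameter.

```python
from urllib.parse import urlparse, parse_qs

def seq_id_from_link(url):
    """Return the seqId query parameter of a link as a string, or None if absent."""
    query = parse_qs(urlparse(url).query)
    for key, values in query.items():
        # Match the parameter name case-insensitively (seqId, SeqId, seqid, ...).
        if key.lower() == "seqid":
            return values[0]
    return None

# Hypothetical links for the two duplicated rows (placeholder host, path and ids).
links = [
    "https://example.org/showProtein.view?seqId=1001",
    "https://example.org/showProtein.view?seqId=2002",
]
ids = [seq_id_from_link(u) for u in links]
print(ids, "different" if len(set(ids)) > 1 else "same")
```

If the two rows turn out to have different seqIds, the system is treating them as two separate protein records even though they show the same IPI.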
 
tvaisar responded:  2008-01-18 12:36
Hi Peter,

I would like to restart this discussion, this time with a slightly different (although related) problem:

I see duplicated IPIs when I compare two datasets searched against different versions of the IPI database.
It is best shown on TRYP_PIG.
Both entries have unique SeqIDs.
However, I see no difference between the sequences in the original FASTA files, except for the header line.
If I display the two sequences in CPAS I see two identical pages, except that in one case an Organism is assigned. However, the Description is the same for both (some cross talk happened).
I see that we have multiple versions of both databases in the system???
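
The FASTA check described above (same sequence, different header line) could be done across two whole database files with something like the following. It is a minimal sketch, not anything CPAS does internally: the function names are made up for illustration, and the toy entries in the usage example are invented placeholders rather than real IPI records.

```python
from collections import defaultdict
import io

def read_fasta(handle):
    """Yield (header, sequence) pairs from an open FASTA file."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif line and header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)

def headers_by_sequence(*handles):
    """Map each distinct sequence to the set of header lines it appears under."""
    groups = defaultdict(set)
    for handle in handles:
        for header, seq in read_fasta(handle):
            groups[seq.upper()].add(header)
    return groups

def renamed_sequences(*handles):
    """Keep only sequences that occur under more than one distinct header."""
    return {seq: hdrs
            for seq, hdrs in headers_by_sequence(*handles).items()
            if len(hdrs) > 1}

# Toy stand-ins for two database versions (invented headers and sequences).
old_db = io.StringIO(">entryA version 1 header\nMKT\nLLV\n>entryB version 1\nAAAA\n")
new_db = io.StringIO(">entryA version 2 header\nMKTLLV\n>entryC version 2\nGGGG\n")
for seq, hdrs in renamed_sequences(old_db, new_db).items():
    print(seq, sorted(hdrs))
```

With real files, the two `io.StringIO` objects would be replaced by `open(...)` on the two FASTA versions. Any sequence reported here is a candidate for an entry that got loaded twice under different header lines.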

Would you have any explanation for the problem, and if so, is there a way to fix it?

Thanks,

Tomas
 
Peter responded:  2008-01-30 18:22
Tomas, I'm pretty sure this is the same set of issues discussed in your other thread at
https://www.labkey.org/announcements/home/CPAS/support/thread.view?rowId=2118, for which I sent you some SQL scripts.

If you think these are different problems let me know, otherwise I'll close out this thread and leave the other one active.

Peter