Loading Experiment Hangs: /home/Support/Inactive Forums/CPAS Forum (Inactive)

vensel

2006-06-20 17:21

Status: Closed

I have been trying to process mzXML files generated from wiff files by mzStar version 2.4.0, and they seem to process okay. The data are from 2-DE gel analysis, so there are 35 different samples. After running the first sample, the process stalls with the status message "LOADING EXPERIMENT".

The run seemed to be proceeding okay because:

1. X!Tandem runs and finds valid models
2. PeptideProphet and ProteinProphet are run on the data

The log file for the only sample processed is attached.

Thanks,

Bill

060517_Peach_Obenland_Tryp_A_1.log

wongch responded:	2006-06-20 17:51
The last line in the log reads: "20 Jun 2006 16:35:15,546 INFO : Loading FASTA file". If this is the first time that FASTA file is referenced, CPAS will load the sequence information into the system. This will take a little while to process. No worry. *Chee-Hong.

vensel responded:	2006-06-20 20:11
Thanks, I will check it tomorrow and see what it has done. Bill

vensel responded:	2006-06-21 11:59
Okay, today when I checked the progress on the 'Data Pipeline' page, about half of the samples had been analyzed, but the status was ERROR for the samples that had been run. The last line in a sample log reads: 21 Jun 2006 05:14:31,812 FATAL: Upload FAILED. There are, however, links to the samples that ran, and the peptide data for each run are displayed. It also seems to be taking a very long time, given that each sample represents one spot from a gel. Should these errors be generated, and should this be so slow? This is version 1.4 on Windows XP Pro, with a dual 3.06 hyperthreaded Xeon processor, 4 gigs of memory and an 800 gig hard drive. Thanks, Bill
060517_Peach_Obenland_Tryp_A_5.log

brendanx responded:	2006-06-21 12:33
Appears to be having trouble dropping a temporary table in the ProteinProphet upload code. We'll see if we can figure out what is going on there, but it is at the very end of the upload process, so, yes, your data should be there, though you may be accumulating temporary tables inside Postgres.
On performance, FASTA loading is known to take a long time. We are working on it. The log you sent appears to have taken about 3.5 minutes to search and upload (mostly inside X!Tandem doing the search), which seems reasonable to me for a 182K sequence FASTA on the system you describe. Some searches (e.g. semi-cleavage, or unconstrained cleavage) take hours on quad-proc machines. For high throughput proteomics we are using a 40-50 node cluster. X!Tandem is one of the fastest search engines around, but on a single machine there is only so much you can do.
If you are willing to have X!Tandem consume all your system resources while it is searching, you can tell it to search multi-threaded with the line:
<note type="input" label="spectrum, threads">2</note>
Or use 4, if you have hyper-threading on.
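The thread-count note quoted above can also be set programmatically. The following is only a minimal sketch in Python: it assumes the X!Tandem input parameters are a `<bioml>` document made of `<note type="input" ...>` elements, and the function name `set_tandem_threads` and the sample document are hypothetical, not part of CPAS or X!Tandem.

```python
import xml.etree.ElementTree as ET

# Label taken from the note quoted in the post above.
THREADS_LABEL = "spectrum, threads"


def set_tandem_threads(xml_text: str, threads: int) -> str:
    """Return the parameter XML with the 'spectrum, threads' note set.

    Hypothetical helper: updates the note if it exists, otherwise
    appends a new one to the root element.
    """
    root = ET.fromstring(xml_text)
    for note in root.iter("note"):
        if note.get("type") == "input" and note.get("label") == THREADS_LABEL:
            note.text = str(threads)
            break
    else:
        # No existing note was found, so add one.
        note = ET.SubElement(
            root, "note", {"type": "input", "label": THREADS_LABEL}
        )
        note.text = str(threads)
    return ET.tostring(root, encoding="unicode")


# Illustrative input only; a real parameter file will contain many more notes.
EXAMPLE = """<bioml>
  <note type="input" label="spectrum, parent monoisotopic mass error plus">2.0</note>
</bioml>"""

print(set_tandem_threads(EXAMPLE, 2))
```

Whether 2 or 4 is the better value depends on the machine, as the post notes; the sketch only edits the setting and does not run a search.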

vensel responded:	2006-06-21 12:58
Thanks for looking at this. The 3.5 minutes per sample seems reasonable to me also but there are periods of up to an hour between some of the samples.

brendanx responded:	2006-06-21 13:05
Is this time captured in any of the logs? Or, does it just appear to be sitting around doing nothing between jobs? Any CPU usage?
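One way to answer the question of whether idle time shows up in the logs is to scan them for long pauses between consecutive timestamped lines. This is a rough sketch, assuming every log entry starts with a timestamp in the format of the two lines quoted earlier in this thread (for example `21 Jun 2006 05:14:31,812 FATAL: Upload FAILED`). The function names and the 10-minute threshold are illustrative choices, and the two sample lines come from different logs, so they are used here only to show the format.

```python
import re
from datetime import datetime, timedelta

# Matches e.g. "20 Jun 2006 16:35:15,546 INFO : ..." (format assumed from the thread).
STAMP = re.compile(r"^(\d{1,2} \w{3} \d{4} \d{2}:\d{2}:\d{2},\d{3})\s+(\w+)\s*:")


def parse_log(lines):
    """Yield (timestamp, level, text) for each line that starts with a timestamp."""
    for line in lines:
        match = STAMP.match(line)
        if match:
            stamp = datetime.strptime(match.group(1), "%d %b %Y %H:%M:%S,%f")
            yield stamp, match.group(2), line.rstrip("\n")


def find_gaps(lines, threshold=timedelta(minutes=10)):
    """Return (gap, previous line, next line) for pauses longer than threshold."""
    gaps = []
    previous = None
    for stamp, _level, text in parse_log(lines):
        if previous is not None and stamp - previous[0] > threshold:
            gaps.append((stamp - previous[0], previous[1], text))
        previous = (stamp, text)
    return gaps


def fatal_lines(lines):
    """Return the text of every line logged at FATAL level."""
    return [text for _stamp, level, text in parse_log(lines) if level == "FATAL"]


# The only two log lines quoted in the thread, reused as format examples.
SAMPLE = [
    "20 Jun 2006 16:35:15,546 INFO : Loading FASTA file\n",
    "21 Jun 2006 05:14:31,812 FATAL: Upload FAILED\n",
]

for gap, before, after in find_gaps(SAMPLE):
    print(gap, "|", before, "->", after)
```

For a real log, pass the open file object as `lines`. Gaps inside a single log indicate a slow step, while time that appears in no log at all would point to waiting between jobs.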

jeckels responded:	2006-06-21 14:38
This failure happened when CPAS was trying to upload ProteinProphet data. I'm not sure exactly what caused it, but it looks like the .prot.xml file refers to a protein that CPAS doesn't think is associated with that FASTA file. I can tell you that it failed within the first 50 protein groups in the .prot.xml file. As Brendan mentioned, this is the last thing that we do after doing a search, so the .pep.xml data is already fully loaded and you can look at the peptides in CPAS. If the FASTA, .pep.xml, and .prot.xml files aren't too huge, you can post them here if you'd like and I can try loading them to determine why it can't find the right sequence. Josh

vensel responded:	2006-06-21 14:56
Hi Josh, The fasta file is 78MB and each pep.xml file ranges from about 300 to 600 kb. The prot.xml files are about 200 to 400 kb. How many of each would you want? I can zip everything and send it to you if you have an ftp site. Thanks, Bill

jeckels responded:	2006-06-21 17:03
I would just need the FASTA file and one .pep.xml - .prot.xml pair. Unfortunately we don't currently have a good way to get large files. Is there somewhere you could put the files where I could access them over HTTP, FTP, etc? Thanks, Josh

jeckels responded:	2006-06-26 17:27
I just loaded both sets of files successfully. Is it possible that your initial FASTA upload was interrupted, leaving the database in a bad state? From your other posting, it sounds as though you've since blown away your old database and reinstalled. You could try re-importing these files. You don't need to run the X!Tandem search again. You should be able to use the Pipeline to browse to the .xar.xml files and click on the associated Import Experiment button. If you deleted a different database when reinstalling, you can try going to Manage Site->Admin Console->Protein Admin. In the FASTA Files section, select the relevant FASTA file and click on Reload. Like the initial load, this will take a while, but should reload the entire file. Josh

vensel responded:	2006-06-29 11:31
Thanks Josh, Well I was able to get the pep.xml files to load and everything looked okay. However, when an MS2 run is open and I click on a peptide sequence, the browser is redirected to the page displaying the ions, but they are not highlighted and no graph is shown, just the text "No data to plot". Bill

jeckels responded:	2006-06-29 17:47
That means there was some sort of problem with CPAS accessing the mzXML when it was loading the data. I loaded the runs that you sent me and I get graphs and highlighted ions, so CPAS managed to parse the mzXML files correctly. It may be a problem with the paths between the different files. I had to edit the paths before loading since they were in a different directory structure on my machine and the files contain absolute paths to each other. Are there any entries in the .log files that indicate there was a problem? Do the .pep.xml files point to the right location when they reference the .mzXML files? Note that if you go to reload the MS2 runs you'll need to delete the existing runs in CPAS. Josh
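The path check suggested above can be done with a short script. This is a sketch only, and it rests on an assumption about the pepXML layout: that each `msms_run_summary` element carries a `base_name` attribute (the spectra file path without extension) and a `raw_data` attribute (the extension, such as `.mzXML`). Verify this against your own .pep.xml files before relying on it. The function names are hypothetical.

```python
import os
import xml.etree.ElementTree as ET


def referenced_spectra_files(pepxml_path):
    """Return the spectra file paths a .pep.xml file appears to reference.

    Assumes msms_run_summary elements with base_name and raw_data
    attributes; the XML namespace is ignored by comparing local tag names.
    """
    paths = []
    for _event, elem in ET.iterparse(pepxml_path, events=("start",)):
        local_name = elem.tag.rsplit("}", 1)[-1]
        if local_name == "msms_run_summary":
            base = elem.get("base_name")
            extension = elem.get("raw_data", ".mzXML")
            if base:
                paths.append(base + extension)
    return paths


def check_spectra_files(pepxml_path):
    """Return (path, exists) pairs for every referenced spectra file."""
    return [
        (path, os.path.isfile(path))
        for path in referenced_spectra_files(pepxml_path)
    ]
```

Calling `check_spectra_files` on one of the .pep.xml files lists each referenced path along with whether a file exists there. A `False` entry would be consistent with the "No data to plot" symptom, although other causes are possible.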

vensel responded:	2006-06-30 15:27
Hi Josh, Well the problem was that I was loading only the pep.xml files. I reanalyzed everything with the mzXML files and everything seems to be working. Thanks, Bill

adam responded:	2007-01-04 09:05
