How to upload legacy Mascot search data into CPAS

ybukhman  2008-11-21 10:29
Status: Closed
 
Hi,

I'm new to CPAS. We have some legacy .dat files from Mascot searches. I would like to upload those to CPAS. What's the best way to do that?

Thanks.

Yury
 
 
jeckels responded:  2008-12-01 15:29
Hi Yury,

You should be able to import these files directly.

Create a new MS2 project (Manage Site->Create Project). The default permissions are fine.

Choose "MS2 Runs" from the <Select Part> drop-down and click on "Add Web Part". Your imported runs will show up in this list. (Note that this step will be unnecessary in version 9.1.)

Click on the Setup button under the Data Pipeline. Enter the directory that contains your .dat files.

Click on Process and Import Data. Browse to your .dat files and click on the "Import Peptides" button.

This will import your existing Mascot .dat file.

The documentation at https://www.labkey.org/wiki/home/Documentation/page.view?name=ms2 describes how to view, compare, export, and do other things with the data after it's been loaded.

Thanks,
Josh
 
ybukhman responded:  2008-12-08 13:06
Hi,

the import fails to complete. First of all, I get a warning that it can't read my mzXML file, even though I do have it in the same folder as the .dat file. This doesn't seem to kill the process, though. Later on I get an error about an unexpected character. I tried to run Mascot2XML on the command line, and that did complete and generate files F010946.tgz and F010946.xml.

My log file is attached.

Thanks.

Yury
 
jeckels responded:  2008-12-10 15:53
Hi Yury,

It looks like there's a problem with the pepXML file that Mascot2XML generates. From the error message it looks like there's an XML syntax problem on or around line 11885.

Can you open the file and see if there's an obvious problem? You can paste that section of the file here or attach it if it's not too large.
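
For anyone who would rather not scan a 14.7 MB file by eye, here is a minimal sketch of one way to locate the first syntax problem with Python's standard expat parser. The function name and the sample document are made up for illustration; this is not something CPAS or Mascot2XML provides.

```python
import xml.parsers.expat


def first_xml_error(data: bytes):
    """Return (line, column, message) for the first well-formedness
    error in data, or None if the document parses cleanly."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        # The second argument marks this as the final chunk of input.
        parser.Parse(data, True)
    except xml.parsers.expat.ExpatError as err:
        return err.lineno, err.offset, xml.parsers.expat.ErrorString(err.code)
    return None


# Hypothetical three-line document with a bare "&" on line 2.
sample = (
    b"<hits>\n"
    b"<search_hit protein_descr=\"Brigham & Women's Genetics\"/>\n"
    b"</hits>"
)
print(first_xml_error(sample))
```

To check a real file, pass its contents as bytes, for example `first_xml_error(open(path, "rb").read())`. The parser stops at the first error, so the check has to be repeated after each fix.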

Thanks,
Josh
 
ybukhman responded:  2008-12-11 08:57
Hi Josh,

here's a line-numbered listing around 11885. This file was generated by Mascot2XML run on the command line using the command shown in the CPAS log. The whole xml file is 14.7 MB: I don't know if this is too large or not. There's also a 42 MB .tgz file.

Thanks.

Yury


11800    </search_hit>
 11801    </search_result>
 11802    </spectrum_query>
 11803    <spectrum_query spectrum="E:UserDataEdEFcolon_DTAsEFcolon.7544.7544.2" start_scan="7544" end_scan="7544" precursor_neutral_mass="944.5437" assumed_charge="2" index="995" retention_time_sec="2912.500000">
 11804    <search_result>
 11805    <search_hit hit_rank="1" peptide="LGMTPVRR" peptide_prev_aa="K" peptide_next_aa="C" protein="40254165" num_tot_proteins="1" num_matched_ions="4" tot_num_ions="14" calc_neutral_pep_mass="944.5225" massdiff="+0.0212" num_tol_term="2" num_missed_cleavages="0" is_rejected="0" protein_descr="SMC hinge domain containing 1 [Mus musculus]">
 11806    <modification_info>
 11807    <mod_aminoacid_mass position="3" mass="147.035399"/>
 11808    </modification_info>
 11809    <search_score name="ionscore" value="25.36"/>
 11810    <search_score name="identityscore" value="33.63"/>
 11811    <search_score name="star" value="0"/>
 11812    <search_score name="homologyscore" value="30.97"/>
 11813    <search_score name="expect" value="0.34"/>
 11814    </search_hit>
 11815    </search_result>
 11816    </spectrum_query>
 11817    <spectrum_query spectrum="E:UserDataEdEFcolon_DTAsEFcolon.6403.6403.1" start_scan="6403" end_scan="6403" precursor_neutral_mass="944.5438" assumed_charge="1" index="996" retention_time_sec="2531.560000">
 11818    <search_result>
 11819    <search_hit hit_rank="1" peptide="VDCLLAVGR" peptide_prev_aa="K" peptide_next_aa="L" protein="REV20373161" num_tot_proteins="1" num_matched_ions="4" tot_num_ions="16" calc_neutral_pep_mass="944.5113" massdiff="+0.0325" num_tol_term="2" num_missed_cleavages="0" is_rejected="0" protein_descr="dynein, axonemal, heavy chain 5 [Mus musculus]REVERSED">
 11820    <search_score name="ionscore" value="13.58"/>
 11821    <search_score name="identityscore" value="33.63"/>
 11822    <search_score name="star" value="0"/>
 11823    <search_score name="homologyscore" value="25.96"/>
 11824    <search_score name="expect" value="5.06"/>
 11825    </search_hit>
 11826    </search_result>
 11827    </spectrum_query>
 11828    <spectrum_query spectrum="E:UserDataEdEFcolon_DTAsEFcolon.7177.7177.2" start_scan="7177" end_scan="7177" precursor_neutral_mass="944.5442" assumed_charge="2" index="997" retention_time_sec="2787.310000">
 11829    <search_result>
 11830    <search_hit hit_rank="1" peptide="TNALGSILR" peptide_prev_aa="R" peptide_next_aa="V" protein="31543189" num_tot_proteins="1" num_matched_ions="5" tot_num_ions="16" calc_neutral_pep_mass="944.5291" massdiff="+0.0152" num_tol_term="2" num_missed_cleavages="0" is_rejected="0" protein_descr="hypothetical protein LOC244141 [Mus musculus]">
 11831    <modification_info>
 11832    <mod_aminoacid_mass position="2" mass="115.026939"/>
 11833    </modification_info>
 11834    <search_score name="ionscore" value="9.29"/>
 11835    <search_score name="identityscore" value="33.60"/>
 11836    <search_score name="star" value="0"/>
 11837    <search_score name="homologyscore" value="20.15"/>
 11838    <search_score name="expect" value="13.50"/>
 11839    </search_hit>
 11840    </search_result>
 11841    </spectrum_query>
 11842    <spectrum_query spectrum="E:UserDataEdEFcolon_DTAsEFcolon.6808.6808.2" start_scan="6808" end_scan="6808" precursor_neutral_mass="944.5448" assumed_charge="2" index="998" retention_time_sec="2665.910000">
 11843    <search_result>
 11844    <search_hit hit_rank="1" peptide="VAQNSIRR" peptide_prev_aa="K" peptide_next_aa="I" protein="38086485" num_tot_proteins="1" num_matched_ions="6" tot_num_ions="14" calc_neutral_pep_mass="944.5039" massdiff="+0.0409" num_tol_term="2" num_missed_cleavages="0" is_rejected="0" protein_descr="PREDICTED: ATP-binding cassette, sub-family B (MDR/TAP), member 7 [Mus musculus]">
 11845    <modification_info>
 11846    <mod_aminoacid_mass position="3" mass="129.042589"/>
 11847    <mod_aminoacid_mass position="4" mass="115.026939"/>
 11848    </modification_info>
 11849    <search_score name="ionscore" value="29.67"/>
 11850    <search_score name="identityscore" value="33.64"/>
 11851    <search_score name="star" value="0"/>
 11852    <search_score name="homologyscore" value="38.91"/>
 11853    <search_score name="expect" value="0.12"/>
 11854    </search_hit>
 11855    </search_result>
 11856    </spectrum_query>
 11857    <spectrum_query spectrum="E:UserDataEdEFcolon_DTAsEFcolon.6077.6077.2" start_scan="6077" end_scan="6077" precursor_neutral_mass="944.5449" assumed_charge="2" index="999" retention_time_sec="2423.620000">
 11858    <search_result>
 11859    <search_hit hit_rank="1" peptide="VAQNSIRR" peptide_prev_aa="K" peptide_next_aa="I" protein="38086485" num_tot_proteins="1" num_matched_ions="5" tot_num_ions="14" calc_neutral_pep_mass="944.5039" massdiff="+0.0410" num_tol_term="2" num_missed_cleavages="0" is_rejected="0" protein_descr="PREDICTED: ATP-binding cassette, sub-family B (MDR/TAP), member 7 [Mus musculus]">
 11860    <modification_info>
 11861    <mod_aminoacid_mass position="3" mass="129.042589"/>
 11862    <mod_aminoacid_mass position="4" mass="115.026939"/>
 11863    </modification_info>
 11864    <search_score name="ionscore" value="16.87"/>
 11865    <search_score name="identityscore" value="33.64"/>
 11866    <search_score name="star" value="0"/>
 11867    <search_score name="homologyscore" value="29.58"/>
 11868    <search_score name="expect" value="2.37"/>
 11869    </search_hit>
 11870    </search_result>
 11871    </spectrum_query>
 11872    <spectrum_query spectrum="E:UserDataEdEFcolon_DTAsEFcolon.4439.4439.2" start_scan="4439" end_scan="4439" precursor_neutral_mass="945.4943" assumed_charge="2" index="1000" retention_time_sec="1879.600000">
 11873    <search_result>
 11874    <search_hit hit_rank="1" peptide="SRQTGELR" peptide_prev_aa="R" peptide_next_aa="V" protein="REV63546151" num_tot_proteins="3" num_matched_ions="4" tot_num_ions="14" calc_neutral_pep_mass="945.4992" massdiff="-0.0049" num_tol_term="2" num_missed_cleavages="1" is_rejected="0" protein_descr="PREDICTED: similar to lymphocyte antigen-49H2 [Mus musculus]REVERSED">
 11875    <search_score name="ionscore" value="7.00"/>
 11876    <search_score name="identityscore" value="35.26"/>
 11877    <search_score name="star" value="0"/>
 11878    <search_score name="homologyscore" value="19.68"/>
 11879    <search_score name="expect" value="33.47"/>
 11880    </search_hit>
 11881    </search_result>
 11882    </spectrum_query>
 11883    <spectrum_query spectrum="E:UserDataEdEFcolon_DTAsEFcolon.13194.13194.3" start_scan="13194" end_scan="13194" precursor_neutral_mass="945.6147" assumed_charge="3" index="1001" retention_time_sec="4923.100000">
 11884    <search_result>
 11885    <search_hit hit_rank="1" peptide="VAKRPFTK" peptide_prev_aa="K" peptide_next_aa="S" protein="REV29126213" num_tot_proteins="1" num_matched_ions="2" tot_num_ions="28" calc_neutral_pep_mass="945.5759" massdiff="+0.0388" num_tol_term="2" num_missed_cleavages="1" is_rejected="0" protein_descr="DNA segment, Chr 19, Brigham & Women's Genetics 1357 expressed [Mus musculus]REVERSED">
 11886    <search_score name="ionscore" value="1.52"/>
 11887    <search_score name="identityscore" value="20.83"/>
 11888    <search_score name="star" value="0"/>
 11889    <search_score name="homologyscore" value="14.52"/>
 11890    <search_score name="expect" value="4.26"/>
 11891    </search_hit>
 11892    </search_result>
 11893    </spectrum_query>
 11894    <spectrum_query spectrum="E:UserDataEdEFcolon_DTAsEFcolon.1735.1735.3" start_scan="1735" end_scan="1735" precursor_neutral_mass="945.6147" assumed_charge="3" index="1002" retention_time_sec="859.619000">
 11895    <search_result>
 11896    <search_hit hit_rank="1" peptide="VAVNFKLR" peptide_prev_aa="K" peptide_next_aa="E" protein="REV6678165" num_tot_proteins="2" num_matched_ions="2" tot_num_ions="28" calc_neutral_pep_mass="945.5759" massdiff="+0.0388" num_tol_term="2" num_missed_cleavages="1" is_rejected="0" protein_descr="testis-specific serine kinase 1 [Mus musculus]REVERSED">
 11897    <search_score name="ionscore" value="5.99"/>
 11898    <search_score name="identityscore" value="20.83"/>
 11899    <search_score name="star" value="0"/>
 11900    <search_score name="homologyscore" value="17.55"/>
 11901    <search_score name="expect" value="1.52"/>
 11902    </search_hit>
 11903    </search_result>
 11904    </spectrum_query>
 11905    <spectrum_query spectrum="E:UserDataEdEFcolon_DTAsEFcolon.24651.24651.3" start_scan="24651" end_scan="24651" precursor_neutral_mass="945.6147" assumed_charge="3" index="1003" retention_time_sec="8999.270000">
 11906    <search_result>
 11907    <search_hit hit_rank="1" peptide="HLLSVLHK" peptide_prev_aa="R" peptide_next_aa="I" protein="51708403" num_tot_proteins="1" num_matched_ions="4" tot_num_ions="28" calc_neutral_pep_mass="945.5760" massdiff="+0.0388" num_tol_term="2" num_missed_cleavages="0" is_rejected="0" protein_descr="PREDICTED: hypothetical protein LOC71824 [Mus musculus]">
 11908    <search_score name="ionscore" value="1.19"/>
 11909    <search_score name="identityscore" value="20.83"/>
 11910    <search_score name="star" value="0"/>
 11911    <search_score name="homologyscore" value="13.71"/>
 11912    <search_score name="expect" value="4.60"/>
 11913    </search_hit>
 11914    </search_result>
 11915    </spectrum_query>
 11916    <spectrum_query spectrum="E:UserDataEdEFcolon_DTAsEFcolon.40.40.3" start_scan="40" end_scan="40" precursor_neutral_mass="945.6147" assumed_charge="3" index="1004" retention_time_sec="12.406800">
 11917    <search_result>
 11918    <search_hit hit_rank="1" peptide="VAVNFKLR" peptide_prev_aa="K" peptide_next_aa="E" protein="REV6678165" num_tot_proteins="2" num_matched_ions="4" tot_num_ions="28" calc_neutral_pep_mass="945.5759" massdiff="+0.0388" num_tol_term="2" num_missed_cleavages="1" is_rejected="0" protein_descr="testis-specific serine kinase 1 [Mus musculus]REVERSED">
 11919    <search_score name="ionscore" value="10.04"/>
 11920    <search_score name="identityscore" value="20.83"/>
 11921    <search_score name="star" value="0"/>
 11922    <search_score name="homologyscore" value="21.30"/>
 11923    <search_score name="expect" value="0.60"/>
 11924    </search_hit>
 11925    </search_result>
 11926    </spectrum_query>
 11927    <spectrum_query spectrum="E:UserDataEdEFcolon_DTAsEFcolon.1478.1478.3" start_scan="1478" end_scan="1478" precursor_neutral_mass="945.6157" assumed_charge="3" index="1005" retention_time_sec="738.715000">
 11928    <search_result>
 11929    <search_hit hit_rank="1" peptide="VAVNFKLR" peptide_prev_aa="K" peptide_next_aa="E" protein="REV6678165" num_tot_proteins="2" num_matched_ions="2" tot_num_ions="28" calc_neutral_pep_mass="945.5759" massdiff="+0.0398" num_tol_term="2" num_missed_cleavages="1" is_rejected="0" protein_descr="testis-specific serine kinase 1 [Mus musculus]REVERSED">
 11930    <search_score name="ionscore" value="0.88"/>
 11931    <search_score name="identityscore" value="19.73"/>
 11932    <search_score name="star" value="0"/>
 11933    <search_score name="homologyscore" value="13.88"/>
 11934    <search_score name="expect" value="3.84"/>
 11935    </search_hit>
 11936    </search_result>
 11937    </spectrum_query>
 11938    <spectrum_query spectrum="E:UserDataEdEFcolon_DTAsEFcolon.2280.2280.3" start_scan="2280" end_scan="2280" precursor_neutral_mass="945.6167" assumed_charge="3" index="1006" retention_time_sec="1100.970000">
 11939    <search_result>
 11940    <search_hit hit_rank="1" peptide="VAVNFKLR" peptide_prev_aa="K" peptide_next_aa="E" protein="REV6678165" num_tot_proteins="2" num_matched_ions="3" tot_num_ions="28" calc_neutral_pep_mass="945.5759" massdiff="+0.0408" num_tol_term="2" num_missed_cleavages="1" is_rejected="0" protein_descr="testis-specific serine kinase 1 [Mus musculus]REVERSED">
 11941    <search_score name="ionscore" value="0.59"/>
 11942    <search_score name="identityscore" value="19.73"/>
 11943    <search_score name="star" value="0"/>
 11944    <search_score name="homologyscore" value="13.30"/>
 11945    <search_score name="expect" value="4.10"/>
 11946    </search_hit>
 11947    </search_result>
 11948    </spectrum_query>
 11949    <spectrum_query spectrum="E:UserDataEdEFcolon_DTAsEFcolon.981.981.3" start_scan="981" end_scan="981" precursor_neutral_mass="945.6167" assumed_charge="3" index="1007" retention_time_sec="496.189000">
 11950    <search_result>
 11951    <search_hit hit_rank="1" peptide="VAVNFKLR" peptide_prev_aa="K" peptide_next_aa="E" protein="REV6678165" num_tot_proteins="2" num_matched_ions="2" tot_num_ions="28" calc_neutral_pep_mass="945.5759" massdiff="+0.0408" num_tol_term="2" num_missed_cleavages="1" is_rejected="0" protein_descr="testis-specific serine kinase 1 [Mus musculus]REVERSED">
 11952    <search_score name="ionscore" value="2.05"/>
 11953    <search_score name="identityscore" value="19.73"/>
 11954    <search_score name="star" value="0"/>
 11955    <search_score name="homologyscore" value="15.05"/>
 11956    <search_score name="expect" value="2.93"/>
 11957    </search_hit>
 11958    </search_result>
 11959    </spectrum_query>
 11960    <spectrum_query spectrum="E:UserDataEdEFcolon_DTAsEFcolon.1220.1220.3" start_scan="1220" end_scan="1220" precursor_neutral_mass="945.6167" assumed_charge="3" index="1008" retention_time_sec="617.308000">
 11961    <search_result>
 11962    <search_hit hit_rank="1" peptide="ILYKGKPK" peptide_prev_aa="R" peptide_next_aa="G" protein="REV27369491" num_tot_proteins="1" num_matched_ions="4" tot_num_ions="28" calc_neutral_pep_mass="945.6011" massdiff="+0.0156" num_tol_term="2" num_missed_cleavages="1" is_rejected="0" protein_descr="hypothetical protein LOC68169 [Mus musculus]REVERSED">
 11963    <search_score name="ionscore" value="1.55"/>
 11964    <search_score name="identityscore" value="19.73"/>
 11965    <search_score name="star" value="0"/>
 11966    <search_score name="homologyscore" value="13.64"/>
 11967    <search_score name="expect" value="3.29"/>
 11968    </search_hit>
 11969    </search_result>
 11970    </spectrum_query>
 11971    <spectrum_query spectrum="E:UserDataEdEFcolon_DTAsEFcolon.743.743.3" start_scan="743" end_scan="743" precursor_neutral_mass="945.6167" assumed_charge="3" index="1009" retention_time_sec="375.519000">
 11972    <search_result>
 11973    <search_hit hit_rank="1" peptide="VAVNFKLR" peptide_prev_aa="K" peptide_next_aa="E" protein="REV6678165" num_tot_proteins="2" num_matched_ions="2" tot_num_ions="28" calc_neutral_pep_mass="945.5759" massdiff="+0.0408" num_tol_term="2" num_missed_cleavages="1" is_rejected="0" protein_descr="testis-specific serine kinase 1 [Mus musculus]REVERSED">
 11974    <search_score name="ionscore" value="8.54"/>
 11975    <search_score name="identityscore" value="19.73"/>
 11976    <search_score name="star" value="0"/>
 11977    <search_score name="homologyscore" value="16.47"/>
 11978    <search_score name="expect" value="0.66"/>
 11979    </search_hit>
 11980    </search_result>
 11981    </spectrum_query>
 11982    <spectrum_query spectrum="E:UserDataEdEFcolon_DTAsEFcolon.25995.25995.3" start_scan="25995" end_scan="25995" precursor_neutral_mass="945.6177" assumed_charge="3" index="1010" retention_time_sec="9608.820000">
 11983    <search_result>
 11984    <search_hit hit_rank="1" peptide="VAVNFKLR" peptide_prev_aa="K" peptide_next_aa="E" protein="REV6678165" num_tot_proteins="2" num_matched_ions="5" tot_num_ions="28" calc_neutral_pep_mass="945.5759" massdiff="+0.0418" num_tol_term="2" num_missed_cleavages="1" is_rejected="0" protein_descr="testis-specific serine kinase 1 [Mus musculus]REVERSED">
 11985    <search_score name="ionscore" value="14.58"/>
 11986    <search_score name="identityscore" value="19.73"/>
 11987    <search_score name="star" value="0"/>
 11988    <search_score name="homologyscore" value="27.58"/>
 11989    <search_score name="expect" value="0.16"/>
 11990    </search_hit>
 11991    </search_result>
 11992    </spectrum_query>
 11993    <spectrum_query spectrum="E:UserDataEdEFcolon_DTAsEFcolon.267.267.3" start_scan="267" end_scan="267" precursor_neutral_mass="945.6177" assumed_charge="3" index="1011" retention_time_sec="132.873000">
 11994    <search_result>
 11995    <search_hit hit_rank="1" peptide="VAVNFKLR" peptide_prev_aa="K" peptide_next_aa="E" protein="REV6678165" num_tot_proteins="2" num_matched_ions="4" tot_num_ions="28" calc_neutral_pep_mass="945.5759" massdiff="+0.0418" num_tol_term="2" num_missed_cleavages="1" is_rejected="0" protein_descr="testis-specific serine kinase 1 [Mus musculus]REVERSED">
 11996    <search_score name="ionscore" value="3.25"/>
 11997    <search_score name="identityscore" value="19.73"/>
 11998    <search_score name="star" value="0"/>
 11999    <search_score name="homologyscore" value="16.25"/>
 12000    <search_score name="expect" value="2.22"/>
 12001    </search_hit>
 12002    </search_result>
 12003    </spectrum_query>
 12004    <spectrum_query spectrum="E:UserDataEdEFcolon_DTAsEFcolon.504.504.3" start_scan="504" end_scan="504" precursor_neutral_mass="945.6177" assumed_charge="3" index="1012" retention_time_sec="254.330000">
 12005    <search_result>
 12006    <search_hit hit_rank="1" peptide="VAVNFKLR" peptide_prev_aa="K" peptide_next_aa="E" protein="REV6678165" num_tot_proteins="2" num_matched_ions="2" tot_num_ions="28" calc_neutral_pep_mass="945.5759" massdiff="+0.0418" num_tol_term="2" num_missed_cleavages="1" is_rejected="0" protein_descr="testis-specific serine kinase 1 [Mus musculus]REVERSED">
 12007    <search_score name="ionscore" value="0.28"/>
 12008    <search_score name="identityscore" value="19.73"/>
 12009    <search_score name="star" value="0"/>
 12010    <search_score name="homologyscore" value="13.28"/>
 12011    <search_score name="expect" value="4.41"/>
 12012    </search_hit>
 12013    </search_result>
 12014    </spectrum_query>
 12015    <spectrum_query spectrum="E:UserDataEdEFcolon_DTAsEFcolon.2005.2005.3" start_scan="2005" end_scan="2005" precursor_neutral_mass="945.6187" assumed_charge="3" index="1013" retention_time_sec="980.601000">
 12016    <search_result>
 12017    <search_hit hit_rank="1" peptide="IIFAQLLK" peptide_prev_aa="-" peptide_next_aa="F" protein="REV6680554" num_tot_proteins="1" num_matched_ions="5" tot_num_ions="28" calc_neutral_pep_mass="945.5899" massdiff="+0.0289" num_tol_term="2" num_missed_cleavages="0" is_rejected="0" protein_descr="keratocan [Mus musculus]REVERSED">
 12018    <modification_info>
 12019    <mod_aminoacid_mass position="5" mass="129.042589"/>
 12020    </modification_info>
 12021    <search_score name="ionscore" value="1.41"/>
 12022    <search_score name="identityscore" value="19.73"/>
 12023    <search_score name="star" value="0"/>
 12024    <search_score name="homologyscore" value="13.78"/>
 12025    <search_score name="expect" value="3.40"/>
 12026    </search_hit>
 12027    </search_result>
 12028    </spectrum_query>
 12029    <spectrum_query spectrum="E:UserDataEdEFcolon_DTAsEFcolon.25731.25731.3" start_scan="25731" end_scan="25731" precursor_neutral_mass="945.6187" assumed_charge="3" index="1014" retention_time_sec="9488.490000">
 12030    <search_result>
 12031    <search_hit hit_rank="1" peptide="VIAFQLKK" peptide_prev_aa="K" peptide_next_aa="D" protein="63504796" num_tot_proteins="2" num_matched_ions="4" tot_num_ions="28" calc_neutral_pep_mass="945.6011" massdiff="+0.0176" num_tol_term="2" num_missed_cleavages="0" is_rejected="0" protein_descr="PREDICTED: similar to integrin, alpha 10 precursor [Mus musculus]">
 12032    <search_score name="ionscore" value="2.19"/>
 12033    <search_score name="identityscore" value="19.73"/>
 12034    <search_score name="star" value="0"/>
 12035    <search_score name="homologyscore" value="14.52"/>
 12036    <search_score name="expect" value="2.84"/>
 12037    </search_hit>
 12038    </search_result>
 12039    </spectrum_query>
 12040    <spectrum_query spectrum="E:UserDataEdEFcolon_DTAsEFcolon.25191.25191.3" start_scan="25191" end_scan="25191" precursor_neutral_mass="945.6207" assumed_charge="3" index="1015" retention_time_sec="9244.910000">
 12041    <search_result>
 12042    <search_hit hit_rank="1" peptide="VAVNFKLR" peptide_prev_aa="K" peptide_next_aa="E" protein="REV6678165" num_tot_proteins="2" num_matched_ions="2" tot_num_ions="28" calc_neutral_pep_mass="945.5759" massdiff="+0.0448" num_tol_term="2" num_missed_cleavages="1" is_rejected="0" protein_descr="testis-specific serine kinase 1 [Mus musculus]REVERSED">
 12043    <search_score name="ionscore" value="1.26"/>
 12044    <search_score name="identityscore" value="19.73"/>
 12045    <search_score name="star" value="0"/>
 12046    <search_score name="homologyscore" value="14.26"/>
 12047    <search_score name="expect" value="3.52"/>
 12048    </search_hit>
 12049    </search_result>
 12050    </spectrum_query>
 12051    <spectrum_query spectrum="E:UserDataEdEFcolon_DTAsEFcolon.4821.4821.2" start_scan="4821" end_scan="4821" precursor_neutral_mass="946.5197" assumed_charge="2" index="1016" retention_time_sec="2008.030000">
 12052    <search_result>
 12053    <search_hit hit_rank="1" peptide="AMELQSIR" peptide_prev_aa="R" peptide_next_aa="A" protein="REV6754750" num_tot_proteins="1" num_matched_ions="4" tot_num_ions="14" calc_neutral_pep_mass="946.4906" massdiff="+0.0292" num_tol_term="2" num_missed_cleavages="0" is_rejected="0" protein_descr="moesin [Mus musculus]REVERSED">
 12054    <search_score name="ionscore" value="8.27"/>
 12055    <search_score name="identityscore" value="33.98"/>
 12056    <search_score name="star" value="0"/>
 12057    <search_score name="homologyscore" value="18.84"/>
 12058    <search_score name="expect" value="18.62"/>
 12059    </search_hit>
 12060    </search_result>
 12061    </spectrum_query>
 12062    <spectrum_query spectrum="E:UserDataEdEFcolon_DTAsEFcolon.6811.6811.2" start_scan="6811" end_scan="6811" precursor_neutral_mass="948.4677" assumed_charge="2" index="1017" retention_time_sec="2666.700000">
 12063    <search_result>
 12064    <search_hit hit_rank="1" peptide="VNLGVGAYR" peptide_prev_aa="K" peptide_next_aa="T" protein="6754034" num_tot_proteins="1" num_matched_ions="3" tot_num_ions="16" calc_neutral_pep_mass="948.5028" massdiff="-0.0351" num_tol_term="2" num_missed_cleavages="0" is_rejected="0" protein_descr="glutamate oxaloacetate transaminase 1, soluble [Mus musculus]">
 12065    <modification_info>
 12066    <mod_aminoacid_mass position="2" mass="115.026939"/>
 12067    </modification_info>
 12068    <search_score name="ionscore" value="10.08"/>
 12069    <search_score name="identityscore" value="35.05"/>
 12070    <search_score name="star" value="0"/>
 12071    <search_score name="homologyscore" value="20.04"/>
 12072    <search_score name="expect" value="15.70"/>
 12073    </search_hit>
 12074    </search_result>
 12075    </spectrum_query>
 12076    <spectrum_query spectrum="E:UserDataEdEFcolon_DTAsEFcolon.4032.4032.2" start_scan="4032" end_scan="4032" precursor_neutral_mass="948.4987" assumed_charge="2" index="1018" retention_time_sec="1745.440000">
 12077    <search_result>
 12078    <search_hit hit_rank="1" peptide="HNGVVPVVK" peptide_prev_aa="K" peptide_next_aa="E" protein="63624633" num_tot_proteins="2" num_matched_ions="3" tot_num_ions="16" calc_neutral_pep_mass="948.5392" massdiff="-0.0405" num_tol_term="2" num_missed_cleavages="0" is_rejected="0" protein_descr="PREDICTED: similar to neurexin III [Mus musculus]">
 12079    <modification_info>
 12080    <mod_aminoacid_mass position="2" mass="115.026939"/>
 12081    </modification_info>
 12082    <search_score name="ionscore" value="4.42"/>
 12083    <search_score name="identityscore" value="34.46"/>
 12084    <search_score name="star" value="0"/>
 12085    <search_score name="homologyscore" value="16.38"/>
 12086    <search_score name="expect" value="50.51"/>
 12087    </search_hit>
 12088    </search_result>
 12089    </spectrum_query>
 12090    <spectrum_query spectrum="E:UserDataEdEFcolon_DTAsEFcolon.15789.15789.2" start_scan="15789" end_scan="15789" precursor_neutral_mass="948.5287" assumed_charge="2" index="1019" retention_time_sec="5841.170000">
 12091    <search_result>
 12092    <search_hit hit_rank="1" peptide="DFIAAVWK" peptide_prev_aa="K" peptide_next_aa="L" protein="REV63561913" num_tot_proteins="2" num_matched_ions="4" tot_num_ions="14" calc_neutral_pep_mass="948.5069" massdiff="+0.0219" num_tol_term="2" num_missed_cleavages="0" is_rejected="0" protein_descr="PREDICTED: similar to very large inducible GTPase-1 [Mus musculus]REVERSED">
 12093    <search_score name="ionscore" value="12.80"/>
 12094    <search_score name="identityscore" value="32.43"/>
 12095    <search_score name="star" value="0"/>
 12096    <search_score name="homologyscore" value="24.23"/>
 12097    <search_score name="expect" value="4.59"/>
 12098    </search_hit>
 12099    </search_result>
 12100    </spectrum_query>
 
jeckels responded:  2008-12-11 09:53
Hi Yury,

I think that the problem is that your pepXML has an unencoded ampersand in an XML attribute on line 11885. (See http://www.w3.org/TR/xhtml1/#C_12 for details on this.) This is almost certainly a bug in Mascot2XML. You should be able to fix this by replacing the "&" with "&amp;" in the pepXML file and then importing that edited file. If there are other protein names later in the file that also have an ampersand, they'll need to be fixed as well. It's possible that you'll also need to encode the apostrophe on that line as "&apos;".
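
If the file turns out to contain many such ampersands, the replacement could be scripted rather than done by hand. The following is only a sketch, assuming the file is plain text that can be read as a string; the function name is hypothetical. It leaves anything that already looks like an entity or character reference (such as "&amp;" or "&#38;") untouched, so running it twice does no harm. Note that XML allows a literal apostrophe inside a double-quoted attribute value, so the apostrophe is left as-is here.

```python
import re

# An "&" that does not already start a named entity (&amp;) or a
# numeric character reference (&#38; or &#x26;).
BARE_AMP = re.compile(r"&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#x[0-9A-Fa-f]+);)")


def escape_bare_ampersands(text: str) -> str:
    """Replace every unencoded '&' in text with '&amp;'."""
    return BARE_AMP.sub("&amp;", text)


# Attribute taken from line 11885 of the listing above.
line = ('protein_descr="DNA segment, Chr 19, Brigham & Women\'s '
        'Genetics 1357 expressed [Mus musculus]REVERSED"')
fixed = escape_bare_ampersands(line)
print(fixed)
```

To apply it to a whole file, read the text, pass it through `escape_bare_ampersands`, and write the result to a new file, keeping the original as a backup.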

Assuming that fixes the problem, we'd need to work with the TPP developers to fix Mascot2XML so you could avoid the manual fixup.

Thanks,
Josh
 
ybukhman responded:  2008-12-11 10:54
Hi Josh,

after I fix the xml file, how do I upload it into CPAS?

Also, this is not the first ampersand in the file. Another one occurs much earlier:
  8570 <search_hit hit_rank="1" peptide="VINRSYK" peptide_prev_aa="K" peptide_next_aa="A" protein="REV29126213" num_tot_proteins="1" num_matched_ions="3" tot_num_ions="12" calc_neutral_pep_mass="878.4974" massdiff="-0.0306" num_tol_term="2" num_missed_cleavages="1" is_rejected="0" protein_descr="DNA segment, Chr 19, Brigham & Women's Genetics 1357 expressed [Mus musculus]REVERSED">


Thanks.

Yury
 
jeckels responded:  2008-12-11 11:12
Hi Yury,

After you edit the file, you should be able to browse to it using the Data Pipeline's Process and Import Data button, just as you did when you tried to import the .dat file directly. The pepXML file should show up with an Import button next to it.

Thanks,
Josh
 
ybukhman responded:  2008-12-11 13:56
Hi Josh,

after fixing the ampersands, I also had to remove angle brackets from lines like this:
[root@ybdesk MS1]# diff F010946.pep.xml--back2 F010946.pep.xml
53211c53211
< <search_hit hit_rank="1" peptide="DVLGSAASGARLSPSR" peptide_prev_aa="K" peptide_next_aa="T" protein="6753238" num_tot_proteins="1" num_matched_ions="4" tot_num_ions="30" calc_neutral_pep_mass="1542.8114" massdiff="-0.0286" num_tol_term="2" num_missed_cleavages="1" is_rejected="0" protein_descr="Ca<2+>dependent activator protein for secretion [Mus musculus]">
---
> <search_hit hit_rank="1" peptide="DVLGSAASGARLSPSR" peptide_prev_aa="K" peptide_next_aa="T" protein="6753238" num_tot_proteins="1" num_matched_ions="4" tot_num_ions="30" calc_neutral_pep_mass="1542.8114" massdiff="-0.0286" num_tol_term="2" num_missed_cleavages="1" is_rejected="0" protein_descr="Ca2+ dependent activator protein for secretion [Mus musculus]">
120429c120429
..........
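
An alternative to deleting the angle brackets is to encode them, which keeps the original protein description intact. XML does not allow a literal "<" inside an attribute value, so it has to be written as "&lt;" (a literal ">" is allowed, but encoding it as "&gt;" is harmless). The sketch below is an illustration only: the function name is made up, and it assumes every attribute value is double-quoted, sits on a single line, and contains no embedded double quotes, which appears to hold for the lines shown in this thread.

```python
import re

# A double-quoted attribute value: =" ... "
ATTR_VALUE = re.compile(r'(=")([^"]*)(")')


def escape_angle_brackets_in_attrs(line: str) -> str:
    """Encode '<' and '>' that appear inside double-quoted attribute values."""
    def fix(match):
        value = match.group(2).replace("<", "&lt;").replace(">", "&gt;")
        return match.group(1) + value + match.group(3)
    return ATTR_VALUE.sub(fix, line)


# Shortened version of the search_hit line from the diff above.
line = ('<search_hit hit_rank="1" protein="6753238" '
        'protein_descr="Ca<2+>dependent activator protein for secretion [Mus musculus]">')
print(escape_angle_brackets_in_attrs(line))
```

The brackets that delimit the tag itself are outside any quoted value, so they are not touched.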

Now the upload seemed to proceed to completion, but the run still did not appear in the "MS2 Runs" web part of the Dashboard. When I tried to import it again, nothing happened, and the log said this:
11 Dec 2008 15:52:51,031 INFO : F010946.pep.xml has already been imported so it does not need to be imported again

Thanks.

Yury
 
jeckels responded:  2008-12-16 17:56
Hi Yury,

Glad you got the file to import. To see the run, on your folder's portal page, choose the "MS2 Runs" web part from the list of web parts at the bottom of the page and click on "Add Web Part". The default list of runs only shows runs that are wrapped in experimental annotations, and runs imported from existing files are not wrapped that way.

I know this is confusing, but I'm happy to report that the next version fixes this problem and will wrap imported runs so that they show up in the enhanced list.

Thanks,
Josh