microarray data: lists

microarray data: lists alexander karpikov  2013-12-02 16:49
Status: Closed
 
Hi,
I am trying to set up a microarray pipeline using LabKey.
I created a list in LabKey containing 28 columns and 45,000 rows so that I can run an R script in LabKey using data from this list.
But when I try to use this list, it becomes very slow and is not practical to work with.
Do you have any advice on this?
Best,
Sasha.
 
 
jeckels responded:  2013-12-02 17:29
Hi Sasha,

First, if you're not already on LabKey Server 13.3, please consider upgrading. Version 13.2 includes a significant change to the way that we store lists in the database that increases performance. Version 13.3 adds some incremental improvements.

Is this normalized expression data that you're working with, with one row per gene/probe, and one column per sample? If so, we have a prototype module that might be of interest to you:

https://www.labkey.org/wiki/home/Documentation/page.view?name=geoMicroarrayTutorial

It should provide significantly better performance than importing the data into a list or general type assay.

I'd be happy to send you a copy of the module for version 13.3 if you're interested in giving it a try.

Thanks,
Josh
 
alexander karpikov responded:  2013-12-02 17:47
Hi Josh,
Thanks, that would be great!
I am currently working with non-normalized data. I wanted to start with non-normalized data, run an R script to create a second list of normalized data, and then analyze the normalized data with a second R script.
But I can start with normalized data instead.
I am currently using LabKey 13.2, but I can upgrade to 13.3. Should I also upgrade Tomcat and Java?
Please send me your module. I will be glad to test it and use it.
Best,
Sasha.
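
For readers following the two-step workflow described above (raw values, then a normalized copy, then analysis), here is a minimal sketch of the normalization step. The thread does not say which normalization method or script is used, so quantile normalization is only an assumed example, and it is written in plain Python rather than R to keep it self-contained. The function name quantile_normalize and the sample values are hypothetical.

```python
from statistics import mean


def quantile_normalize(matrix):
    """Quantile-normalize a probes x samples matrix given as a list of rows.

    Assumption: ties within a sample are broken by row order, which is a
    simplification compared with tie-averaging implementations.
    """
    n_rows = len(matrix)
    n_cols = len(matrix[0])
    # Column-wise view: one list of values per sample.
    columns = [[row[j] for row in matrix] for j in range(n_cols)]
    # For each sample, the row indices ordered by increasing value.
    orders = [sorted(range(n_rows), key=col.__getitem__) for col in columns]
    # Reference distribution: mean of the k-th smallest value across samples.
    reference = [
        mean(columns[j][orders[j][k]] for j in range(n_cols))
        for k in range(n_rows)
    ]
    # Replace each value with the reference value at its rank.
    result = [[0.0] * n_cols for _ in range(n_rows)]
    for j in range(n_cols):
        for rank, i in enumerate(orders[j]):
            result[i][j] = reference[rank]
    return result


# Hypothetical raw intensities: 4 probes (rows) x 3 samples (columns).
raw = [
    [5.0, 4.0, 3.0],
    [2.0, 1.0, 4.0],
    [3.0, 4.0, 6.0],
    [4.0, 2.0, 8.0],
]
normalized = quantile_normalize(raw)
```

After this step every sample column holds the same set of values, only assigned to different probes, which is the property a downstream comparison between samples relies on.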
 
jeckels responded:  2013-12-03 11:40
Hi Sasha,

Yes, upgrading Tomcat and Java are both good ideas. I'm not sure what versions you have installed currently, but I'd recommend upgrading one component at a time, verifying that everything is working as expected, and then upgrading the next. A reasonable sequence might be:

1. Java
2. LabKey Server
3. Tomcat

Note that 13.3 is the first version of LabKey Server to work with Tomcat 7, so 13.2 won't work with it.

I'm attaching the prototype module, which will work with 13.3 (but not 13.2). Please feel free to post questions or problems here on the support forum.

Thanks,
Josh
 
jeckels responded:  2013-12-03 11:41
I should add that the module doesn't care whether your data is normalized or not. In the workflows where it's been used so far, the values happen to be normalized already.

Thanks,
Josh
 
alexander karpikov responded:  2013-12-03 11:57
Thank you Josh!
Do you by any chance have a sample data set that I can use to test this module?
Sasha
 
jeckels responded:  2013-12-03 16:48
Hi Sasha,

Unfortunately, the only data files I have at the moment contain confidential customer information. Please let me know if you have trouble generating any of the files in particular.

Thanks,
Josh