Import large proteomic assay to labkey

LabKey Support Forum (Inactive)

Import large proteomic assay to labkey johann pellet 2020-05-12 09:48

Status: Active

Hi all,

I am trying to upload a large proteomic NMR dataset (9600 columns x 240 rows) into LabKey 19.3.
Below is a subset of the matrix:

Sample_ID	9.99950027	9.99849987	9.99750042
  L-1025-01	3547.56219015	4502.33293817	3747.21499051
  L-1025-02	-918.88494389	1544.06141934	-553.62238202

The input matrix consists of N rows of samples with M columns of bin intensities.

Beforehand, I created the sample set "NMR samples" (a subset is shown below):

Name	Volume	Unit
L-1025-01	130.0	uL
L-1025-02 	130.0	uL

When I try to import the assay, I choose the assay type General and, in the Results Fields, delete all the default fields before importing my matrix into LabKey.

This method does not work because LabKey cannot import a table with more than 1600 columns. See the extract from labkey.log below:

ERROR BaseApiAction            2020-05-12 13:02:16,710      ajp-nio-8009-exec-5 : ApiAction exception:
org.springframework.dao.DataAccessResourceFailureException: SqlExecutor.execute(); SQL []; ERROR: tables can have at most 1600 columns; nested exception is org.postgresql.util.PSQLException: ERROR: tables can have at most 1600 columns

So I transposed my matrix like this (240 columns x 9600 rows):

Feature_ID	L-1025-01	L-1025-02
9.99950027	3547.56219	-918.8849439
9.99849987	4502.332938	1544.061419
9.99750042	3747.214991	-553.622382

The assay was created in LabKey, but only after a very long time and an internal server error (I should check the Tomcat and/or Apache configuration).
But now I don't see how I can copy this assay to my study so that it matches the sample set I created earlier. When I click Copy to Study, Participant IDs and Visit IDs are required for all rows. The problem is that each row is not a Sample_ID but a feature. See the attached images.
How could I manage this kind of array in LabKey? Should I use another method?
Thank you for your help.

Regards,
Johann

Annotation 2020-05-12 184227.png

chetc (LabKey Support) responded:	2020-05-26 16:13
Status: Closed
Hello Johann,

Generally the answer for modeling this kind of data in a relational DB is to make it a long skinny table. Instead of a row per sample and a column per feature, or a row per feature and a column per sample, you could make it one row per sample/feature combination (2.3 million rows). What was the internal server error you got?

Thanks,
Chet
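
For illustration, the reshaping described in this reply might be sketched as follows. This is only a minimal sketch using the Python standard library, not a LabKey API: the function name `wide_to_long` and the output column names `Feature_ID` and `Intensity` are assumptions made for the example, and the input is the three-column subset posted above. With the full matrix, 240 samples x 9600 features gives 2,304,000 rows.

```python
import csv
import io


def wide_to_long(wide_tsv, id_column="Sample_ID"):
    """Return one (sample_id, feature_id, intensity) tuple per sample/feature pair.

    wide_tsv is tab-separated text with one row per sample and one column
    per feature (the layout of the original matrix in the question).
    """
    reader = csv.reader(io.StringIO(wide_tsv), delimiter="\t")
    header = [name.strip() for name in next(reader)]
    id_index = header.index(id_column)
    long_rows = []
    for row in reader:
        if not row:
            continue  # skip blank lines
        sample_id = row[id_index].strip()
        for i, value in enumerate(row):
            if i == id_index:
                continue
            long_rows.append((sample_id, header[i], float(value)))
    return long_rows


# Subset of the matrix from the question (tab-separated).
wide = (
    "Sample_ID\t9.99950027\t9.99849987\t9.99750042\n"
    "L-1025-01\t3547.56219015\t4502.33293817\t3747.21499051\n"
    "L-1025-02\t-918.88494389\t1544.06141934\t-553.62238202\n"
)

rows = wide_to_long(wide)

# Write the long table as TSV; the column names here are hypothetical.
out = io.StringIO()
writer = csv.writer(out, delimiter="\t", lineterminator="\n")
writer.writerow(["Sample_ID", "Feature_ID", "Intensity"])
writer.writerows(rows)
long_tsv = out.getvalue()
```

In this layout every row carries a Sample_ID, so each result row can be related back to a sample, and the table has only three columns regardless of how many features there are.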