Limit to insertRows query

LabKey Support Forum
Limit to insertRows query WayneH  2018-06-20 07:01
Status: Closed
 

Hello all,

We employ a script on our LabKey server v16.3 (Prof+ edition) that inserts a data array of 10k+ entries via the insertRows API call, and it is failing. It has no issues with smaller chunks of data, so it appears we have crossed a size limit. A fix is fairly obvious, but we wanted to confirm with you what those limits are, whether chunking the data is the right solution, or whether there is an alternate approach you could recommend.

thanks,
WayneH

 
 
Jon (LabKey DevOps) responded:  2018-06-29 12:43
Hi Wayne,

Yes, it is recommended that you batch your data into smaller chunks once you're working with that many rows.

In fact, if you've ever done a study import that had a few thousand rows of participants in it, you'll see that the pipeline log indicates the rows being copied over in batches, typically of 1000 rows.
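
For example, if your script happens to use the Rlabkey client (the same chunking idea applies to the other client APIs), a rough sketch of batched inserts might look like the following; the server URL, folder path, schema, query, and column names are all placeholders you would substitute with your own:

library(Rlabkey)

# Placeholder connection details -- substitute your own server and folder
baseUrl    <- "https://your.labkey.server/labkey"
folderPath <- "/YourProject/YourFolder"

# 'allRows' stands in for the full data set you want to insert
# (hypothetical columns; use your own data frame here)
allRows <- data.frame(SampleId = seq_len(10000), Value = rnorm(10000))

# Break the data frame into chunks of 1000 rows and insert one chunk at a time
chunkSize <- 1000
chunks <- split(allRows, ceiling(seq_len(nrow(allRows)) / chunkSize))

for (chunk in chunks) {
    labkey.insertRows(baseUrl    = baseUrl,
                      folderPath = folderPath,
                      schemaName = "lists",    # placeholder schema
                      queryName  = "MyList",   # placeholder query
                      toInsert   = chunk)
}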

Regards,

Jon
 
Jon (LabKey DevOps) responded:  2018-06-29 12:51
Additionally, depending on the API you're using, you could also use the importRows call:

https://cran.r-project.org/web/packages/Rlabkey/Rlabkey.pdf
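
If your script is R-based, the Rlabkey equivalent is labkey.importRows, which performs a bulk import and takes the same kind of data frame. A minimal sketch, assuming 'allRows' is the data frame you want to load and the connection values are placeholders:

# Bulk import via labkey.importRows -- substitute your own server, folder,
# schema, and query names
labkey.importRows(baseUrl    = "https://your.labkey.server/labkey",
                  folderPath = "/YourProject/YourFolder",
                  schemaName = "lists",
                  queryName  = "MyList",
                  toImport   = allRows)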
 
WayneH responded:  2018-07-18 11:48
Hey Jon..

Needed to follow up with you guys. We explored this a little and batched the jobs using the insertRows call, but still found a hard limit around 9900 rows. Unfortunately we have a process that rewrites the data using an external db (per requirements), which is a permanent repository, so we have to work with what we are given if possible.
Not sure where this hard limit is set or how we can work around it.
Any thoughts?

Thanks,

Wayne H
 
Jon (LabKey DevOps) responded:  2018-07-19 13:11
Hi Wayne,

We're going to need more information about your problem here.

You said there is a "hard limit around 9900 rows". How did you determine this? Were there any exception messages or errors that came up when you reached that threshold? Did the server hang or do anything else unusual that impacted performance? Was there anything within your labkey.log file that corresponds with this hard limit you described?

The only suggestion we can offer at this point is to raise your JVM heap size on the server (assuming it's a memory issue), but we really can't recommend any kind of workaround without more information.

Regards,

Jon
 
WayneH responded:  2018-07-23 08:24
Hi Jon,

We arrived at this number by deduction, honestly. We used chunks of decreasing size to understand why we were having issues pushing data into the table via the API, running chunk sizes of 2000, 1000, 500, 100, and 10 rows:
 2000-row chunks stopped at 8000 rows inserted.
 1000-row chunks stopped at 9000 inserted.
 500-row chunks stopped at 9500 inserted.
 100-row chunks stopped at 9800 inserted.
 10-row chunks stopped at 9890.

It seemed reasonable to assume we were narrowing in on a roughly 9900-row limit, which I think we can all agree makes little sense, but we aren't sure why this is so.

Took a look at the labkey.log file, and it indicated: "java.sql.SQLException: Internal error: An expression services limit has been reached. Please look for potentially complex expressions in your query, and try to simplify them."

The attached file contains that error report...

Thanks,

WH
 
WayneH responded:  2018-07-23 08:25
Hit 'submit' too quickly..
Here is the file...
 
Jon (LabKey DevOps) responded:  2018-07-27 14:18
Hi Wayne,

That SQLException error is a limitation within Microsoft SQL Server and will come up if you have too many identifiers and constants in your query. MS SQL has a limit of 65,535 identifiers and constants.

More details: https://support.microsoft.com/en-us/help/913050/error-message-when-you-run-a-query-in-sql-server-2005-internal-error-a

The only way to get around this is to work in smaller batches in your queries. Looking at the SQL that was within your log, it looks like you're doing a lot of CASE statements, which explains why you're hitting this limit; if each inserted row contributes roughly a half-dozen identifiers and constants to that generated SQL, a ceiling of around 9,900 rows (65,535 / ~6.6) would line up with what you observed.

Regards,

Jon
 
WayneH responded:  2018-08-09 07:10
Hey Jon,

We need to discuss this with all the teams on our end, but what still confuses me is that we assumed the CASE statements were being generated by LabKey's API. It's really unclear to us why those were being generated. So just to confirm: you believe this is a SQL Server issue, not a LabKey one?

Thanks,

WH
 
Jon (LabKey DevOps) responded:  2018-08-23 13:45
Hi Wayne,

It is a SQL Server issue, but it is a LabKey issue as well. Either way the workaround is the same: use smaller batches until we can revise our code to work better within SQL Server's own limitations.

Regards,

Jon