Critical Error when Upgrading to Labkey 15.2

LabKey Support Forum (Inactive)
Critical Error when Upgrading to Labkey 15.2 lewis j mcgibbney  2015-08-24 15:05
Status: Closed
 
Hi Folks,
We are experiencing critical errors when attempting to upgrade Labkey server as follows.

* Java 1.7.X --> 1.8.X
* Labkey 14.4 --> 15.2
* PostgreSQL 8.4 --> 9.4
* Apache Tomcat 7.0.X

Once the DB has been backed up, we are attempting to run the upgrade script on CentOS release 6.6 (Final) as follows

./manual-upgrade.sh -l /data/local/LabKey14.3-35337.10-bin -d /root/LabKey15.2-39071.18-bin -c /opt/apache-tomcat-7.0.57 -u tomcat7

I am having issues with the upgrade. I worked on it for a while but can't get past the following errors after the upgrade (errors from labkey.org in the tomcat/logs folder):

ERROR ModuleLoader 2015-08-21 09:12:36,898 localhost-startStop-1 : Failure occurred during ModuleLoader init.
javax.servlet.ServletException: java.lang.IllegalArgumentException: DA0000091
        at org.labkey.api.module.ModuleLoader.upgradeCoreModule(ModuleLoader.java:1018)
        at org.labkey.api.module.ModuleLoader.doInit(ModuleLoader.java:329)
        at org.labkey.api.module.ModuleLoader.init(ModuleLoader.java:230)
        at org.apache.catalina.core.ApplicationFilterConfig.initFilter(ApplicationFilterConfig.java:279)
        at org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:260)
        at org.apache.catalina.core.ApplicationFilterConfig.<init>(ApplicationFilterConfig.java:105)
        at org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:4830)
        at org.apache.catalina.core.StandardContext.startInternal(StandardContext.java:5510)
        at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:150)
        at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:901)
        at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:877)
        at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:649)
        at org.apache.catalina.startup.HostConfig.deployDescriptor(HostConfig.java:672)
        at org.apache.catalina.startup.HostConfig$DeployDescriptor.run(HostConfig.java:1859)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.IllegalArgumentException: DA0000091
        at org.labkey.api.util.GUID.<init>(GUID.java:277)
        at org.labkey.api.data.Container.<init>(Container.java:128)
        at org.labkey.api.data.ContainerManager$ContainerFactory.handle(ContainerManager.java:2443)
        at org.labkey.api.data.ContainerManager$ContainerFactory.handleArrayList(ContainerManager.java:2456)
        at org.labkey.api.data.BaseSelector$ArrayListResultSetHandler.handle(BaseSelector.java:146)
        at org.labkey.api.data.BaseSelector$1.handle(BaseSelector.java:167)
        at org.labkey.api.data.BaseSelector$1.handle(BaseSelector.java:163)
        at org.labkey.api.data.BaseSelector.handleResultSet(BaseSelector.java:256)
        at org.labkey.api.data.BaseSelector.getObject(BaseSelector.java:163)
        at org.labkey.api.data.BaseSelector.getObject(BaseSelector.java:158)
        at org.labkey.api.data.ContainerManager.getForPath(ContainerManager.java:897)
        at org.labkey.api.data.ContainerManager.getForPath(ContainerManager.java:882)
        at org.labkey.api.data.ContainerManager.getRoot(ContainerManager.java:936)
        at org.labkey.api.settings.AppProps.getWriteableInstance(AppProps.java:48)
        at org.labkey.core.CoreModule.afterUpdate(CoreModule.java:498)
        at org.labkey.api.module.ModuleUpgrader.upgrade(ModuleUpgrader.java:60)
        at org.labkey.api.module.ModuleLoader.upgradeCoreModule(ModuleLoader.java:1009)

If I restart tomcat I then see the following errors:


org.labkey.api.data.SqlScriptRunner$SqlScriptException: core-14.30-14.31.sql : SqlExecutor.execute(); bad SQL grammar []; nested exception is org.postgresql.util.PSQLException: ERROR: function "bulkimport" already exists with same argument types
       at org.labkey.api.data.SqlScriptManager.runScript(SqlScriptManager.java:208)
       at org.labkey.api.data.SqlScriptRunner.runScripts(SqlScriptRunner.java:82)
       at org.labkey.api.module.DefaultModule.versionUpdate(DefaultModule.java:366)
       at org.labkey.api.module.ModuleUpgrader.upgrade(ModuleUpgrader.java:59)
       at org.labkey.api.module.ModuleLoader.upgradeCoreModule(ModuleLoader.java:1009)
       at org.labkey.api.module.ModuleLoader.doInit(ModuleLoader.java:329)
       at org.labkey.api.module.ModuleLoader.init(ModuleLoader.java:230)
       at org.apache.catalina.core.ApplicationFilterConfig.initFilter(ApplicationFilterConfig.java:279)
       at org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:260)
       at org.apache.catalina.core.ApplicationFilterConfig.<init>(ApplicationFilterConfig.java:105)
       at org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:4830)
       at org.apache.catalina.core.StandardContext.startInternal(StandardContext.java:5510)
       at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:150)
       at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:901)
       at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:877)
       at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:649)
       at org.apache.catalina.startup.HostConfig.deployDescriptor(HostConfig.java:672)
       at org.apache.catalina.startup.HostConfig$DeployDescriptor.run(HostConfig.java:1859)
       at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
       at java.util.concurrent.FutureTask.run(FutureTask.java:262)
       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
       at java.lang.Thread.run(Thread.java:745)
Caused by: org.springframework.jdbc.BadSqlGrammarException: SqlExecutor.execute(); bad SQL grammar []; nested exception is org.postgresql.util.PSQLException: ERROR: function "bulkimport" already exists with same argument types
       at org.springframework.jdbc.support.SQLStateSQLExceptionTranslator.doTranslate(SQLStateSQLExceptionTranslator.java:97)
       at org.springframework.jdbc.support.AbstractFallbackSQLExceptionTranslator.translate(AbstractFallbackSQLExceptionTranslator.java:73)
       at org.springframework.jdbc.support.AbstractFallbackSQLExceptionTranslator.translate(AbstractFallbackSQLExceptionTranslator.java:81)
       at org.springframework.jdbc.support.AbstractFallbackSQLExceptionTranslator.translate(AbstractFallbackSQLExceptionTranslator.java:81)
       at org.labkey.api.data.ExceptionFramework$1.translate(ExceptionFramework.java:37)
       at org.labkey.api.data.ExceptionFramework$1.translate(ExceptionFramework.java:31)
       at org.labkey.api.data.SqlExecutor.execute(SqlExecutor.java:104)
       at org.labkey.api.data.SqlExecutor.execute(SqlExecutor.java:73)
       at org.labkey.api.data.SqlExecutor.execute(SqlExecutor.java:68)
       at org.labkey.api.data.SqlScriptExecutor$Block.execute(SqlScriptExecutor.java:201)
       at org.labkey.api.data.SqlScriptExecutor.execute(SqlScriptExecutor.java:96)
       at org.labkey.api.data.dialect.SqlDialect.runSql(SqlDialect.java:597)
       at org.labkey.api.data.SqlScriptManager.runScript(SqlScriptManager.java:203)
       ... 22 more
Caused by: org.postgresql.util.PSQLException: ERROR: function "bulkimport" already exists with same argument types
       at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2270)
       at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1998)
       at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:255)
       at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:570)
       at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:406)
       at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:398)
       at org.apache.tomcat.dbcp.dbcp.DelegatingStatement.execute(DelegatingStatement.java:264)
       at org.apache.tomcat.dbcp.dbcp.DelegatingStatement.execute(DelegatingStatement.java:264)
       at org.labkey.api.data.dialect.StatementWrapper.execute(StatementWrapper.java:1908)
       at org.labkey.api.data.SqlExecutor$NormalStatementExecutor.execute(SqlExecutor.java:131)
       at org.labkey.api.data.SqlExecutor$NormalStatementExecutor.execute(SqlExecutor.java:119)
       at org.labkey.api.data.SqlExecutor.execute(SqlExecutor.java:97)

Any ideas what is going wrong here please? It is crucial that we are able to use the data already present within the database.
Thank you in advance.
Lewis
 
 
jeckels responded:  2015-08-24 18:43
Hi Lewis,

It looks like there may be suspicious values in the core.containers table. Can you try running this query directly against Postgres and share the results here?

SELECT * FROM core.containers WHERE Parent IS NULL

Thanks,
Josh
 
lewis j mcgibbney responded:  2015-08-25 08:42
Hi Josh,
Yes will paste here very shortly.
Thanks for reply.
Lewis
 
lewis j mcgibbney responded:  2015-08-25 13:08
Hi Josh,
OK, this query returns nothing. Nothing at all.
Lewis
 
Jon (LabKey DevOps) responded:  2015-08-25 13:19
Hi Lewis,

That's really odd. Usually you get at least one entry. Can you re-run that same SQL query directly on your database but remove the WHERE clause after it so it's just:

SELECT * FROM core.containers

Just want to make sure you actually are getting something.

Regards,

Jon
 
dennisw responded:  2015-08-25 13:24
Could this be related to the core-14.30-14.31.sql (or other relevant SQL script) having run 'CREATE FUNCTION core.bulkImport' already? We've had a few issues over the years that involved the SQL upgrade scripts trying to create something that was already there and had similar looking errors. We use SQL Server, but core.bulkImport lives in Programmability->Stored Procedures. I'm assuming that \df in psql would show whether the core.bulkImport SP already exists.
 
adam responded:  2015-08-25 14:27
The underlying problem appears to be a bad GUID in the Container table (see the first exception). That problem left the upgrade versioning in a bad state, which caused it to re-run SQL it had already run. So, bulkImport is a symptom, not the cause.
 
lewis j mcgibbney responded:  2015-08-25 14:54
@Jon,

labkey=# SELECT * FROM core.containers
labkey-#

Nothing
 
lewis j mcgibbney responded:  2015-08-25 14:59
Any ideas as to a solution here? This is blocking us from upgrading to 15.2 and a real issue right now. Thanks again in advance for any pointers.
 
Jon (LabKey DevOps) responded:  2015-08-25 15:10
Hi Lewis,

I think your database might be problematic here. Any chance you can restore your backup that you made from earlier? I've checked older instances of LabKey on Postgres and running a SELECT * FROM core.containers or the original query should have given you something.

Can you restore your database and then run the original query Josh mentioned?

Regards,

Jon
 
lewis j mcgibbney responded:  2015-08-25 16:17
Hi Jon,

We restored our original database and typed the same commands:

postgres=# \c labkey
You are now connected to database "labkey" as user "postgres".
labkey=# SELECT * FROM core.containers WHERE Parent IS NULL
labkey-# SELECT * FROM core.containers
labkey-#

Nothing...
 
Jon (LabKey DevOps) responded:  2015-08-25 16:23
Hi Lewis,

Are you certain you're connected to the right database? Do you have more than one database for LabKey?

Can you run a \l while running psql to bring up your list of databases? Does your older instance of LabKey 14.3 still come up when you run it on the old DB?

There should have been something that appeared when running at least SELECT * FROM core.containers.

Also, are you making sure that you end your query with a semi-colon before hitting enter?

Regards,

Jon
 
Brian Connolly responded:  2015-08-25 16:30
Lewis,

From you last message, it looks like you are missing the semicolon(;) from the end of the SQL statements.

For example, the SQL command that you should be running is

     SELECT * FROM core.containers WHERE Parent IS NULL;

and

     SELECT * FROM core.containers;


Can you try the command again, but this time with the semicolon and see if they return any rows.


-Brian
 
lewis j mcgibbney responded:  2015-08-25 17:01
Thanks guys.

labkey=# SELECT * FROM core.containers WHERE Parent IS NULL;
            _ts | rowid | entityid | createdby | created | parent | name | sortorder | s
earchable |
         description
                   | title | type
----------------------------+-------+--------------------------------------+-----------+-------------------------+--------+------------------------------------------------------------------+-----------+--
----------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-------------------+------------------------------------------------------------------+--------
 2015-02-10 02:46:37.194 | 1 | f2fca3f1-92fc-1032-9519-51af32c47241 | | 2015-02-10 02:46:37.194 | | | 0 | t
          |

                   | | normal
 2015-06-30 22:57:56.935152 | 9 | DA0000091 | | 2015-06-30 00:00:00 | | CC0016663 In Vivo Toxicology-MTD in Cynomolgus Monkeys (ET-1947) | 0 | t
          | CC0016663: A Single Intravenous Escalating Dose Study to Determine the Maximum Tolerated Dose in Cynomolgus Monkeys (ET-1947) Experimental Design: One male and one female cynomolgus monkey was
 administered vehicle (5% NMP/65% PEG-400/5% ethanol/12.5% 2-HP?CD/12.5% saline) or CC0016663 (Lot S00L05) via intravenous injection, as a single dose followed by a minimum of 5-day observation period bef
ore the next dose. | CC0016663 In Vivo Toxicology-MTD in Cynomolgus Monkeys (ET-1947) |
(2 rows)

Doing a lot better now. This DA0000091 number is certainly the culprit. It doesn't look like a valid entityid. I suspect that this is just a test entry someone on the team entered.
 
lewis j mcgibbney responded:  2015-08-25 17:04
And then when I execute the less restrictive query

labkey=# SELECT * FROM core.containers;
            _ts | rowid | entityid | createdby | created | parent | name
               | sortorder | searchable |
                                       description
                                                 | title | type
----------------------------+-------+--------------------------------------+-----------+-------------------------+--------------------------------------+---------------------------------------------------
---------------+-----------+------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-------------------------------------------------+------------------------------------------------------------------+--------
 2015-02-10 02:46:37.194 | 1 | f2fca3f1-92fc-1032-9519-51af32c47241 | | 2015-02-10 02:46:37.194 | |
               | 0 | t |

                                                 | | normal
 2015-02-10 02:46:37.262 | 2 | f2fca3f2-92fc-1032-9519-51af32c47241 | | 2015-02-10 02:46:37.262 | f2fca3f1-92fc-1032-9519-51af32c47241 | home
               | 0 | t |

                                                 | | normal
 2015-02-10 02:46:37.549 | 3 | f2fca3f3-92fc-1032-9519-51af32c47241 | | 2015-02-10 02:46:37.549 | f2fca3f1-92fc-1032-9519-51af32c47241 | Shared
               | 0 | t |

                                                 | | normal
 2015-02-10 06:50:55.624 | 4 | f2fca4ae-92fc-1032-9519-51af32c47241 | | 2015-02-10 06:50:55.624 | f2fca3f2-92fc-1032-9519-51af32c47241 | support
               | 0 | t |

                                                 | | normal
 2015-02-10 06:52:45.826 | 5 | f2fca72b-92fc-1032-9519-51af32c47241 | 1003 | 2015-02-10 06:52:45.826 | f2fca3f1-92fc-1032-9519-51af32c47241 | Celgene Test
               | 0 | t |

                                                 | | normal
 2015-02-19 19:03:05.341 | 6 | 6f743c1b-9506-1032-b9bf-51af32c4d069 | 1003 | 2015-02-19 19:03:05.341 | f2fca3f1-92fc-1032-9519-51af32c47241 | HIV-CD4 Study
               | 0 | t |

                                                 | | normal
 2015-06-30 20:05:12.877 | 7 | 009863a0-f313-1032-a439-322dfdf29cd6 | 1010 | 2015-06-30 20:05:12.877 | f2fca3f1-92fc-1032-9519-51af32c47241 | Renato Lewis Test Study
               | 0 | t |

                                                 | | normal
 2015-06-30 20:06:44.951 | 8 | 009864d0-f313-1032-a439-322dfdf29cd6 | 1010 | 2015-06-30 20:06:44.951 | f2fca3f1-92fc-1032-9519-51af32c47241 | Test2
               | 0 | t |

                                                 | | normal
 2015-06-30 22:57:56.935152 | 9 | DA0000091 | | 2015-06-30 00:00:00 | | CC0016663 In Vivo Toxicology-MTD in Cynomolgus Mon
keys (ET-1947) | 0 | t | CC0016663: A Single Intravenous Escalating Dose Study to Determine the Maximum Tolerated Dose in Cynomolgus Monkeys (ET-1947) Experimental Design: One male and on
e female cynomolgus monkey was administered vehicle (5% NMP/65% PEG-400/5% ethanol/12.5% 2-HP?CD/12.5% saline) or CC0016663 (Lot S00L05) via intravenous injection, as a single dose followed by a minimum o
f 5-day observation period before the next dose. | CC0016663 In Vivo Toxicology-MTD in Cynomolgus Monkeys (ET-1947) |
 2015-06-30 23:05:14.295 | 10 | 00986ae4-f313-1032-a439-322dfdf29cd6 | 1010 | 2015-06-30 23:05:14.295 | f2fca3f1-92fc-1032-9519-51af32c47241 | TEST23
               | 0 | t |

                                                 | | normal
(10 rows)
 
adam responded:  2015-08-25 19:17
This is good... you need to delete that DA0000091 row and retry the upgrade.

Should be obvious at this point, but it's very important that no one inserts, updates, deletes, or performs any DDL on tables in the labkey schemas. If you need to, limit the permissions of most database users to read only.

Adam
 
lewis j mcgibbney responded:  2015-08-25 22:49
Hi Adam,
Deleting the offending row and attempting the upgrade seems to have done the trick.
Thank you all for contributing to this thread. Great help.
Lewis