In case of errors or other problems when installing and running LabKey Server, first review the installation basics and options linked in the topic Install LabKey. This topic provides additional troubleshooting suggestions if the instructions in that topic do not resolve the issue.
LabKey Logs
From time to time Administrators may need to review the logs to troubleshoot system startup or other issues. The logs are located in the <LABKEY_HOME>/logs directory.
The logs of interest are:
- labkey.log: Once the application starts, logging of INFO, ERROR, WARN and other messages is recorded here.
- You can jump to times the server started up by searching the log for the string (__) which is part of the LabKey Server ASCII art startup banner.
- labkey-errors.log: Contains ERROR messages.
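As a minimal sketch of the banner search described above, the following Python helper lists the line numbers in a log where the "(__)" fragment appears. The function names are hypothetical, and it assumes the log is readable as text.

```python
from pathlib import Path

BANNER_MARKER = "(__)"  # fragment of the ASCII art startup banner


def find_startups(lines):
    """Return 1-based line numbers of lines containing the banner marker."""
    return [n for n, line in enumerate(lines, start=1) if BANNER_MARKER in line]


def find_startups_in_file(log_path):
    """Scan a log file on disk; undecodable bytes are replaced, not fatal."""
    with Path(log_path).open(encoding="utf-8", errors="replace") as handle:
        return find_startups(handle)
```

Calling find_startups_in_file on your labkey.log path should give you the positions to jump to when reviewing each startup.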
Log Rotation and Storage
To avoid massive log files and retain recent information with some context, log files are rotated (or "rolled over") into new files to keep them manageable. These rotated log files are named with a trailing number and shuffled such that .1 is the most recently rolled-over log file, with higher numbers for older logs of that type.
labkey.log: Rotated when the file reaches 10MB. Creates labkey.log.1, labkey.log.2, etc. Up to 7 archived files (80MB total).
labkey-errors.log: Rotated on server startup or at 100MB. Up to 3 labkey-errors.log files are rotated (labkey-errors.log.1, etc.). If the server is about to delete the first labkey-errors file from the current session (meaning it has generated hundreds of megabytes of errors since it started up), it will retain that first log file as labkey-errors-YYYY-MM-DD.log. This can be useful in determining a root cause of the many errors.
To save additional disk space, older log archives can be compressed for storage. A script to regularly store and compress them can be helpful.
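One possible shape for such a script is sketched below in Python. It assumes the rotated files have already been copied to a separate archive directory (a hypothetical location of your choosing), so that the server's own rotation in <LABKEY_HOME>/logs is not disturbed. The function name is illustrative only.

```python
import gzip
import re
import shutil
from pathlib import Path

# Matches rotated files such as labkey.log.1 or labkey-errors.log.2,
# but not the active labkey.log or files that are already gzipped.
ROTATED = re.compile(r"^labkey(-errors)?\.log\.\d+$")


def compress_rotated_logs(archive_dir):
    """Gzip each rotated log in archive_dir and remove the uncompressed copy."""
    compressed = []
    for path in sorted(Path(archive_dir).iterdir()):
        if path.is_file() and ROTATED.match(path.name):
            target = path.with_name(path.name + ".gz")
            with path.open("rb") as src, gzip.open(target, "wb") as dst:
                shutil.copyfileobj(src, dst)
            path.unlink()
            compressed.append(target.name)
    return compressed
```

A scheduler such as cron could run this periodically against the archive directory.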
Service Startup
LabKey Server is typically started and stopped with a service on either Linux or Windows. The methods differ for each platform, both for how to configure the service and how to run it.
For an error message like "The specified service already exists.", you may need to manually stop or even delete a failing or misconfigured service.
Here are some specific things to check.
Manually Create labkey-tmp Location First
The <LABKEY_HOME>/labkey-tmp directory must be created before you start your service, and the service user must have write permission there. If it does not exist, you may see an error similar to:
Caused by: org.springframework.boot.web.server.WebServerException: Unable to create tempDir. java.io.tmpdir is set to /usr/local/labkey/labkey/labkey-tmp
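A quick pre-flight check for this condition might look like the following Python sketch. The function name is hypothetical, and note that the write test reflects the user running the script, so it is only meaningful when run as the service user.

```python
import os
from pathlib import Path


def check_tmp_dir(labkey_home):
    """Return a list of problems found with <labkey_home>/labkey-tmp."""
    tmp_dir = Path(labkey_home) / "labkey-tmp"
    if not tmp_dir.is_dir():
        return [f"{tmp_dir} does not exist; create it before starting the service"]
    # os.access answers for the user running this script, so run it as
    # the service user to get a meaningful result.
    if not os.access(tmp_dir, os.W_OK | os.X_OK):
        return [f"{tmp_dir} is not writable by the current user"]
    return []
```

An empty list means the directory exists and the current user can write to it.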
Linux Service Troubleshooting
On Linux, if the service fails to start, the OS will likely tell you to review the output of:
systemctl status SERVICENAME.service
usually:
systemctl status labkey_server.service
...and also the logged info from:
Remember that any time you edit your service file, you need to reload the service daemon:
sudo systemctl daemon-reload
Privileged Ports
Ports 1-1023 are "privileged" on Linux. If you are using one of these ports, such as 443 for HTTPS or 80 for HTTP, you may see an error like the following during startup or upgrade, seeming to shut the service down as it is running.
WARN ServerApplicationContext [DATE TIME] main : Exception encountered during context initialization - cancelling refresh attempt: org.springframework.context.ApplicationContextException: Failed to start bean 'webServerStartStop'
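To see whether a given port can actually be bound by a particular user, a small probe like the Python sketch below may help. The helper names are hypothetical. It distinguishes a permission problem (typical of privileged ports) from a port that another process already holds, and should be run as the service user to reflect what the service will experience.

```python
import errno
import socket


def is_privileged(port):
    """Ports 1-1023 are privileged on Linux."""
    return 1 <= port <= 1023


def probe_port(port, host="127.0.0.1"):
    """Try to bind host:port and report 'free', 'in use', or 'permission denied'."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except PermissionError:
            return "permission denied"
        except OSError as exc:
            if exc.errno == errno.EADDRINUSE:
                return "in use"
            raise
    return "free"
```

Run the probe only while the server is stopped; otherwise the server itself will account for an "in use" result.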
Grant the service access to privileged ports as described in this topic:
Other Processes Running
If you see errors indicating the service is being "shut down mid-startup" and it is not related to using privileged ports, it may indicate that something else is running and shutting the server or service down. Check for other webapps running, other instances of tomcat, etc. For example, check the output of:
Different Java Versions
The service will run as the 'labkey' user. When you are investigating, you may be 'root' or another user. Check that the java version is correct for the service user. For example, check that the expected java version is returned by:
Start Outside the Service
Another troubleshooting pathway for problems starting the service is to attempt to run the server outside the service. To do so, navigate to <LABKEY_HOME> and run an abbreviated version of the ExecStart line. Make sure to use your full path to Java. For example:
/labkey/apps/jdk-17/bin/java -Xms2G -Xmx2G -jar labkeyServer.jar
This method will attempt to unpack the jar and start the server using a minimal set of flags. You will not want to (or be able to) successfully use your server without the key flags and other parameters in the service file, but checking whether it can start this way may provide insight into service-file problems. This method also runs as 'root' instead of the 'labkey' user, so if (for example) the server will start successfully outside the service, but fails within it, the problem may be permissions related.
Windows Service Troubleshooting
The Windows service writes a log to <LABKEY_HOME>/logs/commons-daemon.*.log. If your service is not working as expected, check this log for further guidance.
Stop and Delete a Service
If you need to stop a running service on Windows, such as to retry, recreate the service, or if you see a message like "The specified service already exists", follow these steps:
Open Control Panel > Administrative Tools > Services. Select the service and click Stop.
To delete, run the following from the command line as an administrator:
Service Access Denied Exception
An error on service startup like the following may indicate that the labkeyServer.jar file was previously unpacked in the <LABKEY_HOME> location, possibly by a previous attempt or command-line run as a different user.
Caused by: java.nio.file.AccessDeniedException: C:\labkey\labkey\labkeywebapp
To resolve this, delete these directories that will be recreated by the service, then try the service again.
<LABKEY_HOME>/labkeywebapp
<LABKEY_HOME>/modules
<LABKEY_HOME>/logs
Version of prunsrv
If you install the service using the wrong version of prunsrv, it will successfully install but will not run. Running it will generate a "commons-daemon.*.log" file which will likely help identify this problem. Specifically, if you use the 32-bit version of prunsrv on a 64-bit machine, you may see errors like:
[2024-04-16 12:39:52] [error] ( javajni.c:300 ) [36592] Found 'C:\labkey\apps\jdk-17\bin\server\jvm.dll' but couldn't load it.
[2024-04-16 12:39:52] [error] ( javajni.c:300 ) [36592] %1 is not a valid Win32 application.
[2024-04-16 12:39:52] [debug] ( javajni.c:321 ) [36592] Invalid JVM DLL handle.
You must delete the incorrect service, and return to copy the 64-bit version of prunsrv.exe. This is in the AMD64 subdirectory of the downloaded daemon bundle. Close and reopen your command window to pick up this change, then install using the correct version:
install_service.bat install
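If you are unsure which build of prunsrv.exe you copied, the architecture can be read from the file's PE header. The Python sketch below is one way to do that. The function name is hypothetical, and only the three machine codes listed are recognized. The same check can be applied to jvm.dll to confirm that both files target the same architecture.

```python
import struct

# Machine field values from the PE/COFF header.
MACHINE_TYPES = {0x014C: "x86 (32-bit)", 0x8664: "x64 (64-bit)", 0xAA64: "ARM64"}


def pe_machine(path):
    """Return the target architecture recorded in a Windows executable's PE header."""
    with open(path, "rb") as exe:
        header = exe.read(64)
        if len(header) < 64 or header[:2] != b"MZ":
            raise ValueError("not a DOS/PE executable")
        # Offset 0x3C of the DOS header holds the position of the PE signature.
        (pe_offset,) = struct.unpack_from("<I", header, 0x3C)
        exe.seek(pe_offset)
        signature_and_machine = exe.read(6)
    if len(signature_and_machine) < 6 or signature_and_machine[:4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    (machine,) = struct.unpack_from("<H", signature_and_machine, 4)
    return MACHINE_TYPES.get(machine, f"unknown (0x{machine:04x})")
```

A mismatch between the result for prunsrv.exe and the result for jvm.dll would be consistent with the "not a valid Win32 application" error shown above.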
Service Controller
Note that you cannot successfully run the service from the command line (such as by using "prunsrv.exe //RS//labkeyServer"). Doing so may raise an error like:
Service could not connect to the service controller.
Instead, use "install_service.bat install" to install the service, then start and stop it from the Services panel. This will run the service as the correct user with the necessary permissions to connect.
Development Mode
Running a server in development mode ("devmode") may change server behavior, provides additional logging, and can be helpful in troubleshooting some issues. DO NOT run your production server in devmode; use this mode on a staging, test, or development server only.
To check whether the server is running in devmode:
- Go to (Admin) > Site > Admin Console.
- Under Diagnostics, click System Properties.
- Check the value of the devmode property.
Learn more about setting and using devmode in this topic:
labkeyUpgradeLockFile
LabKey creates a "labkeyUpgradeLockFile" in the <LABKEY_HOME> directory before starting an upgrade and deletes it after upgrade has been successful. If this file is present on server startup, that means the previous upgrade was unsuccessful.
If you see an error similar to:
Lock file /labkey/labkey/labkeyUpgradeLockFile already exists - a previous upgrade attempt may have left the server in an indeterminate state. Proceed with extreme caution as the database may not be properly upgraded. To continue, delete the file and restart Tomcat.
Take these steps:
- Stop the server, if it is running.
- Carefully review the previous logs from any upgrade attempt to determine why that upgrade failed. Look for exceptions, configuration error messages, and other indications of why the server did not successfully reach the phase where this lock file should have been deleted.
- Resolve whatever issue caused the upgrade failure.
- Restore the database from the backup taken before the previous upgrade.
- Delete the "labkeyUpgradeLockFile".
- Start the server and monitor for a successful upgrade.
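Before starting the server after a failed upgrade, it can be useful to confirm whether the lock file is still in place. The following is a minimal Python sketch with a hypothetical function name. It deliberately only reports the file and never deletes it, since deletion should follow the review steps above.

```python
from pathlib import Path

LOCK_FILE_NAME = "labkeyUpgradeLockFile"


def upgrade_lock_present(labkey_home):
    """Return True if a lock file from an earlier upgrade attempt is still present."""
    return (Path(labkey_home) / LOCK_FILE_NAME).is_file()
```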
Configuration Issues
Application.Properties
1. Review your application.properties file to confirm that all properties you expect to be in use are active. A leading # means the property is commented out and will not be applied. Also, double check that you have replaced any placeholder values in the file.
2. Check that the labkeyDataSource and any external data source properties match those expected of the current component versions.
3. Watch the <LABKEY_HOME>/logs for warning messages like "Ignoring unknown property..." or "...is being ignored". For example, older versions of Tomcat used "maxActive" and "maxWait" and newer versions use "maxTotal" and "maxWaitMillis". Warning messages like these would be logged if your data sources used the older property names:
17-Feb-2022 11:33:42.415 WARNING [main] java.util.ArrayList.forEach Name = labkeyDataSource Property maxActive is not used in DBCP2, use maxTotal instead. maxTotal default value is 8. You have set value of "20" for "maxActive" property, which is being ignored.
17-Feb-2022 11:33:42.415 WARNING [main] java.util.ArrayList.forEach Name = labkeyDataSource Property maxWait is not used in DBCP2 , use maxWaitMillis instead. maxWaitMillis default value is PT-0.001S. You have set value of "120000" for "maxWait" property, which is being ignored.
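The renamed properties mentioned in these warnings can also be found by scanning the file directly. The Python sketch below is a simplified check, not a full properties parser: it assumes "key=value" lines with "#" comments and matches on the last dot-separated segment of each key. The function name and the key prefixes used in any test data are hypothetical.

```python
# Old DBCP property names and their replacements, as named in the warnings above.
RENAMED = {"maxActive": "maxTotal", "maxWait": "maxWaitMillis"}


def find_renamed_properties(text):
    """Return (line_number, key, replacement) for active lines using an old name."""
    findings = []
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # blank, commented out, or not a key=value line
        key = line.split("=", 1)[0].strip()
        last_segment = key.rsplit(".", 1)[-1]
        if last_segment in RENAMED:
            findings.append((number, key, RENAMED[last_segment]))
    return findings
```

Pass in the contents of your application.properties file; each finding names the line, the active key, and the property name to use instead.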
Compatible Component Versions
Confirm that you are using the supported versions of the required components, as detailed in the Supported Technologies Roadmap. It is possible to have multiple versions of some software, like Java, installed at the same time. Check that LabKey and other applications in use are configured to use the correct versions.
For example, if you see an error similar to "...this version of the Java Runtime only recognizes class file versions up to ##.0..." it likely means you are using an unsupported version of the JDK.
Connection Pool Size
If your server becomes unresponsive, it could be due to the depletion of available connections to the database. Watch for a Connection Pool Size of 8, which is the Tomcat connection pool default size and insufficient for a production server. To see the connection pool size for the LabKey data source, select (Admin) > Site > Admin Console and check the setting of Connection Pool Size on the Server Information panel. The connection pool size for every data source is also logged at server startup.
To set the connection pool size, edit your application.properties file and change the "maxTotal" setting for your LabKey data source to at least 20. Depending on the number of simultaneous users and the complexity of their requests, your deployment may require a larger connection pool size. LabKey uses a default setting of 50.
You should also consider changing this setting for external data sources to match the usage you expect. Learn more in this topic: External Schemas and Data Sources.
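To review the "maxTotal" values across all data sources at once, a small scan such as the Python sketch below could be used. It assumes simple "key=value" lines and treats any key whose last segment is "maxTotal" as a pool size. The function name is hypothetical, and the threshold of 20 comes from the guidance above.

```python
MINIMUM_POOL_SIZE = 20  # the floor suggested above for the LabKey data source


def small_pools(text, minimum=MINIMUM_POOL_SIZE):
    """Return {key: value} for active maxTotal settings below the minimum."""
    flagged = {}
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("#") or "=" not in line:
            continue  # skip comments and anything that is not key=value
        key, value = (part.strip() for part in line.split("=", 1))
        if key.rsplit(".", 1)[-1] == "maxTotal" and value.isdigit():
            if int(value) < minimum:
                flagged[key] = int(value)
    return flagged
```

Note that a data source with no "maxTotal" line at all will not be flagged by this scan, so the Admin Console value remains the authoritative check.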
Increase HTTP Header Size Limit
Tomcat sets a size limit for the HTTP headers. If a request includes a header that exceeds this length, you will see errors like "Request Header is too large" in the response and server logs. As an example, a GET request with many parameters in the URL may exceed the default header size limit.
To increase the allowable header size, edit your application.properties file, adding this property set to a suitably high value:
server.max-http-request-header-size=65536
Deleted or Deactivated Users
Over time, you may have personnel leave the organization. Such user accounts should be deactivated and not deleted. If you encounter problems with accessing linked schemas, external schemas, running ETLs, or similar, check the logs to see if the former user may have "owned" these long term resources.
Conflicting Applications
If you have problems during installation, retry after shutting down all other running applications. Specifically, you may need to try temporarily shutting down any virus scanning application, internet security applications, or other applications that run in the background to see if this resolves the issue.
Filesystem Permissions
In order for files to be uploaded and logs to be written, the LabKey user account (i.e. the account that runs the service, often named 'labkey') must have the ability to write to the underlying file system locations.
For example, if the "Users" group on a Windows installation has read but not write access to the site file root, an error message like this will be shown upon file upload:
Couldn't create file on server. This may be a server configuration problem. Contact the site administrator.
Browser Refresh for UI Issues
If menus, tabs, or other UI features appear to display incorrectly after upgrade, particularly if different browsers show different layouts, you may need to clear your browser cache to clear old stylesheets.
Diagnostics
Want to know which version of LabKey Server is running?
- Find your version number at (Admin) > Site > Admin Console.
- At the top of the Server Information panel, you will see the release version.
Learn more and find more diagnostics to check in this topic:
HTTPS (TLS/SSL) Errors
Use Correct Property
Enabling HTTPS is done by including "server.ssl.enabled=true" in the application properties. If you see the following error, you may have tried to set the nonexistent "ssl" property instead.
19-Feb-2022 15:28:29.624 INFO [main] java.util.ArrayList.forEach Name = labkeyDataSource Ignoring unknown property: value of "true" for "ssl" property
ERR_CONNECTION_CLOSED during Startup
An error like "ERR_CONNECTION_CLOSED" during startup, particularly after upgrading, may indicate that necessary files are missing. Check that you have the right configuration in application.properties, particularly the HTTPS directory containing the keys. Check the logs for "SEVERE" to find any messages that might look like:
19-Feb-2022 15:28:13.312 SEVERE [main] org.apache.catalina.util.LifecycleBase.handleSubClassException Failed to initialize component [Connector[HTTP/1.1-8443]]
org.apache.catalina.LifecycleException: Protocol handler initialization failed
...
org.apache.catalina.LifecycleException: The configured protocol [org.apache.coyote.http11.Http11AprProtocol] requires the APR/native library which is not available
Disable sslRequired
If you configure "SSL Required" and then cannot access your server, you will not be able to disable this requirement in the user interface. To resolve this, you can directly reset this property in your database.
Confirm that this query will select the "sslRequired" property row in your database:
SELECT * FROM prop.Properties WHERE Name = 'sslRequired' AND Set = (SELECT Set FROM prop.PropertySets WHERE Category = 'SiteConfig')
You should see a single row with a name of "sslRequired" and a value of "true". To change that value to false, run this update query:
UPDATE prop.Properties SET Value = FALSE WHERE Name = 'sslRequired' AND Set = (SELECT Set FROM prop.PropertySets WHERE Category = 'SiteConfig')
You can then verify with the SELECT query above to confirm the update.
Once the value is changed you'll need to restart the server to force the new value into the cache.
If you are using SQL Server, you'll need to double quote every reference to "Set" because that database considers Set a keyword.
HTTPS Handshake Issues
When accessing the server through APIs, including RStudio, rcurl, Jupyter, etc., one or more errors similar to these may be seen, either in a client call or from command-line access:
SEC_E_ILLEGAL_MESSAGE (0x80090326) - This error usually occurs when a fatal SSL/TLS alert is received (e.g. handshake failed)
or
Peer reports incompatible or unsupported protocol version.
or
Timeout was reached: [SERVER:PORT] Operation timed out after 10000 milliseconds with 0 out of 0 bytes received
This may indicate that your server is set to use a more recent sslProtocol (such as TLSv1.3) than your client tool(s).
Client programs like RStudio, curl, and Jupyter may not have been updated to use the newer TLSv1.3 protocol which has timing differences from the earlier TLSv1.2 protocol. Check to see which protocol version your server is set to accept. To cover most cases, edit the application.properties section for HTTPS to accept both TLSv1.2 and TLSv1.3, and make the default TLSv1.2. These lines apply those settings:
server.ssl.enabled=true
server.ssl.enabled-protocols=TLSv1.3,TLSv1.2
server.ssl.protocol=TLSv1.2
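To check from the client side which protocol versions the server will negotiate, a probe along these lines may help. This is a sketch using Python's standard ssl module, with hypothetical function names. It verifies certificates by default, so a server with a self-signed or otherwise untrusted certificate will report False for reasons unrelated to the protocol version.

```python
import socket
import ssl


def pinned_context(version):
    """Build a client context that will only negotiate the given TLS version."""
    context = ssl.create_default_context()
    context.minimum_version = version
    context.maximum_version = version
    return context


def server_accepts(host, port, version, timeout=10):
    """Return True if a handshake restricted to `version` succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as raw:
            with pinned_context(version).wrap_socket(raw, server_hostname=host) as tls:
                return tls.version() is not None
    except (ssl.SSLError, OSError):
        # Covers handshake failures, certificate errors, refusals, and timeouts.
        return False
```

For example, calling server_accepts with ssl.TLSVersion.TLSv1_2 and then with ssl.TLSVersion.TLSv1_3 against your server's host and HTTPS port shows which of the two the server is willing to use.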
Site Stuck in Maintenance Mode
In some circumstances, you may get into a state where even a site administrator cannot save changes to the Site Settings of a server using HTTPS. For example, if you are in "Admin-only mode (only site admins may log in)" and try to turn that mode off, allowing other users access, you could see an error in the UI like this:
Bad response code, 401 when connecting to the SSL port over HTTPS
This could indicate that there is a redirect between the ports configured for listening for HTTPS traffic, and when the server (in admin-only mode) tries to validate the redirect, it cannot because the admin pages are not accessible. To resolve this, uncheck the box for "Require SSL Connections". This will allow you to save changes to the page, then when the server is no longer in admin-only mode, you will be able to recheck to require SSL (HTTPS) connections if desired.
Note that if you have not configured your server to accept both HTTP and HTTPS traffic, you do not need to check that box requiring HTTPS connections, as the server will never actually receive the HTTP traffic.
Load Balancers
HTTPS Conflicts with Load Balancer or Proxy Use
Note that if you are using a Load Balancer or Proxy "in front of" LabKey, also enabling HTTPS may create conflicts as the proxy is also trying to handle the HTTPS traffic. This is not always problematic, as we have had success using Amazon's Application Load Balancer, but we have observed conflicts with versions from Apache, Nginx, and HA-Proxy. Review also the section about WebSockets below.
If you configure HTTPS and then cannot access the server, you may need to disable it in the database directly in order to resolve the issue. Learn more above.
WebSockets
LabKey Server uses WebSockets to push notifications to browsers. This includes a dialog telling the user they have been logged out (either due to an active logout request, or a session timeout). As WebSockets use the same port as HTTP and HTTPS, there is no additional configuration required if you are using Tomcat directly or embedded in LabKey Server. Note that while this is recommended for all LabKey Server installations, it is required for full functionality of the Sample Manager and Biologics products.
If you have a load balancer or proxy in front of LabKey/Tomcat, like Apache or NGINX, be sure that it is configured in a way that supports WebSockets. LabKey Server uses a "/_websocket" URL prefix for these connections.
When WebSockets are not configured, or improperly configured, you will see a variety of errors related to the inability to connect and failure to trigger notifications and alerts. LabKey Server administrators will see a banner reading "The WebSocket connection failed. LabKey Server uses WebSockets to send notifications and alert users when their session ends."
To confirm that WebSockets are working correctly, you can try:
- Use your browser's development tools to check the Network tab for websocket requests.
- Use the /query-apiTest.view? API Test Page to post to core-DisplayWarnings.api in a known location (such as http://localhost:8080/Tutorials/core-DisplayWarnings.api). Look for warnings related to websockets. If you see success:true then there are no warnings.
IP Addresses May Shift
If you are using a load balancer, including on a server in the LabKey cloud, be aware that AWS may shift over time what specific IP address your server is "on". Note that this includes LabKey's support site, www.labkey.org. This makes it difficult to answer questions like "What IP address am I using?" with any long term reliability.
In order to maintain a stable set of allowlist traffic sources/destinations, it is more reliable to pin these to the domain (i.e. labkey.org) rather than to the IP address, which may change within the AWS pool.
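To observe this behavior for yourself, you can resolve a hostname at different times and compare the results. The Python sketch below uses a hypothetical function name and returns the set of addresses a name resolves to at the moment it is called.

```python
import socket


def resolve_ips(hostname):
    """Return the sorted set of IP addresses the hostname currently resolves to."""
    infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})
```

Differences between runs made hours or days apart illustrate why allowlists keyed to a specific IP address can go stale.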
Database Issues
PostgreSQL Installation
You may need to remove references to Cygwin from your Windows system path before installing LabKey, due to conflicts with the PostgreSQL installer. The PostgreSQL installer also conflicts with some antivirus or firewalls (see http://wiki.postgresql.org/wiki/Running_%26_Installing_PostgreSQL_On_Native_Windows for more information).
Detect Attempt to Connect to In-Use Database
If you attempt to start the server with a database that is already in use by another LabKey Server as primary data source, serious complications may arise, including data corruption. For example, this might arise if a developer attempts to start a new development machine without changing the reference from a copied configuration file.
To prevent this situation, the server will check for existing application connections at startup. If there is a collision, the server will not start and the administrator will see a message similar to this:
There is 1 other connection to database [DATABASE_NAME] with the application name [APPLICATION_NAME]! This likely means another LabKey Server is already using this database.
If the check for this situation is unsuccessful for any reason, you will instead see the message:
Attempt to detect other LabKey Server instances using this database failed.
For either error message, confirm that each LabKey Server will start up on a unique database in order to proceed. Learn more in the internal issue description (account required).
Restart Installation from Scratch
If you have encountered prior failed installations, don't have any stored data you need to keep, and want to clean up and start completely from scratch, the following process may be useful:
- Delete the LabKey service (if it seems to be causing the failure).
- Uninstall PostgreSQL using its uninstaller.
- Delete the entire LabKey installation directory.
- Install LabKey again from the original binaries.
Graphviz
If you encounter the following error, or a similar error mentioning Graphviz:
Unable to display graph view: cannot run dot due to an error.
Cannot run program "dot" (in directory "./temp/ExperimentRunGraphs"): CreateProcess error=2, The system cannot find the file specified.
...you need to install it. Learn more at these links:
Further Support
Users of Premium Editions of LabKey Server can obtain support with installation and other issues by opening a ticket on their private support portal. Your Account Manager will be happy to help resolve the problem.
All users can search for issues resolved through community support in the LabKey Support Forum.
If you don't see your issue listed in the community support forum, you can post a new question.
Supporting Materials
If the install seems successful, it is often helpful to submit debugging logs for diagnosis.
If the install failed to complete, please include the zipped <LABKEY_HOME>/logs content, your application.properties file (with permissions redacted) and the service file you're using to start the server.
PostgreSQL logs its installation process separately. If PostgreSQL installation/upgrade fails, please locate and include the PostgreSQL install logs as well. Instructions for locating PostgreSQL install logs can be found here: