Separating Database from Webserver?

Separating Database from Webserver? olnerdybastid  2018-02-08 16:59
Status: Closed
 
Background: We are in the planning stages of building a platform for managing clinical and genetic data that will contain sensitive PHI and are considering LabKey for our project.

Having worked through the LabKey tutorials, I’m familiar with the safeguards in place for restricting access to PHI. However, for security reasons we are not sure that we want PHI to reside on a server that is exposed to the internet. If I’ve understood the docs correctly, storing our PHI in an external database would limit how we can query it, since joins and lookups across internal and external data sources are not supported, and defining the core LabKey Server schemas as external schemas seems to be strongly discouraged.

In light of all this, I’m wondering what options we’d have for using the front-end capabilities, and hopefully the schema designs, already implemented in LabKey while keeping our database separate from the web server. How have other developers approached this?
 
 
Jon (LabKey DevOps) responded:  2018-02-13 14:07
Hello,

Sounds like a significant undertaking! I know that Jason reached out to you directly to see how we can work something out with you more broadly, and I hope you get back to him, since LabKey is sophisticated enough to handle a lot of data analysis.

Regarding your question: if you were to set up an External Schema, you would still technically be exposing that data on the internet if the server is public-facing. Separately, the LabKey database does not have to reside on the same box as the webserver if you choose to keep things separated.
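
To illustrate the point about keeping the database on a different box, here is a minimal Python sketch that inspects a JDBC connection URL and reports whether it points at a host other than the webserver itself. It assumes a PostgreSQL-style URL of the form jdbc:postgresql://host:port/database; the hostname db.internal.example and both function names are hypothetical, and this is not a LabKey tool, just a quick sanity check you could adapt.

```python
from urllib.parse import urlsplit

# Hosts that mean "the database runs on the same box as the webserver".
LOCAL_HOSTS = {"localhost", "127.0.0.1", "::1"}


def database_host(jdbc_url: str) -> str:
    """Return the host part of a URL such as jdbc:postgresql://host:5432/db."""
    if not jdbc_url.startswith("jdbc:"):
        raise ValueError("not a JDBC URL: " + jdbc_url)
    # Drop the "jdbc:" prefix so the rest parses like an ordinary URL.
    host = urlsplit(jdbc_url[len("jdbc:"):]).hostname
    if host is None:
        raise ValueError("no host found in: " + jdbc_url)
    return host


def is_separate_database(jdbc_url: str) -> bool:
    """True when the URL names a host other than the local machine."""
    return database_host(jdbc_url) not in LOCAL_HOSTS


# Hypothetical example values, not taken from any real deployment.
REMOTE_URL = "jdbc:postgresql://db.internal.example:5432/labkey"
LOCAL_URL = "jdbc:postgresql://localhost:5432/labkey"
```

Note that this only looks at the hostname in the URL. It says nothing about whether the database host is actually firewalled off from the internet, which still has to be handled at the network level.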

The way to protect your data on a public-facing instance of LabKey is to use some of our security measures in conjunction with whatever measures you have already implemented on your own network.

- Set up a Master Encryption Key so that sensitive properties stored in the database are encrypted: https://www.labkey.org/Documentation/wiki-page.view?name=cpasxml#encrypt
- Set up your webserver to run on SSL: https://www.labkey.org/Documentation/wiki-page.view?name=configTomcat#7
- Configure your groups so that only certain individuals have Admin rights, or any rights at all, to any given container: https://www.labkey.org/Documentation/wiki-page.view?name=security
- Make sure CSRF checking is enabled for all POST requests in the Site Settings of the LabKey instance: https://www.labkey.org/Documentation/wiki-page.view?name=csrfProtection
- Limit which containers on your LabKey instance are accessible to guest users. If the Guests group isn't granted access to any container, then no one can access the LabKey instance without a username and password. So even with a public-facing LabKey instance, no data would be exposed without actual credentials to access the site.
- Add a robots.txt file to your Tomcat server to prevent it from being crawled by search engines that respect the robots.txt file: https://www.labkey.org/Documentation/wiki-page.view?name=robots
- Disable the option for self-signup, preventing people from creating their own user accounts: https://www.labkey.org/Documentation/wiki-page.view?name=authenticationModule#sso
- Enable authentication using LDAP, SAML, or some other SSO option so that only valid users can gain access to the LabKey instance: https://www.labkey.org/Documentation/wiki-page.view?name=authenticationModule
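
As a small illustration of the robots.txt item in the list above, here is a Python sketch that uses the standard library's robots.txt parser to confirm that a policy blocks crawlers before you deploy it. The blanket "Disallow: /" policy, the helper name, and the URL labkey.example.org are assumptions for the example; the LabKey documentation linked above may recommend a narrower policy.

```python
from urllib.robotparser import RobotFileParser

# Assumed policy: ask every crawler to stay away from the whole site.
ROBOTS_TXT = "User-agent: *\nDisallow: /\n"

# Hypothetical URL on a public-facing LabKey server.
SAMPLE_URL = "https://labkey.example.org/labkey/project/home/begin.view"


def crawler_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print(crawler_allowed(ROBOTS_TXT, "Googlebot", SAMPLE_URL))
```

Keep in mind that robots.txt is only a request to well-behaved crawlers. It is not an access control, so the guest, self-signup, and authentication settings above are what actually keep data private.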

You can also make the server accessible only from within your network/VPN, but then it would no longer be public.

Additionally, we have a slide deck on Best Practices for LabKey Administrators, presented at our 2016 LabKey User Conference (https://www.labkey.org/Documentation/wiki-page.view?name=lkuc2016), that might be useful to you:

https://labkey.org/Documentation/wiki-download.view?entityId=32d70f27-ed56-1034-b734-fe851e088836&name=LKUC2016_Admin_Workshop_Slides.pdf

Regards,

Jon