Available in the Professional and Enterprise Editions of LabKey Server. Also available as an Add-on to the Starter Edition. Learn more or contact LabKey.
LabKey Server can integrate cloud storage for management of large data files using Amazon S3 (Simple Storage Service). Support for other storage providers will be considered in the future. For more information about this feature and possible future directions, please contact LabKey.
Currently, cloud storage services are best suited to providing an archive for large files. Some pipeline jobs can be run against files managed by cloud storage. This topic outlines the steps necessary to access data in an S3 bucket from the Files web part on your server.
Cloud Data Storage Overview
Cloud services offer the ability to upload and post large data files in the cloud. LabKey Server can interface with this data, allowing users to integrate it with other data for use by LabKey analysis tools. To use these features, you must have the cloud module installed on your LabKey Server. Contact your account manager for assistance if needed.
Cloud storage services store data in buckets. The number of buckets is typically limited per user account, but each bucket can contain an unlimited number of files. LabKey Server Cloud Storage uses a single bucket with a directory providing a pseudo-hierarchy, so that multiple structured folders can appear as a multi-bucket storage system.
Learn more about Amazon S3 Buckets here: Working with Amazon S3 Buckets
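The pseudo-hierarchy described above can be illustrated with a short sketch. An S3 bucket holds a flat set of object keys, and "directories" are simply shared key prefixes. The function name `list_directory` and the sample keys are hypothetical, chosen only for illustration; this is not LabKey's implementation.

```python
def list_directory(keys, prefix=""):
    """Return (subdirectories, files) found directly under `prefix`.

    `keys` is a flat list of object keys, as a bucket stores them.
    """
    subdirs, files = set(), []
    for key in keys:
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if not rest:
            continue  # the key is the prefix itself
        head, sep, _ = rest.partition("/")
        if sep:
            subdirs.add(head + "/")  # something deeper lives under this name
        else:
            files.append(head)       # an object directly at this level
    return sorted(subdirs), sorted(files)


# Hypothetical keys in a single bucket shared by two LabKey folders.
keys = [
    "container16/run1/sample.fcs",
    "container16/readme.txt",
    "container17/data.csv",
]

print(list_directory(keys))
print(list_directory(keys, "container16/"))
```

Listing the bucket root shows two top-level "directories", one per LabKey folder, even though the bucket itself stores only three flat keys.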
AWS Identity Credentials
The identity and credential LabKey will use to access your S3 bucket are generated by creating an AWS Identity.
On the AWS console, click "Add User", provide a user name, select Programmatic Access, create a new group, and give it AdministratorAccess. If AdministratorAccess is not possible, the detailed permissions required are listed later in this document.
At the end of the wizard, you will be given an "Access key id" and a "Secret access key". Enter these in the Identity and Credential fields when you create a Cloud Account on LabKey Server (see Create Cloud Account On LabKey Server below).
Configure LabKey Server to use Cloud Storage
Each bucket on S3 that you plan to access will be defined as a Storage Config on LabKey Server, accessed through a Cloud Account. You will then be able to select which storage config to use on a per-folder basis.
Create Bucket (on AWS)
Before you can use your Cloud Storage account from within LabKey Server, you must first create the bucket you intend to use. The user account must have "list" as well as "upload/delete" permissions on the bucket. It is possible to have multiple cloud store services per account.
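One way to confirm the "list" and "upload/delete" permissions before configuring LabKey is to try each operation once. The sketch below assumes a client object exposing the boto3 S3 client methods `list_objects_v2`, `put_object`, and `delete_object`. The `InMemoryS3` class is a hypothetical stand-in so the example runs without AWS access, and the probe key name is also an assumption.

```python
class InMemoryS3:
    """Hypothetical stand-in for the three boto3 S3 client methods used below."""

    def __init__(self, read_only=False):
        self.objects = {}
        self.read_only = read_only

    def list_objects_v2(self, Bucket, MaxKeys=1000):
        return {"KeyCount": min(len(self.objects), MaxKeys)}

    def put_object(self, Bucket, Key, Body):
        if self.read_only:
            raise PermissionError("AccessDenied: PutObject")
        self.objects[Key] = Body
        return {}

    def delete_object(self, Bucket, Key):
        if self.read_only:
            raise PermissionError("AccessDenied: DeleteObject")
        self.objects.pop(Key, None)
        return {}


def check_bucket_permissions(client, bucket, probe_key="labkey-permission-probe.txt"):
    """Attempt list, upload, and delete on `bucket`; report which calls succeeded."""
    results = {"list": False, "upload": False, "delete": False}
    try:
        client.list_objects_v2(Bucket=bucket, MaxKeys=1)
        results["list"] = True
    except Exception:
        pass
    try:
        client.put_object(Bucket=bucket, Key=probe_key, Body=b"probe")
        results["upload"] = True
    except Exception:
        pass
    if results["upload"]:
        # Delete is only attempted on the probe object this function created.
        try:
            client.delete_object(Bucket=bucket, Key=probe_key)
            results["delete"] = True
        except Exception:
            pass
    return results


print(check_bucket_permissions(InMemoryS3(), "my-bucket"))
print(check_bucket_permissions(InMemoryS3(read_only=True), "my-bucket"))
```

Against a real bucket, a client created with `boto3.client("s3", aws_access_key_id=..., aws_secret_access_key=...)` could be passed in place of the stand-in. Any result other than all three values being true suggests the AWS user needs additional permissions.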
Create Cloud Account On LabKey Server
To access the bucket, you create a cloud account on your server, providing a named way to indicate the cloud credentials to use.
- Select (Admin) > Site > Admin Console.
- Under Premium Features, click Cloud Settings.
- If you do not see this option, you do not have the cloud module installed.
- Under Cloud Accounts, click Create Account.
- Enter an Account Name. It must be unique and will represent the login information entered here.
- Select a Provider.
- Enter your Identity and Credential. See AWS Identity above.
- Click Create.
This feature uses the encrypted property store for credentials and requires an administrator to provide a master encryption key in the labkey.xml file. LabKey will refuse to store credentials if a key is not provided. For instructions, see: Installation: SMTP, Encryption, LDAP, and File Roots
Create Storage Config (on LabKey Server)
Next, define a Storage Config, effectively a file alias pointing to a bucket available to your account. LabKey can create new subfolders in that location, or if you want to use a pre-existing S3 subdirectory within your bucket, you can specify it using the S3 Path option.
- Click Create Storage Config on the cloud account settings page under Cloud Store Service.
- If you navigated away, select (Admin) > Site > Admin Console. Under Premium Features, click Cloud Settings.
- Provide a Config Name. This name must be unique and it is good practice to base it on the S3 bucket that it will access.
- Select the Account you just created from the pulldown.
- Provide the S3 Bucket name itself. Do not include "S3://" or other elements of the full URL with the bucket name in this field. Learn more about bucket naming rules here
- Select Enabled.
- If you disable a storage config by unchecking this box, it will not be deleted, but you will be unable to use it from any container until enabling it again.
- S3 Path: (Optional) You can specify a path within the S3 bucket that will be the configuration root of any LabKey folder using this configuration. This enables use of an existing folder within the S3 bucket. If no path is specified, the root is the bucket itself.
- Directory Prefix: (Optional) Select whether to create a directory named <prefix><id> in the bucket or S3 path provided for this folder. The default prefix is "container".
- If you check the Directory Prefix box (default), LabKey will automatically create a subdirectory in the configuration root (the bucket itself or the S3 path provided above) for each LabKey folder using this configuration. For example, a generated directory name would be "container16", where 16 is the id number of the LabKey folder. You can see the id number for a given folder/container by going to Folder > Management > Information, or by querying the core.Containers table through the UI or an API. You may also find the reporting in Admin Console > Files helpful, as it will let you navigate the container tree and see the S3 URLs including the containerX values. Note that using this option means that the subdirectory and its contents will be deleted if the LabKey folder is deleted.
- If you do not check the box, all LabKey folders using this configuration will share the root location and LabKey will not delete the root contents when any folder is deleted.
- Click Create.
Authorized administrators can update a defined storage config using its Edit link.
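To make the interaction of the S3 Bucket, S3 Path, and Directory Prefix settings concrete, the following sketch computes where a given LabKey folder's files would land. The function name `folder_storage_root` is hypothetical, and the bucket-name check covers only a subset of the AWS naming rules (3 to 63 characters; lowercase letters, digits, dots, and hyphens; starting and ending with a letter or digit). It models the behavior described above rather than reproducing LabKey's own code.

```python
import re

# Subset of the AWS bucket naming rules; the real rules have more cases.
_BUCKET_RE = re.compile(r"^[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]$")


def folder_storage_root(bucket, container_id, s3_path="", directory_prefix="container"):
    """Return the location inside S3 that a LabKey folder would use.

    Pass directory_prefix=None to model an unchecked Directory Prefix box,
    in which case every folder shares the configuration root.
    """
    if "://" in bucket or "/" in bucket:
        raise ValueError("Give the bare bucket name, without 's3://' or a path")
    if not _BUCKET_RE.match(bucket):
        raise ValueError(f"Not a valid-looking bucket name: {bucket!r}")
    parts = [bucket]
    path = s3_path.strip("/")
    if path:
        parts.append(path)  # configuration root is the S3 Path, if given
    if directory_prefix is not None:
        parts.append(f"{directory_prefix}{container_id}")  # e.g. container16
    return "/".join(parts)


print(folder_storage_root("my-bucket", 16))
print(folder_storage_root("my-bucket", 16, s3_path="existing/data"))
print(folder_storage_root("my-bucket", 16, s3_path="existing/data", directory_prefix=None))
```

With the prefix enabled, each folder gets its own subdirectory. With it disabled, all folders using the config resolve to the same root.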
Enable Cloud Storage in Folders and Projects
In each folder or project where you want to access cloud data, configure the filesystem to use the appropriate cloud storage config(s) you defined. Cloud storage at the project level can be inherited by folders within that project, or folders can override a project setting as needed.
Note that if a cloud storage config is disabled at the site-level it will not be possible to enable it within a folder or project.
- Navigate to the folder where you want to enable cloud storage and open (Admin) > Folder > Management.
- If you want to enable cloud storage at the project level, open (Admin) > Folder > Project Settings instead.
- Select the Files tab.
- Under Cloud Stores, enable the desired cloud stores using the checkboxes.
- Note that it's possible to disable a cloud storage config at the site level. If a config is not enabled at the site level, enabling it in a folder or project will have no effect.
- Click Save.
- After saving the selection of cloud stores for this container, you will be able to select one in the file root section higher on this page.
- Under File Root select Use cloud-based file storage and use the dropdown to select the desired cloud store.
- If you select this option before enabling the cloud store, you will see an empty dropdown.
- Existing Files: When you select a new file root for a folder, you will see the option Proposed File Root change from '<prior option>'. Select what you want to happen to any existing files in the root. Note that if you are not using directory containers, you will not be able to move files as they will not be deleted from the shared root. See Migrate Existing Files for details about file migration options.
- Click the Save button a second time.
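The rules for which cloud stores are usable in a given folder can be summarized in a small model. This is a simplified sketch of the behavior described above, not LabKey's implementation. The function name and the convention that `folder_stores=None` means "inherit from the project" are assumptions made for illustration.

```python
def effective_cloud_stores(site_enabled, project_stores, folder_stores=None):
    """Return the cloud stores usable in a folder.

    site_enabled:   names of storage configs enabled at the site level
    project_stores: names checked in the project's settings
    folder_stores:  names checked in the folder, or None to inherit
    """
    chosen = project_stores if folder_stores is None else folder_stores
    # A config disabled at the site level has no effect, even if checked.
    return sorted(name for name in set(chosen) if name in site_enabled)


site = {"archive", "raw-data"}  # hypothetical config names
print(effective_cloud_stores(site, ["archive"]))                # folder inherits
print(effective_cloud_stores(site, ["archive"], ["raw-data"]))  # folder overrides
print(effective_cloud_stores(site, ["archive", "old-config"]))  # disabled config dropped
```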
Use Files from the Cloud
- Go to the Files web part in your folder.
- If you don't have one, select (Admin) > Page Admin Mode.
- From the dropdown in the lower left, click <Select Web Part> and choose Files.
- Click Add.
- Select Customize from the (triangle) menu in the Files web part header.
- Checking the box to make the Folder Tree visible is helpful.
- Select the @cloud directory, or any of its sub-directories, in the File Root pane.
- You can also give the webpart a descriptive title while you are in the customize interface.
- Click Submit.
The Files web part will now display the cloud storage files as if they are part of your local filesystem. For example, a .fcs file stored in the bucket is listed in the web part like any local file, even though the file itself is located in cloud storage.
When a download request for a cloud storage file comes through LabKey Server, the handle is passed to the client so the client can download the file directly.
Files uploaded to the S3 bucket independently of LabKey will appear in the LabKey Files web part.
Deleting Files from Cloud Storage
If you have configured cloud storage in LabKey to create a new subdirectory (using Directory Prefix) for each new folder created on LabKey, the files placed within it will be associated with the LabKey folder. If you delete the LabKey folder, the associated subfolder of your S3 bucket (and all files within it) will also be deleted.
If you instead configured cloud storage to use an existing S3 folder, any files placed there will be visible from within the LabKey folder, but will NOT be deleted if the LabKey folder is deleted.
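The two deletion behaviors can be previewed with a sketch like the one below, which reports which object keys would go away when a LabKey folder is deleted. The function name and sample keys are hypothetical, and the model simply follows the rules stated above. Note the trailing slash on the prefix, which keeps "container16" from also matching "container160".

```python
def keys_removed_on_folder_delete(keys, container_id, config_root="", directory_prefix="container"):
    """Return the keys that would be deleted along with a LabKey folder.

    keys:             object keys in the bucket
    config_root:      the S3 Path of the storage config ("" for the bucket itself)
    directory_prefix: None models a config without the Directory Prefix option,
                      where nothing in the shared root is deleted
    """
    if directory_prefix is None:
        return []
    folder_dir = f"{directory_prefix}{container_id}/"
    root = config_root.strip("/")
    prefix = f"{root}/{folder_dir}" if root else folder_dir
    return [key for key in keys if key.startswith(prefix)]


keys = ["container16/a.fcs", "container160/b.fcs", "shared/c.csv"]
print(keys_removed_on_folder_delete(keys, 16))
print(keys_removed_on_folder_delete(keys, 16, directory_prefix=None))
```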
S3 Permissions Required
The detailed permissions required for S3 access are listed below. Substitute your bucket name where you see BUCKET_NAME.
Additionally, if ACLs are defined on individual objects within a bucket, the user will need READ and READ_ACP permission to each object for read-only usage, and WRITE and WRITE_ACP for write usage.
See more information about S3 permissions in the AWS documentation.
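The BUCKET_NAME substitution can be scripted so that the same policy text is reused across buckets. The sketch below is illustrative only: the template is an assumption covering just the "list" and "upload/delete" operations mentioned earlier in this topic, it is not LabKey's official permission list, and LabKey may require additional actions. The helper name `render_policy` is hypothetical.

```python
import json

# Illustrative template only; NOT the official list of permissions LabKey requires.
POLICY_TEMPLATE = """
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:ListBucket"],
      "Resource": "arn:aws:s3:::BUCKET_NAME"
    },
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
      "Resource": "arn:aws:s3:::BUCKET_NAME/*"
    }
  ]
}
"""


def render_policy(template, bucket):
    """Substitute the bucket name and parse the result to catch JSON errors early."""
    return json.loads(template.replace("BUCKET_NAME", bucket))


policy = render_policy(POLICY_TEMPLATE, "my-lab-data")
print(json.dumps(policy, indent=2))
```

Parsing the substituted text with `json.loads` is a cheap check that the policy is still valid JSON before it is pasted into the AWS console.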
If you are interested in learning more about the future directions for this functionality, please contact LabKey.