Table of Contents

2022-05-17
     Files
       Tutorial: File Repository
         Step 1: Set Up a File Repository
         Step 2: File Repository Administration
         Step 3: Search the Repository
         Step 4: Import Data from the Repository
       Using the Files Repository
       View and Share Files
       Controlling File Display via the URL
       Import Data from Files
       Linking Assays with Images and Other Files
       Linking Data Records to Image Files
       File Metadata
       File Administrator Guide
         Files Web Part Administration
         File Root Options
           Troubleshoot Pipeline and Files
         File Terminology
         Transfer Files with WebDAV
       Enterprise Pipeline
         JMS Queue
         RAW to mzXML Converters
         Configure LabKey Server to use the Enterprise Pipeline
           Configure the Conversion Service
           Configure Remote Pipeline Server
         Use the Enterprise Pipeline
         Troubleshoot the Enterprise Pipeline
       File Transfer Module / Globus File Sharing
       S3 Cloud Data Storage
         AWS Identity Credentials
         Configure Cloud Storage
         Use Files from Cloud Storage
         Cloud Storage for File Watchers

Files


LabKey Server provides both a file repository and a database for securely storing, sharing and integrating your information.
  • Browser-based, secure upload of files to LabKey Server, where you can archive, search, store, and share them.
  • Structured import of data into the LabKey database, either from files already uploaded or from files stored outside the server. The imported data can be analyzed and integrated with other data.
  • Browser-based, secure sharing and viewing of files and data on your LabKey Server.

Basic Functions

Once files have been uploaded to the repository, they can be securely searched, shared and viewed, or downloaded. Learn the basics in this tutorial:

More Files Topics

Scientific Functions

Once data has been imported into the database, team members can integrate it across source files, analyze it in grid views (using sort, filter, charting, export, and other features), perform quality control, or use domain-specific tools (e.g., NAb, Flow, etc.). The basic functions described above (search, share, download, and view) remain available for both the data and its source files.

Application Examples

Members of a research team might upload experimental data files into the repository during a sequence of experiments. Other team members could then use these files to identify a suitable assay design and later import the data into the database in a consistent manner. Relevant tutorials:

Alternatively, data can be imported into the database in a single step bypassing individual file upload. Relevant tutorials:

Related Topics




Tutorial: File Repository


LabKey Server makes it easy to centralize your files in a secure and accessible file repository. This tutorial shows you how to set up and use some of the LabKey Server file management tools.

This tutorial can be completed using a free 30-day trial version of LabKey Server.

Background: Problems with Files

Researchers and scientists often have to manage large numbers of files with a wide range of sizes and formats. Some of these files are relatively small, such as spreadsheets containing a few lines of data; others are huge, such as large binary files. Some have a generic format, such as tab-separated data tables; while others have instrument-specific, proprietary formats, such as Luminex assay files -- not to mention image-based data files, PowerPoint presentations, grant proposals, longitudinal study protocols, and so on.

Often these files are scattered across many different computers in a research team, making them difficult to locate, search over, and consolidate for analysis. Worse, researchers often share these files via email, which puts data security at risk and can lead to further duplication and confusion.

Solution: LabKey Server File Repository

LabKey Server addresses these problems with a secure, web-accessible file repository, which serves both as a searchable storage place for files, and as a launching point for importing data into the database (for integration with other data, querying, and analysis).

In particular, the file repository provides:

  • A storage, indexing and sharing location for unstructured data files like Word documents. The search indexer scans and pulls out words and phrases to enable finding files with specific content.
  • A launching point for structured data files (like Excel files or instrument-generated files with a specific format), that can be imported into the LabKey Server database for more advanced analysis. For example, you can select files in the repository, and import them into assay designs or process the files through a script pipeline.
  • A staging point for files that are opaque to the search indexer, such as biopsy image files.
This tutorial shows you how to set up and use a LabKey Server file repository that handles all three of these file types.

Tutorial Steps

First Step




Step 1: Set Up a File Repository


To begin, we'll set up the file repository user interface. The file repository has a web-based interface that users can securely interact with online, through a web browser. Only those users you have explicitly authorized will have access to the file repository. After the user interface is in place, we will upload our data-bearing files to the repository.

Google Chrome is the recommended browser for this step.

Set up a File Repository

First we will set up the workspace for your file repository, which lets you upload, browse, and interact with files.

  • Log in to your server and navigate to your "Tutorials" project. Create it if necessary.
    • If you don't already have a server to work on where you can create projects, start here.
    • If you don't know how to create projects and folders, review this topic.
  • Create a new subfolder named "File Repository". Accept all defaults.

  • Enter > Page Admin Mode.
  • Using the (triangle) menus in the corner of each web part, you can Remove from page the Subfolders, Wiki, and Messages web parts; they are included in a default "Collaboration" folder but not used in this tutorial.
  • Click Exit Admin Mode.

Upload Files to the Repository

With the user interface in place, you can add content to the repository. For the purposes of this tutorial we have supplied a variety of demo files.

  • Download LabKeyDemoFiles.zip.
  • Unzip the folder to the location of your choice.
  • Open an explorer window on the unzipped folder LabKeyDemoFiles and open the subfolders.
  • Notice that the directory structure and file names contain keywords and metadata which will be captured by the search indexer.
LabKeyDemoFiles
  API
    ReagentRequestTutorial
      ReagentRequests.xls
      Reagents.xls
  Assays
    Elispot
      AID_datafile.txt
      CTL_datafile.xls
      Zeiss_datafile.txt
    Generic
      GenericAssayShortcut.xar
      GenericAssay_BadData.xls
      GenericAssay_Run1.xls
  ...
  • Drag and drop the unzipped folder LabKeyDemoFiles onto the target area of the File Repository.
  • Notice the progress bar displays the status of the import.
  • Click the (Toggle Folder Tree) button on the far left to show the folder tree.
  • When uploaded, the LabKeyDemoFiles folder should appear at the root of the file directory, directly under the fileset node.

Securing and Sharing the Repository (Optional)

Now you have a secure, shareable file repository. Setting up security for the repository is beyond the scope of this tutorial. To get a sense of how it works, go to (Admin) > Folder > Permissions. The Permissions page lets you grant different levels of access, such as Reader, Editor, Submitter, etc., to specified users or groups of users. Uncheck the "Inherit permissions from parent" box if you wish to make changes now. For details on configuring security, see Tutorial: Security. Click Cancel to return to the main folder page.

Start Over | Next Step (2 of 4)




Step 2: File Repository Administration


In the previous tutorial step, you created and populated a file repository. Users of the file repository can browse and download files, but administrators of the repository have an expanded role. When you created your own folder for this tutorial, you automatically received the administrator role.

As an administrator, you can:

  • Add and delete files.
  • Customize which actions are exposed in the user interface.
  • Audit user activity, such as when users have logged in and where they have been inside the repository.
We will begin by interacting with the repository as an ordinary user would, then we will approach the repository as an administrator with expanded permissions.

Browse and Download Files (Users)

  • Click the toggle to show the folder tree on the left, if it is not already visible.
  • Note that if there are more buttons visible than fit across the panel, the button bar may overflow to a >> pulldown menu on the right.
  • Click into the subfolders and files to see what sort of files are in the repository.
  • Double-click an item to download it (depending on your browser settings, some types of files may open directly).

Customize the Button Bar (Admins)

You can add, remove, and rearrange the buttons in the Files web part toolbar. Both text and icons are optional for each button shown.

  • Return to the File Repository folder if you navigated away.
  • In the Files web part toolbar, click Admin.
    • If you don't see the admin button, look for it on a >> pulldown menu on the right. This overflow menu is used when there are more buttons shown than can fit across the window.
  • Select the Toolbar and Grid Settings tab.
  • The Configure Toolbar Options panel shows the current toolbar settings; you can select whether and how the available buttons (listed on the right) will be displayed.
  • Uncheck the Shown box for the Rename button. Notice that unsaved changes are marked with red corner indicators.
  • You can also drag and drop to reorder the buttons. In this screenshot, the parent folder button is being moved to the right of the refresh button.
  • Make a few changes and click Submit.
  • You may want to undo your changes before continuing, but it is not required as long as you can still see the necessary buttons. To return to the original file repository button bar:
    • Click Admin in the Files web part toolbar.
    • Click the Toolbar and Grid Settings tab.
    • Click Reset to Default.

Configure Grid Column Settings (Admins)

Grid columns may be hidden and reorganized using the pulldown menu on the right edge of any column header, or you can use the toolbar Admin interface. This interface offers control over whether columns can be sorted as well.

  • In the Files web part toolbar, click Admin.
  • On the Toolbar and Grid Settings tab, scroll down to Configure Grid Column Settings.
  • Using the checkboxes, select whether you want each column to be hidden and/or sortable.
  • Reorder columns by dragging and dropping.
  • Click Submit when finished.
  • If needed, you may also Reset to Default.

Audit History (Admins)

The Audit History report tells an administrator when each file was created or deleted and who executed the action.

  • In the Files web part, click Audit History.

In this case you will see when you uploaded the demo files.

Related Topics

Previous Step | Next Step (3 of 4)




Step 3: Search the Repository


Files in the repository, both structured and unstructured, are indexed using the full-text search scanner. This is different from, and complementary to, the search functionality provided by SQL queries, covered in this topic: Search

In this tutorial step you will search your files using full-text search and you will add tags to files to support more advanced search options.

Add Search User Interface

Search the Data

  • Enter "serum" into the search box.
  • The search results show a variety of documents that contain the search term "serum".
  • Click the links to view the contents of these documents.
  • Try other search terms and explore the results. For example, you would see an empty result for "Darwin" or "Mendel" before you complete the next part of this tutorial.
  • Click Advanced Options for more options including:
    • specifying the desired project and folder scope of your search
    • narrowing your search to selected categories of data

File Tagging

In many cases it is helpful to tag your files with custom properties to aid in searching, particularly when the desired search text is not already part of the file itself. For example, you might want to tag files in your repository under their appropriate project code names, say "Darwin" and "Mendel", and later retrieve files tagged for that project.

To tag files with custom properties, follow these steps:

Define a 'Project' Property

  • Click the File Repository link.
  • In the Files web part header, click Admin.
    • If you don't see it, try the >> pulldown menu or check your permissions.
  • Select the File Properties tab.
  • Select Use Custom File Properties.
  • Click Edit Properties.
  • If you have a prepared JSON file containing field definitions, you can select it here. Otherwise, click Manually Define Fields.
  • Click Add Field to add a new property row.
  • In the Name field, enter "Project". Leave the Text type selected.
  • Click Save.

Learn more about adding and customizing fields in this topic: Field Editor.

Apply the Property to Files

  • Open the folder tree toggle and expand directories.
  • Select any two files using their checkboxes.
  • Click (Edit Properties). It may be shown only as a wrench icon. (Hover over any icon-only button to see a tooltip with its name. You might need to use the Files web part's Admin > Toolbar and Grid Settings interface to make it visible.)
  • In the Project field you just defined, enter "Darwin" for the first file.
  • Click Next.
  • In the Project field, enter "Mendel".
  • Click Save.

Retrieve Tagged Files

  • In the Search web part, enter "Darwin" and click Search. Then try "Mendel".
  • Your tagged files will be retrieved, along with any others on your server that contain these strings.

Turn Search Off (Optional)

The full-text search feature can search content in all folders where the user has read permissions. There may be cases when you want to disable global searching for some content which is otherwise readable. For example, you might disable searching of a folder containing archived versions of documents so that only the more recent versions appear in project-wide search results.

  • To turn the search function off in a given folder, first navigate to it.
  • Select (Admin) > Folder > Management, and click the Search tab.
  • Remove the checkmark and click Save.

Note that you can still search from within the folder itself for content there. This feature turns off global searches from other places.

Related Topics

Previous Step | Next Step (4 of 4)




Step 4: Import Data from the Repository


The file repository serves as a launching point for importing assay data to the database. First, notice the difference between uploading and importing data:
  • You already uploaded some data files earlier in the tutorial, adding them to the file system.
  • When you import a file into LabKey Server, you add its contents to the database. This makes the data available to a wide variety of analysis/integration tools and dashboard environments inside LabKey Server.
In this last step in the tutorial, we will import some data of interest and visualize it using an assay analysis dashboard.

Import Data for Visualization

First, import some file you want to visualize. You can pick a file of interest to you, or follow this walkthrough using the sample file provided.

  • Return to the main page of the File Repository folder.
  • In the Files web part, locate or upload the data file. To use our example, open the folders LabKeyDemoFiles > Assays > Generic.
  • Select the file GenericAssay_Run1.xls.
  • Click Import Data.
  • In the Import Text or Excel Assay, select Create New General Assay Design.
  • Click Import.
  • Enter a Name, for example, "Preliminary Lab Data".
  • Select "Current Folder (File Repository)" as the Location.
  • Notice the import preview area below. Let's assume the inferred columns are correct and accept all defaults.
  • Click Begin Import, then Next to accept the default batch properties, then Save and Finish.
  • When the import completes, you see the "list" of runs, consisting of the one file we just imported.
  • Click View Results to see the detailed view of the assay data.

Finished

Congratulations, you've finished the File Repository tutorial and learned to manage files on your server.

Previous Step




Using the Files Repository


LabKey Server makes it easy to upload and use your files in a secure and accessible file repository. Users work with files in a Files web part, also known as a file browser. This topic covers setting up and populating the file browser. You can also learn to use files in the Tutorial: File Repository.

Create a File Browser (Admin)

An administrator can add a file browser to any folder or tab location.

  • Navigate to the desired location.
  • Enter > Page Admin Mode.
  • In the lower left, select Files from Add Web Part and click Add.
  • Click Exit Admin Mode.

There is also a file browser available in the narrow/righthand column of web parts. It offers the same drag-and-drop panel, but the menu bar is abbreviated, concealing most menu options behind a Manage button.

Drag and Drop Upload

The Files web part provides a built-in drag-and-drop upload interface. Open a browser window and drag the desired file or files onto the drag-and-drop target area.

Folders, along with any sub-folders, can also be uploaded via drag-and-drop. Uploaded folder structure will be reproduced within the LabKey file system. Note that empty folders are ignored and not uploaded or created.

While files are uploading, a countdown of files remaining is displayed in the uploader. This message disappears on completion.

Single File Upload Button

If drag and drop is not suitable, you can also click the Upload Files button which will let you browse to Choose a File. You can also enter a Description via this method. Click Upload to complete.

Create Folder Structure in the File Repository

When you upload files in a folder/subfolder structure, that structure is retained, but you can also create new folders directly in the file browser. In the file browser, select the parent folder, or select no parent to create the new folder at the top level. Click the Create Folder button:

Enter the name of the new folder in the popup dialog, and click Submit:

Note that folder names must follow these naming rules:

  • The name must not start with an 'at' character: @
  • The name must not contain any of the following characters: / \ ; : ? < > * | " ^
If you would like to create multiple distinct folder structures within a single folder, make use of named file sets.
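The naming rules above can be checked client-side before attempting to create a folder. Below is a minimal sketch; the helper name isValidFolderName is our own, not part of the LabKey API:

```javascript
// Sketch: validate a repository folder name against the rules above.
// isValidFolderName is a hypothetical helper, not part of the LabKey API.
function isValidFolderName(name) {
  if (name.startsWith("@")) return false;   // must not start with an '@' character
  return !/[\/\\;:?<>*|"^]/.test(name);     // must not contain / \ ; : ? < > * | " ^
}
```

For example, "@files" and "raw:data" fail the check, while a plain name like "Study Images 2022" passes.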

Show Absolute File Paths

Administrators can grant users permission to see absolute file paths in the file browser on a site wide basis.

To show the column containing the file paths to users granted the necessary permission:
  • In the file browser, click Admin.
  • Click the Toolbar and Grid Settings tab.
  • Scroll down to the Configure Grid Column Settings section.
  • Uncheck the box for Hidden next to Absolute File Path (permission required).

Users who have not been given the See Absolute File Paths role will see only an empty column.

Related Topics




View and Share Files


This topic describes how users can view and share files that have been uploaded to the files repository. Collaborating with a shared file browser makes it easier to ensure the entire team has access to the same information.

Add Files to the Repository

Learn more about creating a file repository and uploading files in this topic: Using the Files Repository.

Learn about using the data processing pipeline to upload on a schedule, in bulk, or with special processing in this topic: Data Processing Pipeline. Pipeline tasks can also be run on files already uploaded to the repository as they are imported into the database, providing further processing and analysis.

File Preview

If you hover over the icon for some types of files in the Files web part, a pop-up showing a preview of the contents will be displayed for a few seconds. This is useful for quickly differentiating between files with similar names, such as in this image showing numbered image files.

Select Files

To select a single file, use the checkbox on the left. You can also click the file name to select that file.

Many actions will operate on multiple files. You can check multiple boxes, or multiselect a group of files by selecting the first one, then shift-clicking the name of the last one in the series you want to select.

File Display in Browser

Double-clicking will open some files (e.g. images and text files, including some scripts) directly, depending on browser settings. Other files may immediately download.

File Download Link

For flexible sharing of any file you have uploaded to LabKey Server, you can make a download link column visible.

  • Navigate to the Files web part showing the file.
  • Make the Download Link column visible using the (triangle) menu for any current column, as shown here:
  • Right click the link for the file of interest and select Copy link address.
  • The URL is now saved to your clipboard and might look something like:
http://localhost:8080/labkey/_webdav/Tutorials/File%20Repository/%40files/LabKeyDemoFiles/Datasets/Demographics.xls?contentDisposition=attachment
  • This URL can be shared to provide your users with a link to download the file of interest.

Users will need to have sufficient permissions to see the file. For information on adjusting permissions, please see Security.

For file types that can be displayed in the browser, you can edit the download to display the content in the browser by removing this portion of the URL:

?contentDisposition=attachment
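This URL edit can also be scripted. The sketch below removes the parameter from a download link; the helper name toInlineUrl is our own, not a LabKey API:

```javascript
// Remove the contentDisposition parameter from a WebDAV download link
// so a browser-displayable file renders inline instead of downloading.
// toInlineUrl is an illustrative helper, not part of the LabKey API.
function toInlineUrl(downloadUrl) {
  const url = new URL(downloadUrl);
  url.searchParams.delete("contentDisposition");
  return url.toString();
}
```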

For other display options that can be controlled via the URL, see Controlling File Display via the URL.

Note that this simple method for showing the Download Link column is not permanent. To configure your files web part to always display this column, use the file browser Admin menu, Toolbar and Grid Settings tab instead.

An alternative way to get the download link for a single file:

  • Click the title of the Files web part to open the manage page.
  • On the Manage Files page, place a checkmark next to the target file.
  • At the bottom of the page, copy the WebDav URL.

Get the Base URL

To get the base URL for the File Repository, you can select the fileset item on the Manage Files page, or from anywhere in the folder, go to (Admin) > Go To Module > FileContent. The base URL is displayed at the bottom of the File Repository window, as shown below:

Link to a Specific Directory in the File Repository

To get the link for a specific directory in the repository, navigate to that directory and copy the value from the WebDAV URL field. You'll notice the subdirectories are appended to the base folder URL. For example, the following URL points to the directory "LabKeyDemoFiles/Assays/Generic" in the repository.

Use 'wget' to Transfer Files from the Command Line

You can use command line methods such as 'wget' to access file system files, given the path location. For example, if there is a file in your File Share browser called "dataset.xls", you (as an authorized user) could download that file via the wget command like this, substituting the path to your @files folder:

wget --user=[LABKEY_USER_ACCOUNT_NAME] --ask-password https://www.labkey.org/_webdav/ClientProject/File%20Share/%40files/dataset.xls

This command prompts you for your labkey.org password, then downloads dataset.xls to whatever directory you were in when you ran the command.

If you have a file to upload to your file browser, you can also use wget to simplify this process. In this example, user "John Smith" wants to deliver a file named "dataset.xls" to his "ClientProject" support portal. He could run the following command, remembering to adjust the path for his own client portal and file share folder names:

wget --method=PUT --body-file=/Users/johnsmith/dataset.xls --user=[johnsmith's email address] --ask-password https://www.labkey.org/_webdav/ClientProject/File%20Share/%40files/dataset.xls -O - -nv

He would be prompted for his password for labkey.org, and then the file would go from the location he designated straight to the file repository on labkey.org. Remember that users must already be authorized to add files to the repository (i.e. added as users of the support portal in this example) for this command to work.

Related Topics




Controlling File Display via the URL


Files in a LabKey file repository can be displayed, or rendered, in different ways by editing the URL directly. This grants you flexibility in how you can use the information you have stored.

This example begins with an HTML file in a file repository. For example, this page is available at the following URL:

Render As

By default, this file will be rendered inside the standard server framing:

To render the content in another way, add the renderAs parameter to the URL. For example, to display the file without any framing, use the following URL:

Possible values for the renderAs parameter are shown below:

Possible values for the renderAs parameter:

  • renderAs=FRAME: Renders the file within an IFRAME. This is useful for returning standard HTML files.
  • renderAs=INLINE: Renders the content of the file directly into a page. This is only useful if you have files containing fragments of HTML, and those files link to other resources on the LabKey Server; links within the HTML will also need renderAs=INLINE to maintain the look.
  • renderAs=TEXT: Renders text into a page, preserving line breaks in text files.
  • renderAs=IMAGE: Renders an image in a page.
  • renderAs=PAGE: Shows the file unframed.

Named File Sets

If the target files are in a named file set, you must add the fileSet parameter to the URL. For example, if you are targeting the file set named "store1", then use a URL like the following:
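Composing a display URL with these parameters can be sketched as follows; the helper name withDisplayParams is our own illustration, not a LabKey API:

```javascript
// Compose a file-display URL with optional renderAs and fileSet
// parameters, as described above.
// withDisplayParams is an illustrative helper, not part of the LabKey API.
function withDisplayParams(baseUrl, { renderAs, fileSet } = {}) {
  const url = new URL(baseUrl);
  if (renderAs) url.searchParams.set("renderAs", renderAs);
  if (fileSet) url.searchParams.set("fileSet", fileSet);
  return url.toString();
}
```

For example, targeting a file in the named file set "store1" and rendering it inline would append both parameters to the base URL.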

Related Topics




Import Data from Files


Once you have uploaded files to LabKey Server's file repository, you can import the data held in these files into LabKey's database via the Files web part. After import, data can be used with a wide variety of analysis and visualization tools.

Import Data from Files

Before you can import data files into LabKey data structures, you must first upload your files to the LabKey file system using the Files web part, pipeline, or other method.

After you have uploaded data files, select a file of interest and click the Import Data button in the Files web part.

In the Import Data pop up dialog, select from available options for the type of data you are importing.

Click Import to confirm the import.

Some data import options will continue with additional pages requesting more input or parameters.

Related Topics




Linking Assays with Images and Other Files


The following topic explains how to automatically link assay data with images and other file types in the file repository. When imported assay data contains file names and/or paths, the system can associate these file names with actual files residing on the server. Files are resolved for both assay results and assay run properties. Assay results files can be in either TSV or Excel format.

For Standard (previously known as "GPAT") assays, files will be automatically associated with assay result runs provided that:

  • The files have been uploaded to a location on the LabKey Server file repository.
  • The assay design is of type Standard.
  • The file names and paths are captured in a field of type File.
  • The file names resolve to a single file.
  • Paths are either full or relative paths to the pipeline or file repository root.
File names will be associated with the actual images in the following scenarios:
  • On import:
    • When assay data is imported from the file browser.
    • When assay data is uploaded via the assay upload wizard (in this case, we use the pipeline root).
    • When an assay run is created via experiment-saveBatch.api.
    • When an assay run is created via assay-importRun.api.
  • On update:
    • When updating assay data via query-updateRows.api.
    • When updating assay data via experiment-saveBatch.api.
    • When updating assay data via the built-in browser-based update form.
Note that image files are exported with assay results as follows:
  • When exported as Excel files, images appear inline.
  • When exported as .tsv files, the file name is shown.

Example

Assume that the following image files exist in the server's file repository at the path scans/patient1/CT/

When the following assay result file is imported...

ParticipantId   Date         ScanFile
100             12/12/2017   scans/patient1/CT/ct_scan1.png
100             12/12/2017   scans/patient1/CT/ct_scan2.png
100             12/12/2017   scans/patient1/CT/ct_scan3.png

...the imported assay results will resolve and link to the files in the repository. Note the automatically generated thumbnail in the assay results:

API Example

File association also works when importing using the assay API, for example:

LABKEY.Experiment.saveBatch({
    assayId: 444,
    batch: {
        runs: [{
            properties: {
                sop: 'filebrowser.png'
            },
            dataRows: [{
                specimenId: '22',
                participantId: '33',
                filePath: 'cat.png'
            }]
        }]
    },
    success: function() { console.log('success'); },
    failure: function() { console.log('failure'); }
});

LABKEY.Assay.importRun({
    assayId: 444,
    name: "new run",
    properties: {
        "sop": "assayData/dog.png"
    },
    batchProperties: {
        "Batch Field": "value"
    },
    dataRows: [{
        specimenId: '22',
        participantId: '33',
        filePath: 'cat.png'
    }, {
        specimenId: '22',
        participantId: '33',
        filePath: 'filebrowser.png'
    }, {
        specimenId: '22',
        participantId: '33',
        filePath: 'assayData/dog.png'
    }],
    success: function() { console.log('success'); },
    failure: function() { console.log('failure'); }
});

Related Topics




Linking Data Records to Image Files


This topic explains how to link data grids to image files which reside either on the same LabKey Server, or somewhere else on the web.

This feature can be used with any LabKey data table as an alternative to using either the File or Attachment field types.

For assay designs, see the following topic: Linking Assays with Images and Other Files.

Scenario

Suppose you have a dataset where each row of data refers to some image or file. For example, you have a dataset called Biopsies, where you want each row of data to link to an image depicting a tissue section. Images will open in a new tab or download, depending on your browser settings.

Below is an example Biopsies table:

Biopsies

ParticipantId   Date         TissueType   TissueSlide
PT-101          10/10/2010   Liver        slide1.jpg
PT-102          10/10/2010   Liver        slide2.jpg
PT-103          10/10/2010   Liver        slide3.jpg

How do you make this dataset link to the slide images, such that clicking on slide1.jpg shows the actual image file?

Solution

To achieve this linking behavior, follow these steps:

  • Upload the target images to the File Repository.
  • Create a target dataset where one column contains the image names.
  • Build a URL that links from the image names to the image files.
  • Use this URL in the dataset.
Detailed explanations are provided below:

Upload Images to the File Repository

  • Navigate to your study folder.
  • Go to (Admin) > Go To Module > File Content.
  • Drag-and-drop your files into the File Repository. You can upload the images directly into the root directory, or you can upload the images inside a subfolder. For example, the screenshot below shows a folder called images, which contains all of the slide JPEGs.
  • Acquire the URL to your image folder: In the File Repository, open the folder where your images reside, and scroll down to the WebDav URL.
  • Open a text editor, and paste in the URL, for example:
https://myserver.labkey.com/_webdav/myproject/%40files/images/
(The %40 in the path is simply the URL-encoded form of the @ in the @files directory name.)

Create a Target Dataset

Your dataset should include a column which holds the file names of your target images. See "Biopsies" above for an example.

For details on importing a study dataset, see Import Datasets.

To edit an existing dataset to add a new column, follow the steps in this topic: Dataset Properties.

Build the URL

To build the URL to the images, do the following:

  • In your dataset, determine which column holds the image names. In our example the column is "TissueSlide".
  • In a text editor, type out this column name as a "substitution token", by placing it in curly brackets preceded by a dollar sign, as follows:
${TissueSlide}
  • Append this substitution token to the end of the WebDav URL in your text editor, for example:
https://myserver.labkey.com/_webdav/myproject/%40files/images/${TissueSlide}

You now have a URL that can link to any of the images in the File Repository.
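Conceptually, the server expands the substitution token per row when rendering each link. As a rough illustration (not a LabKey API; the `expandUrl` helper and sample row are hypothetical), the substitution works like this:

```javascript
// Expand ${ColumnName} tokens in a URL template using values from a row.
// Illustrative sketch only; LabKey performs this substitution server-side.
function expandUrl(template, row) {
  return template.replace(/\$\{(\w+)\}/g, (match, col) =>
    col in row ? encodeURIComponent(row[col]) : match
  );
}

const template =
  'https://myserver.labkey.com/_webdav/myproject/%40files/images/${TissueSlide}';
const row = { ParticipantId: 'PT-101', TissueSlide: 'slide1.jpg' };

console.log(expandUrl(template, row));
// → https://myserver.labkey.com/_webdav/myproject/%40files/images/slide1.jpg
```

Tokens that don't match a column in the row are left unexpanded.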

Use the URL

  • Go to the Dataset grid view and click Manage.
  • Click Edit Definition.
  • Click the Fields section.
  • Expand the field which holds the image names, in this example, "TissueSlide".
  • Into the URL field, paste the full URL you just created, including the substitution token.
  • Click Save.
  • Click View Data.
  • Notice that the filenames in the TissueSlide field are now links. Click a link to see the corresponding image file. It will open in a new tab or download, depending on your browser.

If you prefer that the link results in a file download, add the following to the end of the URL in the dataset definition.

?contentDisposition=attachment

This results in the following complete URL on the Display tab for the TissueSlide field:

https://myserver.labkey.com/_webdav/myproject/%40files/images/${TissueSlide}?contentDisposition=attachment

If you prefer that the link results in viewing the file in the browser, add the following:

?contentDisposition=inline

Related Topics




File Metadata


When files are added to the File Repository, metadata about each file is recorded in the table exp.Data. The information about each file includes:
  • File Name
  • URL
  • Download Link
  • Thumbnail Link
  • Inline Thumbnail image
  • etc...

View the Query Browser

To access the table, developers and admins can:

  • Go to (Admin) > Go To Module > Query.
  • Click to open the schema exp and the query Data.
  • Click View Data.
By default, the name, runID (if applicable) and URL are included. You can use the (Grid views) > Customize Grid option to show additional columns or create a custom grid view.

Create a Query Web Part

To create a grid that shows the metadata columns, do one of the following:

  • Create a custom view of the grid exp.Data, and expose the custom view by adding a Query web part.
    • Or
  • Create a custom query using exp.Data as the base query, and add the columns using SQL. For example, the following SQL query adds the columns InlineThumbnail and DownloadLink to the base query:
SELECT Data.Name,
Data.Run,
Data.DataFileUrl,
Data.InlineThumbnail,
Data.DownloadLink
FROM Data

Pulling Metadata from File Paths

You can pull out more metadata, if you conform to a path naming pattern. In the following example, CT and PET scans have been arranged in the following folder pattern:

scans
├───patient1
│   ├───CT
│   └───PET
└───patient2
    ├───CT
    └───PET

The following SQL query extracts the patient id and the scan type from the directory structure above.

SELECT Data.Name as ScanName,
Data.DataFileUrl,
split_part(DataFileUrl, '/', 8) as Ptid,
split_part(DataFileUrl, '/', 9) as ScanType
FROM Data
WHERE locate('/@files/scans', DataFileUrl) > 0

The result is a table of enhanced metadata about the files, providing new columns you can filter on to narrow down to the files of interest.
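As a sanity check on the index arithmetic, here is a rough JavaScript equivalent of split_part. The sample DataFileUrl below is illustrative; the segment positions depend on the depth of your file root path, so verify the indices against your own exp.Data rows:

```javascript
// split_part(s, delim, n) in LabKey SQL is 1-based, while
// Array.prototype.split is 0-based, so split_part(s, '/', 8)
// corresponds to s.split('/')[7].
function splitPart(s, delimiter, n) {
  return s.split(delimiter)[n - 1] ?? '';
}

// Illustrative DataFileUrl for a scan stored under the folder pattern above.
const url = 'file:/data/labkey/files/myproject/@files/scans/patient1/CT/scan001.dcm';

console.log(splitPart(url, '/', 8)); // → patient1  (the Ptid column)
console.log(splitPart(url, '/', 9)); // → CT        (the ScanType column)
```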

Related Topics




File Administrator Guide


This section includes guides and background information useful to administrators working with files in LabKey Server.

File Administration Topics

Related Topics




Files Web Part Administration


Administrators can customize the Files web part in the following ways, allowing them to present the most useful set of features to their file browser users. To customize the Files web part, click the Admin button on the toolbar.

There are four tabs for the various customizations possible here:

Customize Actions Available

The Actions tab lets you control the availability of common file action and data import options such as importing datasets, creating assay designs, etc.

Note that file actions are only available for files in the pipeline directory. If a pipeline override has been set, or the file browser is showing non-file content (such as in a "@fileset" directory), these actions will not be available in the default file location.

You can change what a Files web part displays using the (triangle) menu; choose Customize. Select a file content directory to enable actions on this tab.

The Import Data button is always visible to admins, but you can choose whether to show it to non-admin users. For each of the other actions listed, checking the Enabled box makes it available as a pipeline job. Checking Show in Toolbar adds a button to the Files web part toolbar.

Define File Tagging Properties

The File Properties tab lets you define properties that can be used to tag files. Each file browser can use the default (none), inherit properties from the parent container, or use custom file properties.

Once a custom property is defined, users are asked to provide property values when files are uploaded. There is a built-in 'Comment' file property that is included when other custom properties are in use.

  • To define a property, select Use Custom File Properties and then click the Edit Properties button.
  • If no properties have been defined, you have the option to import field definitions from a prepared JSON file. Otherwise, click Manually Define Fields.
  • Click Add Field to add new properties. Details about customizing fields are available in this topic: Field Editor.
  • To reuse properties defined in the parent folder, select Use Same Settings as Parent.

Tagged files can be retrieved by searching on their property value. For more detail, see Step 3: Search the Repository.

Control Toolbar Buttons and Grid Settings

The Toolbar and Grid Settings tab controls the appearance of the file management browser.

Configure Toolbar Options: Toolbar buttons are in display order, from top to bottom; drag and drop to rearrange. Available buttons which are not currently displayed are listed at the end. Check boxes to show and hide text and icons independently.

Configure Grid Column Settings (scroll down): Grid columns are listed in display order from top to bottom; drag and drop rows to rearrange them, and use the checkboxes to make columns Hidden or Sortable.

You can also change which columns are displayed directly in the Files web part. Pull down the arrow in any column label, select Columns, and use the checkboxes to show and hide columns. For example, this screenshot shows adding the Download Link column:

Local Path to Files

Use the following procedure to find the local path to files that have been uploaded via the Files web part.

  • Go to (Admin) > Site > Admin Console.
  • Under Configuration, click Files.
  • Under Summary View for File Directories, locate and open the folder where your Files web part is located.
  • The local path to your files appears in the Directory column. It should end in @files.

General Settings Tab

Use the checkbox to control whether to show the file upload panel by default. The file upload panel offers a single file browse and upload interface and can be opened with the Upload Files button regardless of whether it is shown by default.

You may drag and drop files into the file browser region to upload them whether or not the panel is shown.

Note that in Premium Editions of LabKey Server, file upload can be completely disabled if desired. See below for details.

Configure Default Email Alerts

To control the default email notification for the folder where the Files web part resides, see Manage Email Notifications. Email notifications can be sent for file uploads, deletions, changes to metadata properties, etc. Admins can configure, at the folder level, whether each project user overrides or accepts the folder's default notification setting.

Project users can also override default notification settings themselves by clicking the (Email Preferences) button in the Files web part toolbar. Select your own preference and click Submit.

Disable File Upload

Premium Feature — This feature is available in Premium Editions of LabKey Server. Learn more or contact LabKey.

Site Administrators can disable file upload across the entire site. When activated, file upload to both the file and pipeline roots will be disabled.

When file upload is disabled, the Upload button in the File Repository is hidden, and dragging and dropping files does not trigger an upload. Users (with sufficient permissions) can still perform other actions in the File Repository: download, rename, delete, move, edit properties, and create folders.

Note that attaching files to issues, wikis, or messages is not disabled by this setting.

Changes to this setting are logged under the Site Settings Events log.

To disable file upload site-wide:

  • Go to (Admin) > Site > Admin Console.
  • Under Configuration, click Files.
  • Place a checkmark next to Disable file upload.
  • Click Save.

Related Topics


Premium Resource Available

Subscribers to premium editions of LabKey Server can configure scanning of uploaded files for viruses. Learn more in this topic:


Learn more about premium editions




File Root Options


LabKey Server provides tools for securely uploading, processing and sharing your files. If you wish, you can override default storage locations for each project and associated subfolders by setting site-level or project-level file roots.

Topics

Summary View of File Roots and Overrides

You can view an overview of settings and full paths from the "Summary View for File Directories" section of the "Configure File System Access" page that is available through (Admin) > Site > Admin Console > Configuration > Files.

File directories, named file sets and pipeline directories can be viewed on a project/folder basis through the "Summary View." The 'Default' column indicates whether the directory is derived from the site-level file root or has been overridden. To view or manage files in a directory, double click on a row or click on the 'Browse Selected' button. To configure an @file or an @pipeline directory, select the directory and click on the 'Configure Selected' button in the toolbar.

If you add a pipeline override for any folder, the server does not create an @files subdirectory for it in the file system. The server treats an override as a user-managed location and uses the specified path instead.

Note that a @pipeline marker is used in the "Summary View for File Directories", available through (Admin) > Site > Admin Console > Configuration > Files. However, there is no corresponding @pipeline directory on the file system. The summary view uses the @pipeline marker simply to show the path for the associated pipeline.

Site-Level File Root

The site-level file root is the top of the directory structure for files you upload. By default it is under the LabKey Server installation directory, but you may choose to place it elsewhere if required for backup, permissions, or disk space reasons.

During server setup, a directory structure is created mirroring the structure of your LabKey Server projects and folders. Each project or folder is a directory containing a "@files" subdirectory. Unless the site-level root has been overridden at the project or folder level, files will be stored under the site-level root.

You can specify a site-level file root at installation or access the "Configure File System Access" page on an existing installation.

  • Select (Admin) > Site > Admin Console.
  • Under Configuration, click Files.

Change the Site-Level File Root

When you change the site-level file root for an existing installation, files in projects whose file roots are based on that site-level file root will be automatically moved to the new location. The server will also update paths in the database for all of the core tables. If you are storing file paths in tables managed by custom modules, the custom module will need to register an instance of org.labkey.api.files.FileListener with org.labkey.api.files.FileContentService.addFileListener(), and fix up the paths stored in the database within its fileMoved() method.

Files located in projects that use pipeline overrides or in folders with their own project- or folder-level file roots will not be moved by changing the site-level file root. If you have set project-level roots or pipeline overrides, files in these projects and their subfolders must be moved separately. Please see Troubleshoot Pipeline and Files for more information.

Changes to file roots are audited under Project and Folder events.

Project-level File Roots

You can override the site-level root on a project-by-project basis. A few reasons you might wish to do so:

  • Separate file storage for a project from your LabKey Server. You might wish to enable more frequent backup of files for a particular project.
  • Hide files. You can hide files previously uploaded to a project or its subfolders by selecting the "Disable File Sharing" option for the project.
  • Provide a window into an external drive. You can set a project-level root to a location on a drive external to your LabKey Server.
From your project:
  • Select (Admin) > Folder > Project Settings.
  • Click the Files tab.

Changes to file roots are audited under Project and Folder events.

Folder-level File Roots

The default file root for a folder is a subfolder of the project file root plus the folder name. If the project-level root changes, this folder-level default will also change automatically to be under the new project-level root.

To set a custom file root for a single folder, follow these steps:

From your folder:
  • Select (Admin) > Folder > Management.
  • Click the Files tab.

Changes to file roots are audited under Project and Folder events.

Migrate Existing Files

When you change the site-level file root for an existing installation, the entire directory tree of files located under that site-level file root is automatically moved to the new location (and deleted in the previous location).

When you select a new project-level (or folder-level) file root, you will see the option "Proposed File Root change from '<prior option>'." Select what you want to happen to any existing files in the previous root location. The entire directory tree, including any subfolders within the file root, is included in any copy or move.

Options are:

  • Not copied or moved: Default
  • Copied to the new location: The existing files stay where they are. A copy of the files is placed in the new file root and database paths are updated to point to the new location.
  • Moved to the new location: Files are copied to the new location, database paths are updated to point to this location, then the original files are deleted.
Note: All work to copy or move the entire directory tree of files is performed in a single pipeline job. The job will continue even if some files fail to copy, but if any file copy fails, no deletion will occur from the original location (i.e. a move will not be completed).

File Root Options

The directory exposed by the Files web part can be set to any of the following directories:

  • @files (the default)
  • @pipeline
  • @filesets
  • @cloud
  • any children of the above
Administrators can select which directory is exposed by clicking the (triangle) on the Files web part and selecting Customize. In the File Root pane, select the directory to be exposed and click Submit. The image below shows how to expose the sub-directory Folder A.

Alternative _webfiles Root

Administrators can enable an alternative WebDAV root for the whole server. This alternative webdav root, named "_webfiles", displays a simplified, file-sharing oriented tree that omits non-file content (like wikis), and collapses @files nodes into the root of the container’s node.

To access or mount this root, go to a URL like the following (replacing my.labkeyserver.com with your real server domain):

https://my.labkeyserver.com/_webfiles

This URL will expose the server's built-in WebDAV UI. 3rd party WebDAV clients can mount the above URL just like they can mount the default _webdav root.

To enable this alternative webdav root:

  • Select (Admin) > Site > Admin Console.
  • Under Configuration, click Files.
  • Under Alternative Webfiles Root, place a checkmark next to Enable _webfiles.
  • Click Save.

The _webfiles directory is parallel to the default _webdav directory, but only lists the contents under @files and its child containers. @pipeline, @filesets, and @cloud contents are not accessible from _webfiles.

Any name collisions between containers and file system directories will be handled as follows:

Child containers that share names (case-insensitive) with file system directories take precedence and are displayed with their names unchanged in the WebDAV tree. File system directories are exposed with a " (files)" suffix. If there are further conflicts, the server appends "2", "3", etc., until it finds an unused name. Creating a subdirectory via WebDAV always creates a child file directory, never a child container.
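The collision-handling rule described above can be sketched as a small JavaScript function. This is illustrative only; the exact placement of the numeric suffix after " (files)" is an assumption, not confirmed LabKey behavior:

```javascript
// Pick a WebDAV display name for a file system directory that may collide
// (case-insensitively) with child container names, per the rules above.
function resolveDisplayName(dirName, containerNames) {
  const taken = new Set(containerNames.map(n => n.toLowerCase()));
  if (!taken.has(dirName.toLowerCase())) return dirName;   // no collision
  let candidate = `${dirName} (files)`;                    // first fallback
  let suffix = 2;
  while (taken.has(candidate.toLowerCase())) {             // keep numbering
    candidate = `${dirName} (files) ${suffix}`;            // until unused
    suffix += 1;
  }
  return candidate;
}

console.log(resolveDisplayName('assays', ['Assays', 'Reports'])); // → assays (files)
console.log(resolveDisplayName('docs', ['Reports']));             // → docs
```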

Map Network Drive (Windows Only)

On Windows, LabKey Server runs as an operating system service, which Windows treats as a separate user account. The user account that represents the service may not automatically have permission to access a network share that the logged-in user can access. If you are running on Windows and using LabKey Server to access files on a remote server, for example via the LabKey Server pipeline, you'll need to configure the server to map the network drive for the service's user account.

Configuring the network drive settings is optional; you only need to do it if you are running Windows and using a shared network drive to store files that LabKey Server will access.

  • Select (Admin) > Site > Admin Console.
  • Under Configuration, click Files.
  • Under Map Network Drive, click Configure.

Drive letter: The drive letter to which you want to assign the network drive.

Path: The path to the remote server to be mapped using a UNC path -- for example, a value like "\\remoteserver\labkeyshare".

User: Provide a valid user name for logging onto the share; you can specify the value "none" if no user name or password is required.

Password: Provide the password for the user name; you can specify the value "none" if no user name or password is required.

Named File Sets

Named file sets are additional file stores for a LabKey web folder. They exist alongside the default file root for a web folder, enabling web sharing of files in directories that do not correspond exactly to LabKey containers. You can add multiple named file sets for a given LabKey web folder, displaying each in its own web part. The server considers named file sets as "non-managed" file systems, so moving either the site or the folder file root does not have any effect on named file sets. File sets are a single directory and do not include any subdirectories.

To add a named file root:

  • On the Files web part, click the (triangle) and select Customize.
  • On the Customize Files page, click Configure File Roots.
  • Under File Sets, enter a Name and a Path to the file directory on your local machine.
  • Click Add File Set.
  • Add additional file sets as required.
  • To display a named file set in the Files web part, click the (triangle) on the Files web part, and select Customize.
  • Open the "@filesets" node, select your named file set, and click Submit.
  • The Files web part will now display the files in your named file set.

For details on URL parameters used with named file sets, see Controlling File Display via the URL.

For an example showing how to display a named file set using the JavaScript API, see JavaScript API - Examples.

Related Topics




Troubleshoot Pipeline and Files


This topic includes some common troubleshooting steps to take when you encounter issues with uploading and importing data via the pipeline or file browser.

Troubleshooting Basics

When trying to identify what went wrong on your LabKey Server, a few general resources can help you:

  1. Check the error log.
  2. Review troubleshooting documentation.
  3. Check the labkey log.
  4. Review the audit log for actions taking place when you encountered the problem.
For troubleshooting issues with uploading files in particular, a good first step is to check the permissions of the target directory in the file system directory itself. Check both the files directory where your site-level file root points (LABKEY_HOME/files by default) and the location where logs will be written (generally CATALINA_HOME/logs). LabKey Server must have the ability to write files to these locations.

Pipeline Import Error Log

When you import files or archives using drag and drop or other pipeline import methods, you will see the Data Pipeline import page. Status messages will show progress. Upon completion you may either encounter an error or notice expected information is missing.

If there is an Error during import, the Status column will generally read "ERROR", though even if it does not, you may want to check these logs. There may be helpful details in the Info column. Click the value in the Status column for full details about the pipeline job.

The Folder import page provides a Job Status panel, followed by a log file panel giving a summary of the full log. Scan the panel for details about what went wrong, typically marked by the word Error or Warning and possibly highlighted in red. If this information does not help you identify the problem, click Show Full Log File for the full file. You can also click links to View or Download the log file in the Job Status web part.

XAR Import Errors

The log file is the first place to look if import of a XAR (xar.xml) file fails.

Notes specific to XAR files:

  • The most common problem is a duplicate LSID. In example 1 of the XAR Tutorial, the LSIDs have fixed values, which means that xar.xml can only be imported in one folder on the whole site. If you are sharing access to a LabKey Server system with other users of that tutorial, you will encounter this problem. Subsequent examples in the tutorial show how to address this conflict.
  • A second common problem is clashing LSID objects at the run level. If an object is created by a particular ProtocolApplication and then a second ProtocolApplication tries to output an object with the same LSID, an error will result.
  • LabKey Server does not offer the ability to delete protocols or starting inputs in a folder, except for deleting the entire folder. This means that if you import a xar.xml in a folder and then change a protocol or starting input without changing its LSID, you won't see your changes. The XarReader checks first to see if the protocols in a xar.xml have already been defined, and if so will silently use the existing protocols rather than the (possibly changed) protocol descriptions in the xar.xml. See example 3 in the XAR Tutorial for a suggestion of how to avoid problems with this.
  • Sometimes a xar.xml will appear to import correctly but report an error when you try to view the summary graph. This is likely related to problems in referencing the Starting Inputs.

Files Not Visible

If you do not see the expected set of files in a Files web part, check the following. Some of these options are inherited by subfolders, so it may not be immediately clear that they apply to your context.

You can check for unexpected settings at each level as follows:
  • Files Web Part: Click the (triangle) menu in the web part corner and select Customize. See if the expected file root is being displayed.
  • Folder: (Admin) > Folder > Management > Files tab
  • Project: (Admin) > Folder > Project Settings > Files
  • Site: (Admin) > Site > Admin Console > Configuration > Files.

Import Data Button Not Available

If you are using a pipeline override for a folder that differs from the file directory, you will not see an Import Data button in the Files web part. You may either change project settings to use the default site-level file root, or import files via the Data Pipeline instead of the Files web part. To access the pipeline UI, go to: (Admin) > Go to Module > Pipeline.

Project-level Files Not Moved When Site-wide Root is Changed

When the site-wide root changes for an existing installation, files in projects whose file roots are based on that site-level file root will be automatically moved to the new location. The server will also update paths in the database for all of the core tables. If you are storing file paths in tables managed by custom modules, the custom module will need to register an instance of org.labkey.api.files.FileListener with org.labkey.api.files.FileContentService.addFileListener(), and fix up the paths stored in the database within its fileMoved() method.

Files located in projects that use pipeline overrides or in folders with their own project- or folder-level file roots will not be moved by changing the site-level file root. If you have set project-level roots or pipeline overrides, files in these projects and their subfolders must be moved separately.

User Cannot Upload Files When Pipeline Override Set

In general, a user who has the Editor or Author role for a folder should be able to upload files. This is true for the default file management tool location, or file attachments for issues, wikis, messages, etc.

The exception is when you have configured a folder (or its parent folders) to use a pipeline override. In that case, you will need to explicitly assign permissions for the pipeline override directory.

To determine whether a pipeline override is set up, and to configure permissions if so, follow these steps:

  • Navigate to the folder in question.
  • Select (Admin) > Go to Module > Pipeline.
  • Click Setup.
  • If the "Set a pipeline override" option is selected, you have two choices:
    • Keep the override and use the choices under the Pipeline Files Permissions heading to set permissions for the appropriate users.
    • Remove the override and use normal permissions for the folder. Select "Use a default based on the site-level root" instead of a pipeline override.
  • Adjust folder permissions if needed using (Admin) > Folder > Permissions.

For further information, see Set a Pipeline Override.

Related Topics




File Terminology


This topic defines commonly used terminology for files in LabKey Server.

Root vs. directory. A root is generally the top level of inheritance throughout a tree of directories. A directory identifies just one spot in the file system.

LabKey installation directory. The default directory in the file system for LabKey Server that contains folders for files, modules, the webapp, etc. Sometimes referred to as [LABKEY_HOME]. (Example: /dev/labkey/labkeyHome or C:\dev\labkey\labkeyHome\)

Site-level file root. The directory in the LabKey Server's file system that contains your server's file directory structure (Example: /data/labkey/files). It can be set, typically at install time. This location is called a root, not just a directory, because it determines where the tree of file directories lives, not just the location of a single directory. The structure reflects your server's tree of projects and folders. See: File Root Options.

File directory. The specific location on the file system where files associated with a particular project or folder are placed. (Example: /data/labkey/files/project1/folder1/@files, where the folder of interest is folder1, a child of project1.) See: File Root Options.

File root. The directory in LabKey Server's file system that contains your project's file directory structure. (Example: /data/labkey/otherdata/project1/) This structure contains file directories in subfolders that match the structure of your project. See: File Root Options.

File root override. A custom destination for files for a given project. Can be set at the project level only. (Example: /data/labkey/otherdata/project1/myfiles). If a file root override is set, this root determines the location of the tree of file directories for that project's subfolders. See: File Root Options.

Data processing pipeline. Provides a set of actions that can be performed against a given file. See: Data Processing Pipeline.

Pipeline override. An explicitly set location for files that are subject to actions. Also determines the location where files are uploaded when uploaded via the pipeline UI. Allows security to be set separately vs. the default, folder-specific permissions. See Set a Pipeline Override.




Transfer Files with WebDAV


Use a WebDAV client as an alternative to the native LabKey Server interfaces for accessing files on LabKey Server. WebDAV allows you to read, modify, and delete files on the server. You can use either a 3rd party WebDAV client, such as Cyberduck, or, once properly configured, the built-in clients in Windows Explorer or OSX Finder without installing any new software. You can also enable a site-wide WebDAV root that provides a more user-friendly interface: see File Root Options.

Example Setup for Cyberduck WebDAV Client

To set up Cyberduck to access a file repository on LabKey Server, follow these instructions:

  • First, get the WebDAV URL for the target repository:
    • On LabKey Server, go to the target file repository.
    • Click the title of the Files web part.
    • The URL used by WebDAV appears at the bottom of the screen.
  • Alternatively, administrators can get the WebDAV URL directly from the Files web part as follows:
    • Open the Upload Files panel, then click the (file upload help) icon.
    • The File Upload Help dialog appears.
    • The URL used by WebDAV appears in this dialog. Copy the URL to the clipboard.
  • Set up Cyberduck (or another 3rd party WebDAV client).
    • Click Open Connection (or equivalent in another client).
    • Enter the URL and your username/password.
    • Click Connect.
    • You can now drag and drop files into the file repository using the 3rd party WebDAV client.

Tested 3rd Party clients

  • Cyberduck: GUI WebDAV client.
  • WebDrive: Integrates with Explorer and allows you to mount the LabKey Server to a drive letter.
  • NetDrive: Integrates with Explorer and allows you to mount the LabKey Server to a drive letter.
  • cadaver: Command line tool, similar to FTP.

Native Windows WebDAV Client (WebDAV Redirector)

A WebDAV client called "WebDAV Redirector" is built into Windows 8 and Windows 10. Assuming your server is configured to use SSL, you can connect from Windows directly to a LabKey Server file repository. Configuring the WebDAV Redirector to work over non-SSL connections is not recommended.

Note that the WebDAV Redirector is limited to 50MB by default.

To connect, you can use Windows Explorer to map a network drive to the file repository URL, using the URL shown below.

To connect using a Windows Command prompt, use "net use". For example:

net use Y: https://hosted.labkey.com/labkey/_webdav/myProject/myFolder/@files/ /USER:johndoe@labkey.com * /PERSISTENT:YES

Explanation of the command above:

Command line item | Description
Y: | The drive letter that will allow the client to copy multiple files to the LabKey Server using familiar Windows commands. It can’t be in use at the time; if it is, either choose a different drive letter or issue a net use Y: /D command first to disconnect the Y: drive.
https://hosted.labkey.com/labkey/_webdav/myProject/myFolder/@files/ | The URL to the WebDAV root. Use double quotes if there are spaces in the URL. (To get this URL, see the screen shot above.)
_webdav | This component of the URL applies to all WebDAV connections into LabKey Server.
myProject | The LabKey Server project name.
myFolder | The folder name within the project - the location of the Files web part.
@files | The directory root for the file content. This folder is viewed by the Files web part in a LabKey Server folder. Files managed by the pipeline component appear under a root directory called @pipeline.
johndoe@labkey.com | The same user email you would use to sign into LabKey Server from a browser.
* | Causes Windows to prompt for your LabKey password.
/PERSISTENT:YES | Causes Windows to remember the drive letter mapping between restarts of the system.

Once you've mapped a drive letter to LabKey Server, you can use COPY, REN, XCOPY and other standard Windows commands to move data files between the client and LabKey Server.

The mapped network drive feature is accessible in the Windows File Explorer: there is a "Map Network Drive" button above the files/folders list. (On Windows 8, first select the "This PC" node in the left-hand pane of Windows Explorer.)

You can now use Windows Explorer to drag-and-drop files into the @files directory on the server.

Native OSX WebDAV Client

When using OSX, you do not need to install a 3rd party WebDAV client. You can mount a WebDAV share via the dialog at Go > Connect to Server. Enter a URL of the form:

https://<username%40domain.com>@<www.sitename.org>/_webdav/<projectname>/

To have the URL generated for you, see the instructions above for Cyberduck.

  • <username%40domain.com> - The email address you use to log in to your LabKey Server, with the @ symbol replaced with %40. Example: Use username%40labkey.com for username@labkey.com
  • <www.sitename.org> - The URL of your LabKey Server. Example: www.mysite.org
  • <projectname> - The name of the project you would like to access. If you need to force a login, this project should provide no guest access. Example: _secret
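The substitutions above can be sketched in a short shell snippet (all values below are hypothetical examples, not real accounts):

```shell
# Build the macOS "Connect to Server" URL from its parts.
# All values below are hypothetical examples; substitute your own.
user="username@labkey.com"
site="www.mysite.org"
project="myProject"
# Percent-encode the @ in the email address as %40.
encoded_user=$(printf '%s' "$user" | sed 's/@/%40/')
url="https://${encoded_user}@${site}/_webdav/${project}/"
echo "$url"
# Prints: https://username%40labkey.com@www.mysite.org/_webdav/myProject/
```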

Linux WebDAV Clients

Tested clients:

  • Gnome Desktop: Nautilus file browser can mount a WebDAV share like an NFS share.
  • KDE Desktop: Has drop down for mounting a WebDAV share like an NFS share.
  • cadaver: Command line tool. Similar to FTP.

Related Topics




Enterprise Pipeline


The Enterprise Pipeline is a special configuration of the data processing pipeline. Instead of running all of the tasks on the same machine as your LabKey Server instance, it runs some of them on a remote pipeline server. While many examples in this section use mass spectrometry scenarios, the enterprise pipeline is not specific to that application.

Topics

Assumptions

This documentation assumes the LabKey Server and the Enterprise Pipeline will be configured to work in the following architecture:

  • All files (both sample files and result files from searches) will be stored on a Shared File System
  • LabKey Server will mount the Shared File System.
    • Some third party tools may require a Windows installation of LabKey Server, but a LabKey remote server can be deployed on any platform that is supported by LabKey Server itself.
  • Conversion of RAW files to mzXML format will be included in the pipeline processing
    • The remote server running the conversion will mount the Shared File System
  • MS2 pipeline analysis tools (X!Tandem, TPP, etc) can be executed on a remote server
    • Remote servers will mount the Shared File System

Prerequisites

Install prerequisites for using the Enterprise Pipeline:




JMS Queue


The pipeline requires a JMS Queue to transfer messages between the different pipeline services. The LabKey Server currently supports the ActiveMQ JMS Queue from the Apache Software Foundation.

JMS: Installation Steps

  1. Choose a server on which to run the JMS Queue
  2. Install Java
  3. Install and Configure ActiveMQ
  4. Test the ActiveMQ Installation
ActiveMQ supports all major operating systems (including Windows, Linux, and OSX). It is common, but not required, for it to be deployed on the LabKey Server web server. For this documentation we will assume you are installing on a Linux-based server.

Install Java

  1. Install the Java version supported by your server. For details see Supported Technologies.
  2. Create the JAVA_HOME environment variable to point at your installation directory.
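On a Linux server, this is typically done in a shell profile. The path below is a hypothetical example; adjust it to your JDK install:

```shell
# Point JAVA_HOME at the JDK installation directory (example path; adjust to yours).
export JAVA_HOME=/usr/lib/jvm/java-11-openjdk
# Make the JDK's binaries available on the PATH as well.
export PATH="$JAVA_HOME/bin:$PATH"
echo "$JAVA_HOME"
```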

Install and Configure ActiveMQ

Note: LabKey currently supports ActiveMQ 5.1.0 only.

Download and Unpack the distribution

  1. Download ActiveMQ from ActiveMQ's download site
  2. Unpack the binary distribution into /usr/local
    1. This will create /usr/local/apache-activemq-5.1.0
  3. Create the environment variable ACTIVEMQ_HOME and have it point at /usr/local/apache-activemq-5.1.0
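The unpack steps can be sketched as follows. The snippet uses a throwaway archive under /tmp so the commands can run anywhere; a real install extracts the downloaded ActiveMQ tarball into /usr/local instead:

```shell
# Simulate the unpack steps with a throwaway archive under /tmp.
# A real install extracts the downloaded apache-activemq-5.1.0 tarball
# into /usr/local instead.
mkdir -p /tmp/aq-demo/apache-activemq-5.1.0/bin
tar -czf /tmp/aq-demo.tar.gz -C /tmp/aq-demo apache-activemq-5.1.0
mkdir -p /tmp/aq-install
tar -xzf /tmp/aq-demo.tar.gz -C /tmp/aq-install
export ACTIVEMQ_HOME=/tmp/aq-install/apache-activemq-5.1.0
echo "$ACTIVEMQ_HOME"
```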

Configure logging for the ActiveMQ server

To log all messages sent through the JMS Queue, add the following to the <broker> node in the config file located at <ACTIVEMQ_HOME>/conf/activemq.xml:

<plugins>
<!-- lets enable detailed logging in the broker -->
<loggingBrokerPlugin/>
</plugins>

During the installation and testing of the ActiveMQ server, you might want to show the debug output for the JMS Queue software. You can enable this by editing the file <ACTIVEMQ_HOME>/conf/log4j.properties

uncomment

#log4j.rootLogger=DEBUG, stdout, out

and comment out

log4j.rootLogger=INFO, stdout, out
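The net effect of the two edits on the rootLogger line can be sketched with sed, here operating on a sample line rather than the real file:

```shell
# Show the intended change to the rootLogger line (sample input, not the real file).
printf 'log4j.rootLogger=INFO, stdout, out\n' |
  sed 's/^log4j.rootLogger=INFO/log4j.rootLogger=DEBUG/'
# Prints: log4j.rootLogger=DEBUG, stdout, out
```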

Authentication, Management and Configuration

  1. Configure JMX to allow use of JConsole and the JMS administration tools to monitor the JMS Queue.
  2. We recommend configuring authentication for your ActiveMQ server. There are a number of ways to implement authentication. See http://activemq.apache.org/security.html
  3. We recommend configuring ActiveMQ to create the required queues at startup. This can be done by adding the following to the configuration file <ACTIVEMQ_HOME>/conf/activemq.xml
<destinations>
<queue physicalName="job.queue" />
<queue physicalName="status.queue" />
</destinations>

Start the server

The command below starts the ActiveMQ server with the following settings:
    • Logs will be written to <ACTIVEMQ_HOME>/data/activemq.log
    • StdOut will be written to <ACTIVEMQ_HOME>/smlog
    • JMS Queue messages, status information, etc. will be stored in <ACTIVEMQ_HOME>/data
    • The job.queue and status.queue queues will be durable and persistent (i.e., messages on the queue will be saved through a restart of the process)
    • The AMQ Message Store is used to store queue messages and status information
To start the server, execute

<ACTIVEMQ_HOME>/bin/activemq-admin start xbean:<ACTIVEMQ_HOME>/conf/activemq.xml > <ACTIVEMQ_HOME>/smlog 2>&1 &

Monitoring JMS Server, Viewing JMS Queue Configuration and Viewing Messages on a JMS Queue

Using the ActiveMQ management tools

Browse the messages on a queue by running:

<ACTIVEMQ_HOME>/bin/activemq-admin browse --amqurl tcp://localhost:61616 job.queue

View the server's runtime configuration, usage, and status by running:

<ACTIVEMQ_HOME>/bin/activemq-admin query

Related Topics




RAW to mzXML Converters


These instructions explain how to manually install the LabKey Enterprise Pipeline MS2 Conversion Service. The Conversion Service converts the output of MS2 instruments to the mzXML format used by LabKey Server.
Note that the Conversion Service is optional, and only required if you plan to convert files to mzXML format in your pipeline.

Installation Requirements

  1. Choose a Windows-based server to run the Conversion Service
  2. Install Java
  3. Install ProteoWizard. (ReAdW.exe is also supported for backward compatibility for ThermoFinnigan instruments.)
  4. Test the Converter Installation

Choose a Server to Run the Conversion Service

The Conversion server must run the Windows Operating System (Vendor software libraries currently only run on the Windows OS).

Install Java

Download the Java version supported by your server. For details see Supported Technologies.

Install the Vendor Software for the Supported Converters

Currently LabKey Server supports the following vendors

  • ThermoFinnigan
  • Waters
Install the software following the instructions provided by the vendor.

Install ProteoWizard

Download the converter executables from the ProteoWizard project.

Install the executables and copy them into the <LABKEY_HOME>\bin directory:

  1. Create the directory c:\labkey to be the <LABKEY_HOME> directory
  2. Create the binary directory c:\labkey\bin
  3. Place the <LABKEY_HOME>\bin directory on the PATH System Variable using the System Control Panel
  4. Install the downloaded files and copy the executable files to <LABKEY_HOME>\bin

Test the Converter Installation

For the sake of this document, we will use the example of converting a RAW file using msconvert. Testing for other vendor formats is similar.

  1. Choose a RAW file to use for this test. For this example, the file will be called convertSample.RAW
  2. Place the file in a temporary directory on the computer. For this example, we will use c:\conversion
  3. Open a Command Prompt and change directory to c:\conversion
  4. Attempt to convert the sample RAW file to mzXML using msconvert.exe. Note that the first time you perform a conversion, you may need to accept a license agreement.
C:\conversion> dir
Volume in drive C has no label.
Volume Serial Number is 30As-59FG

Directory of C:\conversion

04/09/2008 12:39 PM <DIR> .
04/09/2008 12:39 PM <DIR> ..
04/09/2008 11:00 AM 82,665,342 convertSample.RAW

C:\conversion>msconvert.exe convertSample.RAW --mzXML
format: mzXML (Precision_64 [ 1000514:Precision_64 1000515:Precision_32 ], ByteOrder_LittleEndian, Compression_None) indexed="true"
outputPath: .
extension: .mzXML
contactFilename:

filters:

filenames:
convertSample.raw

processing file: convertSample.raw
writing output file: ./convertSample.mzXML

C:\conversion> dir
Volume in drive C has no label.
Volume Serial Number is 20AC-9682

Directory of C:\conversion

04/09/2008 12:39 PM <DIR> .
04/09/2008 12:39 PM <DIR> ..
04/09/2008 11:15 AM 112,583,326 convertSample.mzXML
04/09/2008 11:00 AM 82,665,342 convertSample.RAW

Resources

Converters.zip




Configure LabKey Server to use the Enterprise Pipeline


This topic covers configuring LabKey Server to use the Enterprise Pipeline. Be sure to complete the steps in the prerequisites section. If necessary, you will also create the LabKey tools directory where programs such as the MS2 analysis tools will be installed.

Assumptions

The Enterprise Pipeline allows work to be distributed to remote servers, separate from the LabKey Server web server. The following assumptions and prerequisites apply:

  • Use of a Network File System: The LabKey web server and remote servers must be able to mount the following resources:
    • Pipeline directory (location where mzXML, pepXML, etc. files are located)
    • Pipeline bin directory (location where third-party tools (TPP, X!Tandem, etc.) are located)
  • MS2 analysis tools will be run on a separate server.
  • A version of Java supported by your version of LabKey Server for each location that will be running tasks.
  • You have downloaded or built from source the following files:
    • LabKey Server
    • LabKey Server Enterprise Pipeline Configuration files

Prerequisites

In order to install the LabKey Enterprise Pipeline, you will first need to install and configure the following prerequisite software:

Enable Communication with the ActiveMQ JMS Queue

You will need to add the following settings to the LabKey configuration file. Before adding this section, check to see if it is already present but commented out, and if so simply uncomment it:

<Resource name="jms/ConnectionFactory" auth="Container"
type="org.apache.activemq.ActiveMQConnectionFactory"
factory="org.apache.activemq.jndi.JNDIReferenceFactory"
description="JMS Connection Factory"
brokerURL="tcp://@@JMSQUEUE@@:61616"
brokerName="LocalActiveMQBroker"/>

You will need to change the setting for brokerURL to point to the location of your ActiveMQ installation (i.e. replace @@JMSQUEUE@@ with the hostname of the server running the ActiveMQ software).
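As a sketch, the substitution can be made with sed; here it operates on a sample line, and the hostname mq.example.org is hypothetical:

```shell
# Substitute a hypothetical JMS host into the brokerURL attribute.
printf 'brokerURL="tcp://@@JMSQUEUE@@:61616"\n' |
  sed 's/@@JMSQUEUE@@/mq.example.org/'
# Prints: brokerURL="tcp://mq.example.org:61616"
```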

Set the Enterprise Pipeline Configuration Directory (Optional)

By default, the system looks for the pipeline configuration files in the following directory: LABKEY_HOME/config.

To specify a different location, add (or uncomment) the following parameter to the LabKey configuration file:

<Parameter name="org.labkey.api.pipeline.config" value="C:/path-to-config"/>

Set this to the location of your Enterprise Pipeline configuration directory.

Create the Enterprise Pipeline Configuration Files

  • Unzip the Enterprise Pipeline Configuration distribution and copy the webserver configuration file to the Pipeline Configuration directory specified in the last step (i.e., <LABKEY_HOME>/config).
  • The configuration file is ms2config.xml which includes:
    • Where MS2 searches will be performed (on a remote server or locally on the web server)
    • Where the Conversion of raw files to mzXML will occur (if required)
    • Which analysis tools will be executed during a MS2 search

Restart LabKey Server

In order for the LabKey Server to use the new Enterprise Pipeline configuration settings, the Tomcat process will need to be restarted. Confirm that the server started up with no errors:

  • Log on to your LabKey Server using a Site Admin account.
  • Select (Admin) > Site > Admin Console.
  • Under Diagnostics, click View all site errors.
  • Check to see that no errors have occurred after the restart.

Create the LABKEY_TOOLS Directory

The <LABKEY_TOOLS> directory will contain all the files necessary to perform MS2 searches on the remote server. The directory will contain:

  • Required LabKey software and configuration files
  • TPP tools
  • XTandem search engine
  • Additional MS2 analysis tools

Create the <LABKEY_TOOLS> Directory

Create the <LABKEY_TOOLS> directory on the remote server. It can be on the local file system, or available as a network file share.

Download the Required LabKey Software

  1. Unzip the LabKey Server distribution into the directory <LABKEY_TOOLS>/labkey/dist
  2. Unzip the LabKey Server Pipeline Configuration distribution into the directory <LABKEY_TOOLS>/labkey/dist/conf
NOTE: For the next section you will need to know the paths to the <LABKEY_TOOLS>/labkey directory and the <LABKEY_TOOLS>/external directory on the remote server.

Install the LabKey Software into the <LABKEY_TOOLS> Directory

Copy the following to the <LABKEY_TOOLS>/labkey directory

  • The directory <LABKEY_TOOLS>/labkey/dist/LabKey<VERSION>/labkeywebapp
  • The directory <LABKEY_TOOLS>/labkey/dist/LabKey<VERSION>/modules
  • The directory <LABKEY_TOOLS>/labkey/dist/LabKey<VERSION>/pipeline-lib
  • The file <LABKEY_TOOLS>/labkey/dist/LabKey<VERSION>/tomcat-lib/labkeyBootstrap-X.Y.jar
Expand all modules in the <LABKEY_TOOLS>/labkey/modules directory by running:

cd <LABKEY_TOOLS>/labkey/
java -jar labkeyBootstrap-X.Y.jar

Create the Enterprise Pipeline Configuration Files

There are two configuration files used on the Remote Server:

  • pipelineConfig.xml
  • ms2Config.xml

Install the MS2 Analysis Tools

These tools will be installed in the <LABKEY_TOOLS>/bin directory on the Remote Server.

Test the Configuration

There are a few simple tests that can be performed at this stage to verify that the configuration is correct. These tests focus on ensuring that a remote server can perform an MS2 search:

  1. Can the remote server see the Pipeline Directory and the <LABKEY_TOOLS> directory?
  2. Can the remote server execute X!Tandem?
  3. Can the remote server execute the Java binary?
  4. Can the remote server execute an X!Tandem search against an mzXML file located in the Pipeline Directory?
  5. Can the remote server run PeptideProphet against the resultant pepXML file?
  6. Can the remote server execute the X!Tandem search again, this time using the LabKey Java code located on the remote server?
Once all these tests are successful, you will have a working Enterprise Pipeline. The next step is to configure a new Project on your LabKey Server and configure the Project's pipeline to use the Enterprise Pipeline.
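A first pass at the "can the remote server execute" checks can be scripted. The tool names below are examples only (the X!Tandem binary name varies by build); adjust them for your installation:

```shell
# Report whether each required executable is visible on the PATH.
# Tool names are examples; the X!Tandem binary name varies by build.
for tool in java tandem; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "$tool: found"
  else
    echo "$tool: MISSING"
  fi
done
```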

Related Topics




Configure the Conversion Service


This page explains how to configure the LabKey Server Enterprise Pipeline Conversion Service.

Assumptions

This documentation will describe how to configure the LabKey Server Enterprise Pipeline to convert native instrument data files (such as .RAW) to mzXML using the msconvert software that is part of ProteoWizard.

  • The Conversion Server can be configured to convert from native acquisition files for a number of manufacturers.
  • Use of a Shared File System: The LabKey Conversion server must be able to mount the following resources
    • Pipeline directory (location where mzXML, pepXML, etc files are located)
  • A version of Java compatible with your LabKey Server version is installed
  • You have downloaded (or built from the Subversion source control system) the following files
    • LabKey Server
    • LabKey Server Enterprise Pipeline Configuration files

Download and Expand the LabKey Conversion Server Software

  1. Create the <LABKEY_HOME> directory (LabKey recommends you use C:\labkey\labkey)
  2. Unzip the LabKey Server distribution into the directory <LABKEY_HOME>\dist
  3. Unzip the LabKey Server Pipeline Configuration distribution into the directory <LABKEY_HOME>\dist

Install the LabKey Software

Copy the following to the <LABKEY_HOME> directory:

  • The directory <LABKEY_HOME>\dist\LabKeyX.X-xxxxx-Bin\labkeywebapp
  • The directory <LABKEY_HOME>\dist\LabKeyX.X-xxxxx-Bin\modules
  • The directory <LABKEY_HOME>\dist\LabKeyX.X-xxxxx-Bin\pipeline-lib
  • The file <LABKEY_HOME>\dist\LabKeyX.X-xxxxx-Bin\tomcat-lib\labkeyBootstrap.jar
Copy the following to the <LABKEY_HOME>\config directory:
  • All files in the directory <LABKEY_HOME>\dist\LabKeyX.X-xxxxx-PipelineConfig\remote
Expand all modules in the <LABKEY_HOME>\modules directory by running the following from a Command Prompt:

cd <LABKEY_HOME>
java -jar labkeyBootstrap.jar

In the System Control Panel, create the LABKEY_ROOT environment variable and set it to <LABKEY_HOME>. This should be a System Variable.

Create the Tools Directory

This is the location where the Conversion tools (msconvert.exe, etc) binaries are located. For most installations this should be set to <LABKEY_HOME>\bin

  • Place the <LABKEY_HOME>\bin directory on the PATH System Variable using the System Control Panel
  • Copy the conversion executable files to <LABKEY_HOME>\bin

Edit the Enterprise Pipeline Configuration File (pipelineConfig.xml)

The pipelineConfig.xml file enables communication with: (1) the JMS queue, (2) the RAW files to be converted, and (3) the tools that perform the conversion.

An example pipelineConfig.xml File

<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans-2.0.xsd">

<bean id="activeMqConnectionFactory" class="org.apache.activemq.ActiveMQConnectionFactory">
<constructor-arg value="tcp://localhost:61616"/>
<property name="userName" value="someUsername" />
<property name="password" value="somePassword" />
</bean>

<bean id="pipelineJobService" class="org.labkey.pipeline.api.PipelineJobServiceImpl">
<property name="workDirFactory">
<bean class="org.labkey.pipeline.api.WorkDirectoryRemote$Factory">
<!--<property name="lockDirectory" value="T:/tools/bin/syncp-locks"/>-->
<property name="cleanupOnStartup" value="true" />
<property name="tempDirectory" value="c:/temp/remoteTempDir" />
</bean>
</property>
<property name="remoteServerProperties">
<bean class="org.labkey.pipeline.api.properties.RemoteServerPropertiesImpl">
<property name="location" value="mzxmlconvert"/>
</bean>
</property>

<property name="appProperties">
<bean class="org.labkey.pipeline.api.properties.ApplicationPropertiesImpl">
<property name="networkDriveLetter" value="t" />
<property name="networkDrivePath" value="\\someServer\somePath" />
<property name="networkDriveUser" value="someUser" />
<property name="networkDrivePassword" value="somePassword" />

<property name="toolsDirectory" value="c:/labkey/build/deploy/bin" />
</bean>
</property>
</bean>
</beans>

Enable Communication with the JMS Queue

Edit the following lines in the <LABKEY_HOME>\config\pipelineConfig.xml

<bean id="activeMqConnectionFactory" class="org.apache.activemq.ActiveMQConnectionFactory">
<constructor-arg value="tcp://@@JMSQUEUE@@:61616"/>
</bean>

and change @@JMSQUEUE@@ to be the name of your JMS Queue server.

Configure the WORK DIRECTORY

The WORK DIRECTORY is the directory on the server where RAW files are placed while being converted to mzXML. There are three properties that can be set:

  • lockDirectory: This config property helps throttle the total number of network file operations running across all machines. Typically commented out.
  • cleanupOnStartup: This setting tells the Conversion server to delete all files in the WORK DIRECTORY at startup. This ensures that corrupted files are not used during conversion
  • tempDirectory: This is the location of the WORK DIRECTORY on the server
To set these variables, edit the following lines in <LABKEY_HOME>\config\pipelineConfig.xml:

<property name="workDirFactory">
<bean class="org.labkey.pipeline.api.WorkDirectoryRemote$Factory">
<!-- <property name="lockDirectory" value="T:/tools/bin/syncp-locks"/> -->
<property name="cleanupOnStartup" value="true" />
<property name="tempDirectory" value="c:/TempDir" />
</bean>
</property>

Configure the Application Properties

There are two properties that must be set:

  • toolsDirectory: This is the location where the Conversion tools (msconvert.exe, etc) are located. For most installations this should be set to <LABKEY_HOME>\bin
  • networkDrive settings: These settings specify the location of the shared network storage system. You will need to specify the appropriate drive letter, UNC PATH, username and password for the Conversion Server to mount the drive at startup.
To set these variables edit <LABKEY_HOME>\config\pipelineConfig.xml

Change all values surrounded by "@@...@@" to fit your environment:

  • @@networkDriveLetter@@ - Provide the letter name of the drive you are mapping to.
  • @@networkDrivePath@@ - Provide a server and path to the shared folder, for example: \\myServer\folderPath
  • @@networkDriveUser@@ and @@networkDrivePassword@@ - Provide the username and password of the shared folder.
  • @@toolsDirectory@@ - Provide the path to the bin directory, for example: C:\labkey\bin
<property name="appProperties">
<bean class="org.labkey.pipeline.api.properties.ApplicationPropertiesImpl">

<!-- If the user is mapping a drive, fill in this section with their input -->
<property name="networkDriveLetter" value="@@networkDriveLetter@@" />
<property name="networkDrivePath" value="@@networkDrivePath@@" />
<property name="networkDriveUser" value="@@networkDriveUser@@" />
<property name="networkDrivePassword" value="@@networkDrivePassword@@" />

<!-- Enter the bin directory, based on the install location -->
<property name="toolsDirectory" value="@@toolsDirectory@@" />
</bean>
</property>

Edit the Enterprise Pipeline MS2 Configuration File (ms2Config.xml)

The MS2 configuration settings are located in the file <LABKEY_HOME>\config\ms2Config.xml

An example configuration for running msconvert on a remote server named "mzxmlconvert":

<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans-2.0.xsd">

<bean id="ms2PipelineOverrides" class="org.labkey.api.pipeline.TaskPipelineRegistrar">
<property name="factories">
<list>
<!-- This reference and its related bean below enable RAW to mzXML conversion -->
<ref bean="mzxmlConverterOverride"/>
</list>
</property>
</bean>

<!-- Enable Thermo RAW to mzXML conversion using msConvert. -->
<bean id="mzxmlConverterOverride" class="org.labkey.api.pipeline.cmd.ConvertTaskFactorySettings">
<constructor-arg value="mzxmlConverter"/>
<property name="cloneName" value="mzxmlConverter"/>
<property name="commands">
<list>
<ref bean="msConvertCommandOverride"/>
</list>
</property>
</bean>

<!-- Configuration to customize behavior of msConvert -->
<bean id="msConvertCommandOverride" class="org.labkey.api.pipeline.cmd.CommandTaskFactorySettings">
<constructor-arg value="msConvertCommand"/>
<property name="cloneName" value="msConvertCommand"/>
<!-- Run msconvert on a remote server named "mzxmlconvert" -->
<property name="location" value="mzxmlconvert"/>
</bean>
</beans>

Install the LabKey Remote Server as a Windows Service

LabKey uses procrun to run the Conversion Service as a Windows Service. This means you will be able to have the Conversion Service start up when the server boots and be able to control the Service via the Windows Service Control Panel.

Set the LABKEY_ROOT Environment Variable.

In the System Control Panel, create the LABKEY_ROOT environment variable and set it to <LABKEY_HOME> (where <LABKEY_HOME> is the target install directory). This should be a System Environment Variable.

Install the LabKey Remote Service

  • Copy *.exe and *.bat from the directory <LABKEY_HOME>\dist\LabKeyX.X-xxxxx-PipelineConfig\remote to <LABKEY_HOME>\bin
  • For 32-bit Windows installations, install the service by running the following from the Command Prompt:
set LABKEY_ROOT=<LABKEY_HOME>
<LABKEY_HOME>\bin\installServiceWin32.bat
  • For 64-bit Windows installations, install the service by running the following from the Command Prompt:
set LABKEY_ROOT=<LABKEY_HOME>
<LABKEY_HOME>\bin\installServiceWin64.bat

where <LABKEY_HOME> is the directory where LabKey is installed. For example, if installed in c:\labkey\labkey, then the command is:

set LABKEY_ROOT=c:\labkey\labkey

If the command succeeded, it will have created a new Windows Service named LabKeyRemoteServer.

How to Uninstall the LabKey Remote Pipeline Service

  • For 32-bit Windows installations, run the following:
<LABKEY_HOME>\bin\service\removeServiceWin32.bat
  • For 64-bit Windows installations, run the following:
<LABKEY_HOME>\bin\service\removeServiceWin64.bat

To Change the Service:

  • Uninstall the service as described above.
  • Reboot the server.
  • Edit <LABKEY_HOME>\bin\service\installServiceWin32.bat or <LABKEY_HOME>\bin\service\installServiceWin64.bat as appropriate, make the necessary changes, and run:
<LABKEY_HOME>\bin\service\installService.bat

How to Manage the LabKey Remote Windows Service

How to start the service:

From the command prompt you can run

net start LabKeyRemoteServer

How to stop the service:

From the command prompt you can run

net stop LabKeyRemoteServer

Where are the Log Files Located

All logs from the LabKey Remote Server are located in <LABKEY_HOME>\logs\output.log

NOTE: If running Windows, this service cannot be run as the Local System user. You will need to change the LabKey Remote Pipeline Service to log on as a different user.

Related Topics




Configure Remote Pipeline Server


This page explains how to configure the LabKey Server Enterprise Pipeline Remote Server. The Remote Server can be used to execute X!Tandem or SEQUEST MS/MS searches on a separate computer from LabKey Server. It can also be used to run a raw-data-to-mzXML conversion server, or to run other pipeline-configured tools.

Assumptions

  • Use of a Shared File System: The LabKey Remote Server must be able to mount the following resources
    • Pipeline directory (location where mzXML, pepXML, etc files are located)
  • Java is installed. It must be a supported version for the version of LabKey Server that is being deployed, and match the version being used for the web server.
  • You have downloaded the version of LabKey Server you wish to deploy

Install the Enterprise Pipeline

Download and expand LabKey Server

NOTE: You will use the same distribution software for this server as you use for the LabKey Server web server. We recommend simply copying the downloaded distribution files from your LabKey Server.

  • Create the <LABKEY_HOME> directory
    • On Windows: LabKey recommends you use c:\LabKey
    • On Linux or OSX: LabKey recommends you use /usr/local/labkey
  • Unzip the LabKey Server distribution into the directory <LABKEY_HOME>\src

Install the LabKey Software

Copy the following to the <LABKEY_HOME> directory

  • The directory <LABKEY_HOME>\src\LabKeyX.X-xxxxx-Bin\labkeywebapp
  • The directory <LABKEY_HOME>\src\LabKeyX.X-xxxxx-Bin\modules
  • The directory <LABKEY_HOME>\src\LabKeyX.X-xxxxx-Bin\pipeline-lib
  • The directory <LABKEY_HOME>\src\LabKeyX.X-xxxxx-Bin\bin
  • The file <LABKEY_HOME>\src\LabKeyX.X-xxxxx-Bin\tomcat-lib\labkeyBootstrap.jar

Create the Configuration directory

  • Create the directory <LABKEY_HOME>\config

Create the Temporary Working directory for this server

  • Create the directory <LABKEY_HOME>\RemoteTempDirectory

Create the directory to hold the FASTA indexes for this server (for installations using Sequest)

  • Create the directory <LABKEY_HOME>\FastaIndices

Create the logs directory

  • Create the directory <LABKEY_HOME>\logs

Install the Pipeline Configuration Files

There are XML files that need to be configured on the Remote Server. The configuration settings will differ depending on the use of the LabKey Remote Pipeline Server.

Download the Enterprise Pipeline Configuration Files

Configuration Settings for using the Enhanced Sequest MS2 Pipeline

pipelineConfig.xml: This file holds the configuration for the pipeline. To install:

  1. Copy <LABKEY_HOME>\src\LabKeyX.X-xxxxx-PipelineConfig\remote\pipelineConfig.xml to <LABKEY_HOME>\config
  2. Copy <LABKEY_HOME>\src\LabKeyX.X-xxxxx-PipelineConfig\webserver\ms2Config.xml to <LABKEY_HOME>\config

There are a few important settings that may need to be changed:

  • tempDirectory: set to <LABKEY_HOME>\RemoteTempDirectory
  • toolsDirectory: set to <LABKEY_HOME>\bin
  • location: set to sequest
  • Network Drive Configuration: You will need to set the variables in this section of the configuration. In order for the Enhanced SEQUEST MS2 Pipeline to function, the LabKey Remote Pipeline Server will need to be able to access the same files as the LabKey Server via a network drive. The configuration below will allow the LabKey Remote Pipeline Server to create a new network drive.
    <property name="appProperties"> 
    <bean class="org.labkey.pipeline.api.properties.ApplicationPropertiesImpl">
    <property name="networkDriveLetter" value="t" />
    <property name="networkDrivePath" value="\\@@SERVER@@\@@SHARE@@" />
    <!-- Map the network drive manually in dev mode, or supply a user and password -->
    <property name="networkDriveUser" value="@@USER@@" />
    <property name="networkDrivePassword" value="@@PASSWORD@@" />
    </bean>
    </property>
  • Enable communication with the JMS Queue by changing @@JMSQUEUE@@ to the hostname of the server where you installed the ActiveMQ software, in the code that looks like:
    <bean id="activeMqConnectionFactory" class="org.apache.activemq.ActiveMQConnectionFactory"> 
    <constructor-arg value="tcp://@@JMSQUEUE@@:61616"/>
    </bean>

ms2Config.xml: This file holds the configuration settings for MS2 searches. Change the configuration section:

<bean id="sequestTaskOverride" class="org.labkey.ms2.pipeline.sequest.SequestSearchTask$Factory"> 
<property name="location" value="sequest"/>
</bean>
to
<bean id="sequestTaskOverride" class="org.labkey.ms2.pipeline.sequest.SequestSearchTask$Factory"> 
<property name="sequestInstallDir" value="C:\Program Files (x86)\Thermo\Discoverer\Tools\Sequest"/>
<property name="indexRootDir" value="C:\FastaIndices"/>
<property name="location" value="sequest"/>
</bean>

Configuration Settings for executing X!Tandem searches on the LabKey Remote Pipeline Server

If you are attempting to enable this configuration, you may find assistance by searching the inactive Proteomics Discussion Board, or contact us on the Community Support Forum.

Install the LabKey Remote Server as a Windows Service

LabKey uses procrun to run the Remote Server as a Windows Service, so the Remote Server can start automatically when the machine boots and can be controlled via the Windows Service Control Panel.

Set the LABKEY_ROOT environment variable

In the System Control Panel, create the LABKEY_ROOT environment variable and set it to <LABKEY_HOME>. This should be a System Environment Variable.

Install the LabKey Remote Service

This assumes that you are running a 64-bit version of Windows and have a 64-bit Java Virtual Machine installed.

  • Copy *.bat from the directory <LABKEY_HOME>\dist\LabKeyX.X-xxxxx-PipelineConfig\remote to <LABKEY_HOME>\bin
  • Download the latest version of the Apache Commons Daemon to <LABKEY_HOME>\dist
  • Expand the downloaded software
  • Copy the following from the expanded directory to <LABKEY_HOME>\bin
    • prunmgr.exe to <LABKEY_HOME>\bin\prunmgr.exe
    • amd64\prunsrv.exe to <LABKEY_HOME>\bin\prunsrv.exe
    • amd64\prunsrv.exe to <LABKEY_HOME>\bin\procrun.exe
  • Install the Windows Service by running the following from the Command Prompt:
    set LABKEY_ROOT=<LABKEY_HOME> 
    <LABKEY_HOME>\bin\service\installService.bat
where <LABKEY_HOME> is the directory where LabKey is installed. For example, if it is installed in c:\labkey, the command is
set LABKEY_ROOT=c:\labkey

If the installService.bat command succeeded, it should have created a new Windows Service named LabKeyRemoteServer.

Starting and Stopping the LabKey Remote Windows Service

To start the service, from a command prompt, run:

net start LabKeyRemoteServer

To stop the service, from a command prompt, run:

net stop LabKeyRemoteServer

Log File Locations

All logs from the LabKey Remote Server are located in <LABKEY_HOME>\logs

Related Topics




Use the Enterprise Pipeline


This topic explains how to configure a project to use the Enterprise Pipeline, using MS2 searches as the example. For these instructions, we will create a new project and configure a pipeline for it.

Prerequisites: Before performing the tasks below, you must:

Create a New Project to Test the Enterprise Pipeline

You can skip this step if a project already exists that you would rather use.

  • Log on to your LabKey Server using a Site Admin account
  • Create a new project with the following options:
    • Project Name: PipelineTest
    • Folder Type: MS2
  • Accept the default settings during the remaining wizard panels

For more information see Create a Project or Folder.

Configure the Project to use the Enterprise Pipeline

The following information will be required in order to configure this project to use the Enterprise Pipeline:

  • Pipeline Root Directory

Set Up the Pipeline

  • In the Data Pipeline web part, click Setup.
  • Enter the following information:
    • Path to the desired pipeline root directory on the web server
    • Specific settings and parameters for the relevant sections
  • Click Save.
  • Return to the MS2 Dashboard by clicking the PipelineTest link near the top of the page.

Run the Enterprise Pipeline

To test the Enterprise Pipeline:

  • In the Data Pipeline web part, click Process and Upload Data.
  • Navigate to and select an mzXML file, then click X!Tandem Peptide Search.

Most jobs are configured to run single-threaded. The pipeline assigns work to different thread pools. There are two main ones for work that runs on the web server, each with one thread in it. The pipeline can be configured to run with more threads or additional thread pools if necessary. In many scenarios, trying to run multiple imports in parallel or some third-party tools in parallel degrades performance vs running them sequentially.
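To illustrate why single-threaded pools serialize work, here is a minimal Python sketch (not LabKey code) showing that a one-thread pool runs submitted tasks strictly in submission order, which is why parallel submissions simply queue up rather than run concurrently:

```python
from concurrent.futures import ThreadPoolExecutor

def run_tasks(task_names, max_workers=1):
    """Run each named task in a thread pool and record completion order."""
    completed = []

    def run(name):
        # A real pipeline task would do its work here.
        completed.append(name)
        return name

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(run, n) for n in task_names]
        for f in futures:
            f.result()  # propagate any task errors
    return completed

# With one worker (like the default web-server locations), tasks
# finish in exactly the order they were submitted.
order = run_tasks(["import-a", "import-b", "import-c"])
```

With `max_workers=1` the pool behaves like a FIFO queue; raising it allows true parallelism, with the performance caveats noted above.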

Related Topics




Troubleshoot the Enterprise Pipeline


This topic covers some general information about monitoring, maintaining, and troubleshooting the Enterprise Pipeline.

Determine Which Jobs and Tasks Are Actively Running

Each job in the pipeline is composed of one or more tasks. These tasks are assigned to run at a particular location. Locations might include the web server and one or more remote servers for RAW to mzXML conversion and other tools. Each location may have one or more worker threads that run the tasks. A typical installation might have the following locations that run the specified tasks:

Location                    # of threads   Tasks
Web Server                  1              CHECK FASTA, IMPORT RESULTS
Web Server, high priority   1              MOVE RUNS
Conversion server           1+             MZXML CONVERSION
Other remote server         1+             SEARCH, ANALYSIS

When jobs are submitted, the first task in the pipeline will be added to the queue in the WAITING (SEARCH WAITING, for example) state. As soon as there is a worker thread available, it will take the job from the queue and change the state to RUNNING. When it is done, it will put the task back on the queue in the COMPLETE state. The web server should immediately advance the job to the next task and put it back in the queue in the WAITING state.

If jobs remain in an intermediate COMPLETE state for more than a few seconds, there is something wrong and the pipeline is not properly advancing the jobs.

Similarly, if there are jobs in the WAITING state for any of the locations, and no jobs in the RUNNING state for those locations, something is wrong and the pipeline is not properly running the jobs.
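The state protocol described above can be sketched in a few lines of Python. This is an illustrative model only (the function and field names are ours, not LabKey's): each call simulates a worker thread taking the job's current task and the web server advancing it afterwards.

```python
def advance(job):
    """Run one task of a job and return the job's new queue state."""
    if job["task_index"] >= len(job["tasks"]):
        return "COMPLETE"           # no tasks remain
    job["state"] = "RUNNING"        # worker thread has taken the task
    # ... the task's real work would happen here ...
    job["task_index"] += 1
    if job["task_index"] < len(job["tasks"]):
        job["state"] = "WAITING"    # re-queued for the next task
    else:
        job["state"] = "COMPLETE"   # all tasks done
    return job["state"]

job = {"tasks": ["MZXML CONVERSION", "SEARCH"], "task_index": 0, "state": "WAITING"}
```

A job stuck in an intermediate COMPLETE state corresponds to the web server failing to perform the re-queue step between tasks.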

Troubleshooting Stuck Jobs

Waiting for Other Jobs to Complete

If jobs are sitting in a WAITING state, other jobs may be running, perhaps in other folders. Check whether any other jobs are running via the Pipeline link on the Admin Console and filter the list of jobs. If other jobs are running, your job may simply be waiting in the queue.

ActiveMQ/JMS Connection Lost

If no jobs are actively running, the server may have lost connectivity with the rest of the system. Check the labkey.log file for errors. A message like this (abbreviated for readability):

org.mule.providers.FatalConnectException: ReconnectStrategy "org.mule.providers.SimpleRetryConnectionStrategy" failed to reconnect receiver on endpoint "ActiveMqJmsConnector{this=4973ffee, started=false, initialised=false, name='jmsConnectorFastaCheckWork', disposed=true, numberOfConcurrentTransactedReceivers=4, createMultipleTransactedReceivers=true, connected=false, supportedProtocols=[jms], serviceOverrides=null}"
at org.mule.providers.SimpleRetryConnectionStrategy.doConnect(SimpleRetryConnectionStrategy.java:130)
...
Caused by: org.mule.providers.ConnectException: Initialisation Failure: Could not connect to broker URL: tcp://ActiveMQServer:61616. Reason: java.net.ConnectException: Connection refused: connect
at org.mule.providers.jms.JmsConnector.doConnect(JmsConnector.java:381)
...
Caused by: javax.jms.JMSException: Could not connect to broker URL: tcp://ActiveMQServer:61616. Reason: java.net.ConnectException: Connection refused: connect
...
Caused by: java.net.ConnectException: Connection refused: connect
at java.base/java.net.PlainSocketImpl.waitForConnect(Native Method)
...

indicates that the server has lost its connection to ActiveMQ. Ensure that ActiveMQ is still running. While LabKey Server will automatically try to reestablish the connection, in some cases you may need to shut down LabKey Server, restart ActiveMQ, and then start LabKey Server again to completely restore the connectivity.
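A quick way to confirm basic connectivity to the broker is to check whether its TCP port (61616 by default) is reachable. This helper is our own diagnostic sketch, not part of LabKey; run it from both the web server and any remote pipeline server:

```python
import socket

def broker_reachable(host, port=61616, timeout=2.0):
    """Return True if a TCP connection to the ActiveMQ broker succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timeout, or DNS failure all mean unreachable.
        return False
```

A False result points at ActiveMQ being down or a firewall/DNS problem, rather than a LabKey configuration issue.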

Remote Pipeline Server Connectivity Lost

If the primary LabKey Server is having no problems and you are using a Remote Pipeline Server, it may have lost its ActiveMQ connection or encountered other errors. Check its labkey.log file for possible information.

Try restarting using the following sequence:

  1. Delete or cancel the waiting jobs through the Admin Console
  2. Shut down the remote server
  3. Shut down Tomcat
  4. Shut down ActiveMQ
  5. Restart ActiveMQ
  6. Restart Tomcat
  7. Restart the remote server
  8. Submit a new job

Resetting ActiveMQ's Storage

If the steps above do not resolve the issue, try resetting ActiveMQ's state. Follow the steps above, but between steps 4 and 5 (after shutting down ActiveMQ and before restarting it) add these steps:
  • Go to the directory where ActiveMQ is installed
  • Rename the .\data directory to .\data_backup
and continue the rest of the steps to restart the services and try submitting a job. Assuming this solves the problem, you can safely delete the data_backup directory afterwards.
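The rename step above amounts to the following, sketched here in Python for clarity (the path handling is illustrative; you can equally rename the directory by hand):

```python
from pathlib import Path

def reset_activemq_store(activemq_home):
    """Rename ActiveMQ's data directory so the broker rebuilds its state."""
    data = Path(activemq_home) / "data"
    backup = Path(activemq_home) / "data_backup"
    if backup.exists():
        raise RuntimeError("data_backup already exists; remove or rename it first")
    data.rename(backup)  # ActiveMQ recreates ./data on next startup
    return backup
```

Only run this while ActiveMQ is stopped, and keep data_backup until you have confirmed the pipeline is working again.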

Reload User Undefined

If you see an error message similar to the following:

26 Jun 2018 02:00:00,071 ERROR: The specified reload user is invalid

Consider whether any user accounts responsible for reload pipeline jobs may have been deleted instead of deactivated.

To resolve errors of this kind, locate any automated reload that may have been defined and deactivate it, then reactivate the reload with a new, valid user account.




File Transfer Module / Globus File Sharing


The File Transfer module requires significant customization by a developer and is not included in standard LabKey distributions. Developers can find the source code in the LabKey GitHub repository. Please contact LabKey to inquire about support options.

The File Transfer module provides a way to interact with large files not actually stored on LabKey Server, but on a Globus endpoint. Users can view and initiate downloads of the Globus-based files using the LabKey Server user interface. The LabKey Server interface provides basic metadata about each file, and whether the file is available for download.

Set Up Globus

This step configures Globus to allow LabKey Server to obtain Globus authentication codes, which can be traded for access tokens with which LabKey Server can initiate transfers.

  • Scopes: Request the following scope:
    urn:globus:auth:scope:transfer.api.globus.org:all
  • Redirects: You must specify a redirect URL that the application will use when making authorization and token requests. These redirect URLs are required and must match the URLs provided in the requests. This should be set to the filetransfer-tokens.view URL for your server, without any project/folder container specified. The trailing ? is required. Although Globus indicates that https is required, this is not enforced when using localhost.

  • Once registered, Globus will assign your application a Client ID (e.g., ebfake35-314f-4f51-918f-12example87).
  • You can then generate a Client Secret (e.g., r245yE6nmL_something_like_this_IRmIF=) to be used when making authorization requests.
  • When you create an endpoint with Globus, it will be assigned a unique Endpoint ID, which is used in transfer requests.
  • Retain these three values so they can be used in configuring the FileTransfer module within LabKey:
    • Client ID
    • Client Secret
    • Endpoint ID

Set Up LabKey Server

Configuration

  • Ensure that the File Transfer module is deployed on your server. If you are unsure, ask your server administrator.
  • In some folder, enable the module FileTransfer.
  • Go to (Admin) > Site > Admin Console.
  • Under Premium Features, click File Transfer.
  • Enter the Client Id and Client Secret captured above.
  • Specify the various Service URLs for the Globus API. The help text that appears when you hover over the question marks at the end of the field labels provides the URLs that can be used.
  • Specify the Endpoint Id for the transfer source and Endpoint Name (the display name) for this endpoint.
  • Specify the File Transfer Root Directory where this endpoint is mounted on the file system. Directories used by the individual web parts will be specified relative to this directory.

Create the Metadata List

  • Create the metadata list to be used as the user interface for the File Transfer web part. This list should contain a 'file name' field to hold the name of the file that is being advertised as available. Add any other fields that you want to describe these available files. Note that all of the data in this List must be maintained manually, except for the Available field, which is added and updated automatically by the File Transfer module, depending on the metadata it pulls from the mounted directory. If a file is present in the configured Endpoint directory, the Available field will show "Yes", otherwise it will show "No".
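The Available column's behavior can be summarized as: a row shows "Yes" only if a file with that name is present in the configured endpoint directory. Here is a hedged sketch of that rule (the function and field names are illustrative, not the module's API):

```python
def availability(metadata_rows, endpoint_files):
    """Map each listed file name to "Yes"/"No" based on endpoint contents."""
    present = set(endpoint_files)
    return {
        row["file name"]: ("Yes" if row["file name"] in present else "No")
        for row in metadata_rows
    }

rows = [{"file name": "scan1.raw"}, {"file name": "scan2.raw"}]
flags = availability(rows, ["scan1.raw"])  # scan2.raw is not in the directory
```

All other List columns are maintained by hand; only this flag tracks the mounted directory.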

Set Up LabKey Server User Interface

This step configures LabKey Server to capture the metadata received from the Globus configuration.

  • On the File Transfer: Customize page, enter the following fields:
    • Web Part Title: The display title for the web part.
    • Local Directory: The path relative to the File Transfer Root Directory specified in the configuration step above. That root directory is where the Globus Endpoint files are mounted for this project. This relative subdirectory will be used to determine the value for the "Available" column in the List you just created.
    • Reference List: This points to the List, and it's 'file name' field, that you just created.
    • Endpoint Directory: The directory on the Globus Endpoint containing the files.
  • Once all of the above parts are configured, the Transfer link will appear on the File Transfer menu bar. Once you select at least one file, the link will be enabled.

Set Up for Testing

When testing the file transfer module on a development server, you may need to make the following configuration changes:

  • Configure Tomcat on your development server to accept https connections.
  • Configure the base URL in LabKey site settings to start with "https://". The base URL is used to generate absolute URLs, such as the redirect URL passed to Globus.
  • Make the development server accessible to the Internet, so Globus can validate the URL.



S3 Cloud Data Storage


Premium Feature — Available in the Professional and Enterprise Editions of LabKey Server. Also available as an Add-on to the Starter Edition. Learn more or contact LabKey.

LabKey Server can integrate cloud storage for management of large data files using Amazon S3 (Simple Storage Service). Support for other storage providers will be considered in the future. For more information about this feature and possible future directions, please contact LabKey.

Cloud storage services are best suited to providing an archive for large files. Files managed by cloud storage can be exposed within LabKey as if they were in the local filesystem. You can add and remove files from the cloud via the LabKey interface and many pipeline jobs can be run directly on them. This section outlines the steps necessary to access data in an S3 bucket from the Files web part on your server.

Cloud Data Storage Overview

Cloud services let you upload and store large data files in the cloud, and LabKey Server can interface with this data, allowing users to integrate it smoothly with other data for seamless use by LabKey analysis tools. In order to use these features, you must have the cloud module installed on your LabKey Server.

Contact your Account Manager for assistance if needed.

Buckets

Cloud storage services store data in buckets, which are typically limited to a certain number per user account but can contain unlimited files. LabKey Server cloud storage uses a single bucket with directory prefixes providing a pseudo-hierarchy, so that multiple structured folders can appear as a multi-bucket storage system.

Learn more about Amazon S3 Buckets here: Working with Amazon S3 Buckets
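Since S3 has no real directories, the pseudo-hierarchy is just a grouping of flat object keys by prefix. This small sketch (illustrative only) shows how keys in one bucket can be presented as per-folder listings:

```python
def folders(keys):
    """Group flat S3 object keys by their first path segment."""
    grouped = {}
    for key in keys:
        top, _, rest = key.partition("/")
        # Keys with no "/" are treated as living at the top level.
        grouped.setdefault(top, []).append(rest or key)
    return grouped

listing = folders([
    "container16/scan1.raw",
    "container16/scan2.raw",
    "container17/results.csv",
])
```

Each top-level prefix (e.g. a per-folder "container16" directory) then behaves like its own storage area even though everything lives in one bucket.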

Encryption and Security

The Cloud module supports AWS S3 "default" AES encryption. This can be configured when the bucket is provisioned. With "default" encryption S3 transparently encrypts/decrypts the files/objects when they are written to or read from the S3 bucket.

AWS S3 also supports unique KMS (Key Management System) encryption keys that are managed by the customer within AWS. Other security controls on the S3 bucket, such as who can access those files via other AWS methods, are fully in the customer's control through AWS IAM Policies.

Regardless of how the S3 buckets are configured, LabKey has a full authentication and authorization system built in to manage who can access those files within LabKey.

Set Up Cloud Storage

To use Cloud Storage with LabKey Server, you need both the targets on AWS/S3 and configuration at the site level before you'll be able to use the storage.

  1. Create the necessary AWS identity credential.
  2. Each bucket on S3 that you plan to access must be created on AWS first.
  3. On LabKey Server, create a Cloud Account for accessing your buckets.
  4. Each bucket is then defined as a Storage Config on LabKey Server, using that account.
These Storage Configs will then be available in your projects and folders as Cloud Stores.

Use Files from Cloud Storage

Once you have completed the setup steps above, you will be able to select which Cloud Stores to use on a per-folder basis as required. The enabling and usage are covered in this topic:

For example, once correctly configured, you will be able to "surface" files from your S3 bucket in a Files web part on your LabKey Server.

Related Topics

What's Next?

If you are interested in learning more about, or contributing to, the future directions for this functionality, please contact LabKey.




AWS Identity Credentials


Premium Feature — Available in the Professional and Enterprise Editions of LabKey Server. Also available as an Add-on to the Starter Edition. Learn more or contact LabKey.

The identity and credential LabKey will use to access your S3 bucket are generated by creating an AWS Identity.

AWS Identity Credentials

On the AWS console, click "Add User", provide a user name, select Programmatic Access, create a new group, and give it AdministratorAccess. If AdministratorAccess is not possible, the detailed permissions required are listed below.

At the end of the AWS setup wizard, you will be given an "Access key id" and a "Secret access key". Enter these in the Identity and Credentials fields when you create a Cloud Account on LabKey Server.

S3 Permissions Required

The detailed permissions required for S3 access are listed below. Substitute your bucket name where you see BUCKET_NAME.

{
  "Version":"2012-10-17",
  "Statement":[
    {
      "Effect":"Allow",
      "Action":[
        "s3:GetAccountPublicAccessBlock",
        "s3:GetBucketAcl",
        "s3:GetBucketLocation",
        "s3:GetBucketPolicyStatus",
        "s3:GetBucketPublicAccessBlock",
        "s3:HeadBucket",
        "s3:ListAllMyBuckets"
      ],
      "Resource":"*"
    },
    {
      "Effect":"Allow",
      "Action":[
        "s3:GetLifecycleConfiguration",
        "s3:GetBucketTagging",
        "s3:GetInventoryConfiguration",
        "s3:GetObjectVersionTagging",
        "s3:ListBucketVersions",
        "s3:GetBucketLogging",
        "s3:ReplicateTags",
        "s3:ListBucket",
        "s3:GetAccelerateConfiguration",
        "s3:GetBucketPolicy",
        "s3:ReplicateObject",
        "s3:GetObjectVersionTorrent",
        "s3:GetObjectAcl",
        "s3:GetEncryptionConfiguration",
        "s3:GetBucketObjectLockConfiguration",
        "s3:AbortMultipartUpload",
        "s3:PutBucketTagging",
        "s3:GetBucketRequestPayment",
        "s3:GetObjectVersionAcl",
        "s3:GetObjectTagging",
        "s3:GetMetricsConfiguration",
        "s3:PutObjectTagging",
        "s3:DeleteObject",
        "s3:DeleteObjectTagging",
        "s3:GetBucketPublicAccessBlock",
        "s3:GetBucketPolicyStatus",
        "s3:ListBucketMultipartUploads",
        "s3:GetObjectRetention",
        "s3:GetBucketWebsite",
        "s3:PutObjectVersionTagging",
        "s3:PutObjectLegalHold",
        "s3:DeleteObjectVersionTagging",
        "s3:GetBucketVersioning",
        "s3:GetBucketAcl",
        "s3:GetObjectLegalHold",
        "s3:GetReplicationConfiguration",
        "s3:ListMultipartUploadParts",
        "s3:PutObject",
        "s3:GetObject",
        "s3:GetObjectTorrent",
        "s3:PutObjectRetention",
        "s3:GetBucketCORS",
        "s3:GetAnalyticsConfiguration",
        "s3:GetObjectVersionForReplication",
        "s3:GetBucketLocation",
        "s3:ReplicateDelete",
        "s3:GetObjectVersion"
      ],
      "Resource":[
        "arn:aws:s3:::BUCKET_NAME",
        "arn:aws:s3:::BUCKET_NAME/*"
      ]
    }
  ]
}

Additionally, if ACLs are defined on individual objects within a bucket, the user will need READ and READ_ACP permission to each object for read-only usage, and WRITE and WRITE_ACP for write usage.

See more information about S3 permissions in the AWS documentation.
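The BUCKET_NAME substitution above can be automated and sanity-checked before pasting the policy into AWS. This helper is our own sketch (the template here is an abbreviated stand-in for the full policy above, not a replacement for it):

```python
import json

POLICY_TEMPLATE = """{
  "Version": "2012-10-17",
  "Statement": [
    {"Effect": "Allow",
     "Action": ["s3:ListBucket", "s3:GetObject", "s3:PutObject"],
     "Resource": ["arn:aws:s3:::BUCKET_NAME", "arn:aws:s3:::BUCKET_NAME/*"]}
  ]
}"""

def render_policy(template_text, bucket_name):
    """Substitute the bucket name and confirm the result is valid JSON."""
    policy = json.loads(template_text.replace("BUCKET_NAME", bucket_name))
    if policy.get("Version") != "2012-10-17":
        raise ValueError("unexpected policy version")
    return policy
```

Running the result through json.loads catches copy/paste damage (a missing brace or comma) before AWS rejects the policy.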

Related Topics




Configure Cloud Storage


Premium Feature — Available in the Professional and Enterprise Editions of LabKey Server. Also available as an Add-on to the Starter Edition. Learn more or contact LabKey.

This topic outlines how to create and configure S3 cloud storage on LabKey Server. Each bucket on S3 that you plan to access will be defined as a Storage Config on LabKey Server, accessed through a Cloud Account. You will then be able to select which storage config to use on a per-folder basis.

Configure LabKey Server to use Cloud Storage

Create Bucket (on AWS)

Before you can use your Cloud Storage account from within LabKey Server, you must first create the bucket you intend to use and the user account must have "list" as well as "upload/delete" permissions on the bucket.

  • It is possible to have multiple cloud store services per account.
  • AWS S3 "default" AES encryption is supported and can be configured on the S3 bucket when the bucket is provisioned.
    • With "default" encryption S3 transparently encrypts/decrypts the files/objects when they are written to or read from the S3 bucket.
  • AWS S3 also supports unique KMS (Key Management System) encryption keys that are managed by the customer within AWS.

Create Cloud Account On LabKey Server

To access the bucket, you create a cloud account on your server, providing a named way to indicate the cloud credentials to use.

  • Select (Admin) > Site > Admin Console.
  • Under Premium Features, click Cloud Settings.
    • If you do not see this option, you do not have the cloud module installed.
  • Under Cloud Accounts, click Create Account.
    • Enter an Account Name. It must be unique and will represent the login information entered here.
    • Select a Provider.
    • Enter your Identity and Credential. See AWS Identity for details.
  • Click Create.

This feature uses the encrypted property store for credentials and requires an administrator to provide a master encryption key in the labkey.xml file. LabKey will refuse to store credentials if a key is not provided. For instructions, see: Installation: LabKey Configuration File.

Create Storage Config (on LabKey Server)

Next define a Storage Config, effectively a file alias pointing to a bucket available to your account. LabKey can create new subfolders in that location, or if you want to use a pre-existing S3 subdirectory within your bucket, you can specify it using the S3 Path option.

  • Click Create Storage Config on the cloud account settings page under Cloud Store Service.
    • If you navigated away, select (Admin) > Site > Admin Console. Under Premium Features, click Cloud Settings.
  • Provide a Config Name. This name must be unique and it is good practice to base it on the S3 bucket that it will access.
  • Select the Account you just created from the pulldown.
  • Provide the S3 Bucket name itself. Do not include "s3://" or other elements of the full URL in this field. Learn more about bucket naming rules here.
  • Select Enabled.
    • If you disable a storage config by unchecking this box, it will not be deleted, but you will be unable to use it from any container until enabling it again.
  • S3 Path: (Optional) You can specify a path within the S3 bucket that will be the configuration root of any LabKey folder using this configuration. This enables use of an existing folder within the S3 bucket. If no path is specified, the root is the bucket itself.
  • Directory Prefix: (Optional) Select whether to create a directory named <prefix><id> in the bucket or S3 path provided for this folder. The default prefix is "container".
    • If you check the Directory Prefix box (default), LabKey will automatically create a subdirectory in the configuration root (the bucket itself or the S3 path provided above) for each LabKey folder using this configuration. For example, a generated directory name would be "container16", where 16 is the id number of the LabKey folder. You can see the id number for a given folder/container by going to Folder > Management > Information, or by querying the core.Containers table through the UI or an API. You may also find the reporting in Admin Console > Files helpful, as it will let you navigate the container tree and see the S3 URLs including the containerX values. Note that using this option means that the subdirectory and its contents will be deleted if the LabKey folder is deleted.
    • If you do not check the box, all LabKey folders using this configuration will share the root location and LabKey will not delete the root contents when any folder is deleted.
  • SQS Queue URL: If your bucket is configured to queue notifications, provide the URL here. Note that the region (like "us-west-1" in this URL) must match the region for the S3 Bucket specified for this storage config.
  • Click Create.

Authorized administrators will be able to use the Edit link for defined storage configs for updating them.
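The way the options above compose into a storage location can be sketched as follows. This is an illustrative model only (the function name and argument handling are ours): the optional S3 Path is appended to the bucket, and with Directory Prefix enabled each LabKey folder gets its own <prefix><id> subdirectory under that root.

```python
def storage_root(bucket, s3_path=None, directory_prefix=None, container_id=None):
    """Compose the effective S3 root for a LabKey folder's files."""
    parts = [bucket]
    if s3_path:
        parts.append(s3_path.strip("/"))
    if directory_prefix and container_id is not None:
        # e.g. "container16" for the LabKey folder with id 16
        parts.append(f"{directory_prefix}{container_id}")
    return "/".join(parts)

root = storage_root("mybucket", "projects", "container", 16)
```

Without the prefix, every LabKey folder using the config shares the same root, which is why LabKey will not delete its contents when a folder is deleted.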

Configure Queue Notifications for File Watchers

If you plan to use file watchers for files uploaded to your bucket, you must first configure the Simple Queue Service within AWS. Then supply the SQS Queue URL in your Storage Config on LabKey Server.

Learn more in this topic:

Related Topics




Use Files from Cloud Storage


Premium Feature — Available in the Professional and Enterprise Editions of LabKey Server. Also available as an Add-on to the Starter Edition. Learn more or contact LabKey.

Once you have configured S3 cloud storage on your LabKey Server, you can enable and use it as described in this topic.

Enable Cloud Storage in Folders and Projects

In each folder or project where you want to access cloud data, configure the filesystem to use the appropriate cloud storage config(s) you defined. Cloud storage at the project level can be inherited by folders within that project, or folders can override a project setting as needed.

Note that if a cloud storage config is disabled at the site-level it will not be possible to enable it within a folder or project.

  • Navigate to the folder where you want to enable cloud storage and open (Admin) > Folder > Management.
    • If you want to enable cloud storage at the project level, open (Admin) > Folder > Project Settings instead.
  • Select the Files tab.
  • Under Cloud Stores, the lower section of the page, first enable the desired cloud stores using the checkboxes.
    • Note that it's possible to disable a cloud storage config at the site level. If a config is not enabled at the site level, enabling it in a folder or project will have no effect.
  • Click Save.
  • After saving the selection of cloud stores for this container, you will be able to select one in the file root section higher on this page.
  • Under File Root select Use cloud-based file storage and use the dropdown to select the desired cloud store.
    • If you select this option before enabling the cloud store, you will see an empty dropdown.

  • Existing Files: When you select a new file root for a folder, you will see the option Proposed File Root change from '<prior option>'. Select what you want to happen to any existing files in the root. Note that if you are not using directory containers, you will not be able to move files as they will not be deleted from the shared root. See Migrate Existing Files for details about file migration options.
  • Click the Save button a second time.

Configure a Files Web Part

Once you have configured the file root for your folder to point to the cloud storage, you will access it using the usual Files web part.

  • Go to the Files web part in your folder.
    • If you don't have one, select (Admin) > Page Admin Mode.
    • From the dropdown in the lower left, click <Select Web Part> and choose Files.
    • Click Add.
  • If desired, you can give the web part a name that signals it will access cloud storage:
    • To do so, choose Customize from the (triangle) menu in the Files web part header.
    • Enter the Title you would like, such as "Cloud Files".
    • Checking the box to make the Folder Tree visible may also be helpful.
  • Click Submit.

The Files web part will now display the cloud storage files as if they are part of your local filesystem, as in the case of the .fcs file shown here:

The file is actually located in cloud storage as shown here:

  • When a download request for a cloud storage file comes through LabKey Server, the handle is passed to the client so the client can download the file directly.
  • When a file is dropped into the Files web part, it will be uploaded to the cloud bucket.
  • Files uploaded to the S3 bucket independently of LabKey will also appear in the LabKey Files web part.

Run Pipeline Tasks on S3 Files

Once you have configured an S3 pipeline root, folders can be imported or reloaded from folder archives that reside in cloud storage.

  • Import folder (from either .folder.zip archive, or folder.xml file)

Use File Watchers with Cloud Files

To use File Watchers, you will need to enable SQS Queue Notifications on your AWS bucket, then include that queue in the storage config. Learn more in this topic:

Once configured, you can Reload Lists Using Data File from the cloud. Other file watcher types will be added in future releases.

Delete Files from Cloud Storage

If you have configured cloud storage in LabKey to create a new subdirectory (using Directory Prefix) for each new folder created on LabKey, the files placed within it will be associated with the LabKey folder. If you delete the LabKey folder, the associated subfolder of your S3 bucket (and all files within it) will also be deleted.

If you instead configured cloud storage to use an existing S3 folder, any files placed there will be visible from within the LabKey folder, but will NOT be deleted if the LabKey folder is deleted.

Related Topics




Cloud Storage for File Watchers


Premium Feature — Available in the Professional and Enterprise Editions of LabKey Server. Also available as an Add-on to the Starter Edition. Learn more or contact LabKey.

If you plan to use file watchers for files uploaded to your bucket, you need to supply an SQS Queue URL in your Storage Config on LabKey Server. This topic describes how to configure the bucket and queue on AWS to support this.

You will be configuring an SQS Queue that LabKey Server can 'watch' for notifications, then setting up event notifications on the bucket that will post messages to that queue when certain files are added to certain bucket locations.

Create SQS Queue on AWS

First, you must configure a Simple Queue Service within AWS to which your bucket will be able to post notifications.

If you have not already set up your bucket, follow the instructions in this topic: Configure Cloud Storage

Take note of the region (like "us-west-2") for your bucket.

  • In the Amazon SQS console of your AWS account, click Create Queue.
  • Select Standard queue. (FIFO queues are not supported.)
  • Accept the default configuration options.
  • Set the Access Policy for the queue to Advanced and provide for both of the following:
    • Allow users and IAM roles to manipulate the queue and its messages (create, read, update, delete). The example below allows full control; this is what allows the credentials supplied to LabKey Server to read and delete messages from the queue.
    • Allow S3 to send messages to the queue.
  • Create the queue.

Take note of the SQS Queue URL that you will use in the Storage Config in the next section.

Access Policy Example

Replace the items in brackets "<>" as appropriate for your deployment.

{
  "Version": "2008-10-17",
  "Id": "__default_policy_ID",
  "Statement": [
    {
      "Sid": "__owner_statement",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::<AccountNumber>:<role>"
      },
      "Action": [
        "SQS:*"
      ],
      "Resource": "arn:aws:sqs:::<QueueName>"
    },
    {
      "Sid": "allow-bucket",
      "Effect": "Allow",
      "Principal": {
        "Service": "s3.amazonaws.com"
      },
      "Action": "SQS:SendMessage",
      "Resource": "arn:aws:sqs:::<QueueName>",
      "Condition": {
        "StringEquals": {
          "aws:SourceAccount": "<AccountNumber>"
        },
        "ArnLike": {
          "aws:SourceArn": "arn:aws:s3:*:*:*"
        }
      }
    }
  ]
}
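Before pasting a policy like the one above into the AWS console, it can be helpful to fill in the placeholders and confirm the document is well-formed JSON. A minimal sketch in Python; the account number, role name, and queue name below are hypothetical values for illustration only:

```python
import json

# Hypothetical placeholder values; substitute your own.
account = "123456789012"
role_arn = f"arn:aws:iam::{account}:role/LabKeyS3Role"
queue_arn = "arn:aws:sqs:::MyFileDropQueue"

policy = {
    "Version": "2008-10-17",
    "Id": "__default_policy_ID",
    "Statement": [
        {
            # Full control for the credentials supplied to LabKey Server.
            "Sid": "__owner_statement",
            "Effect": "Allow",
            "Principal": {"AWS": role_arn},
            "Action": ["SQS:*"],
            "Resource": queue_arn,
        },
        {
            # Allow S3 (from this account only) to post to the queue.
            "Sid": "allow-bucket",
            "Effect": "Allow",
            "Principal": {"Service": "s3.amazonaws.com"},
            "Action": "SQS:SendMessage",
            "Resource": queue_arn,
            "Condition": {
                "StringEquals": {"aws:SourceAccount": account},
                "ArnLike": {"aws:SourceArn": "arn:aws:s3:*:*:*"},
            },
        },
    ],
}

# Round-trip through JSON to confirm the policy serializes cleanly.
policy_json = json.dumps(policy, indent=2)
assert json.loads(policy_json)["Statement"][1]["Action"] == "SQS:SendMessage"
```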

SQS Queue URL in Storage Config

Once you've created the queue on AWS, create or edit the LabKey Storage Config for this bucket, providing the SQS Queue URL.

Note that the region (like "us-west-2" in this URL) must match the region for the S3 Bucket specified in this storage config.
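This region check can also be done programmatically. A small sketch, assuming the standard SQS queue URL format (https://sqs.&lt;region&gt;.amazonaws.com/&lt;account&gt;/&lt;queue&gt;); the queue URL and bucket region below are hypothetical:

```python
from urllib.parse import urlparse

def sqs_queue_region(queue_url: str) -> str:
    """Extract the region from a standard SQS queue URL."""
    host = urlparse(queue_url).hostname  # e.g. sqs.us-west-2.amazonaws.com
    parts = host.split(".") if host else []
    if len(parts) < 3 or parts[0] != "sqs":
        raise ValueError(f"Unrecognized SQS queue URL: {queue_url}")
    return parts[1]

# Hypothetical queue URL and bucket region for illustration.
queue_url = "https://sqs.us-west-2.amazonaws.com/123456789012/MyFileDropQueue"
bucket_region = "us-west-2"
assert sqs_queue_region(queue_url) == bucket_region
```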

Configure S3 Event Notifications

Next, configure your bucket to send event notifications to your SQS Queue.

  • Log in to AWS and access the bucket you intend to monitor.
  • On the bucket's Properties tab, click Create Event Notification.
  • Under General configuration, name the event (e.g., MyFileDrop) and provide optional prefix and suffix filters to define which bucket contents should trigger notifications. Only objects matching both the specified prefix and suffix will trigger notifications.
    • Prefix: Typically the directory path within the bucket; this depends on how your buckets are organized. For example, if you use a "cloud_storage" main directory, a LabKey directory prefix like "lk_" for each LabKey folder, and a "watched" subdirectory there, the prefix would be something like "/cloud_storage/lk_2483/watched/". This prefix should either match, or be a subdirectory of, the Location to Watch you will set in your file watcher.
    • Suffix: A suffix for the files that should trigger notifications, such as ".list.tsv" if you want to match files like "FirstList.list.tsv" and "SecondList.list.tsv" in the location specified by the prefix.
  • Event Types:
    • Select All Object Create Events.
  • Destination:
    • Select SQS Queue, then either:
      • Choose from your SQS Queues to pick from a dropdown selector, or
      • Enter SQS queue ARN to enter it directly.
  • Click Save when finished.
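S3 applies the prefix and suffix as simple string matches against the full object key. The filtering behavior can be sketched as follows; the prefix and suffix values are hypothetical, modeled on the examples above:

```python
def triggers_notification(key: str, prefix: str, suffix: str) -> bool:
    """An object key triggers a notification only if it matches both filters."""
    return key.startswith(prefix) and key.endswith(suffix)

# Hypothetical filter values for illustration.
prefix = "cloud_storage/lk_2483/watched/"
suffix = ".list.tsv"

assert triggers_notification(prefix + "FirstList.list.tsv", prefix, suffix)
assert not triggers_notification(prefix + "notes.txt", prefix, suffix)          # wrong suffix
assert not triggers_notification("other_dir/SecondList.list.tsv", prefix, suffix)  # wrong prefix
```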

Now, when files added to your bucket's Prefix location have the Suffix you defined, an event notification is posted to the specified queue. LabKey monitors this queue via the Storage Config, so you can now enable file watchers on those locations.

Note that the files in your bucket that trigger notifications do not have to be the same files your file watcher will act upon. For example, you might want to reload a set of list files, but only trigger that file watcher when a master manifest is added. In such a scenario, the event notifications would be triggered by the manifest, and the file watcher would then act on the files matching its definition, i.e. the lists.

Troubleshooting Event Notifications

If you have multiple event notifications defined on your bucket, note that they are validated as a group. If existing notifications are misconfigured when you add a new one for your LabKey file watcher, you will not be able to save or edit notifications until you resolve the configuration of the prior ones. Notifications are still sent while in a state that cannot be validated, but keep this interaction in mind if you encounter problems defining a new one.

Create File Watcher for Cloud Files

Once cloud storage with queue notifications is configured and enabled in your folder, you can create a Files web part in the folder that "surfaces" the S3 location. For example, you can drop files into the web part from the LabKey interface and have them appear in the S3 bucket, or vice versa.

You will also be able to configure file watchers in that folder triggered by the event notifications on the bucket.

The Reload Lists Using Data File task is currently supported for this feature. Other file watcher types will support S3 cloud loading in future releases.

Related Topics