File Watcher Examples

2024-05-09

Premium Feature — Available with all Premium Editions of LabKey Server. Learn more or contact LabKey.

This topic includes some examples of using File Watchers.

Example: Dataset Creation and Import
Example: Named Capture Group <study>
Example: Capture the Dataset Name
Example: Capture both the Folder Destination and the Dataset Name
Example: Using the Parameter Function

Example: Dataset Creation and Import

Suppose you want to create a set of datasets based on Excel and TSV files, and load data into those datasets. To set this up, do the following:

In the File web part create a directory named 'watched'. (It is important that you do this before saving the File Watcher configuration.)
Prepare your Excel/TSV files to match the expectations of your study, especially the timepoint style (date or visit), the ParticipantId column name, and the time column name. File names should contain only letters, not numbers.
Upload the file into the study's File Repository.
Create a trigger to Import/reload study datasets using data file.
Location to Watch: enter 'watched'.
File Pattern: Leave blank. The default file pattern will be used, which is (^\D*)\.(?:tsv|txt|xls|xlsx). Note that this pattern does not match file names that include numbers.
When the trigger is enabled, datasets will be created and loaded in your study.

Field	Value
Name	Load MyStudy
Description	Imports datasets to MyStudy
Type	Pipeline File Watcher
Pipeline Task	Import/reload study datasets using data file.
Location to Watch	watched
File Pattern
Move to container
Move to subdirectory
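
To see which file names the default pattern accepts, the check can be sketched in JavaScript. This is an illustration only, not how the server is implemented: it assumes the whole file name must match the pattern, and the sample file names are made up.

```javascript
// Default File Watcher pattern quoted above: (^\D*)\.(?:tsv|txt|xls|xlsx)
// Assumption: the entire file name must match, so the pattern is
// wrapped in ^(?: ... )$ here.
const defaultPattern = /^(?:(^\D*)\.(?:tsv|txt|xls|xlsx))$/;

function matchesDefaultPattern(fileName) {
  return defaultPattern.test(fileName);
}

// Hypothetical file names, for illustration only.
const candidates = ['Demographics.tsv', 'LabResults.xlsx', 'Visit2.tsv', 'notes.csv'];

for (const name of candidates) {
  console.log(name, matchesDefaultPattern(name));
}
```

Under these assumptions, 'Visit2.tsv' is rejected because \D excludes digits, and 'notes.csv' is rejected because .csv is not among the listed extensions.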

Example: Named Capture Group <study>

Consider a set of data with original filenames matching a format like this:

sample_2017-09-06_study20.tsv
sample_2017-09-06_study21.tsv
sample_2017-09-06_study22.tsv

An example filePattern regular expression that would capture such filenames would be:

sample_(.+)_(?<study>.+)\.tsv

Files that match the pattern are acted upon: for example, they are moved and/or imported into tables on the server. Nothing happens to files that do not match the pattern.

If the regular expression contains named capturing groups, such as the "(?<study>.+)" portion in the example above, then the corresponding value (in this example "study20") can be substituted into other property expressions. For instance, a Move to container setting of:

/studies/${study}/@pipeline/import/${now:date}

would resolve into:

/studies/study20/@pipeline/import/2017-11-07 (or similar)

This substitution allows the administrator to determine the destination folder from the file name, ensuring that the data is uploaded to the correct location.

Field	Value
Name	Load StudyA
Description	Moves and imports datasets to StudyA
Type	Pipeline File Watcher
Pipeline Task	Import/reload study datasets using data file.
Location	.
File Pattern	sample_(.+)_(?<study>.+)\.tsv
Move to container	/studies/${study}/@pipeline/import/${now:date}
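
The substitution described above can be imitated outside the server with a short JavaScript sketch. The helper resolveContainer is hypothetical, the pattern is anchored on the assumption that the whole file name must match, and the yyyy-MM-dd format for ${now:date} is inferred from the resolved example path rather than from any documented format.

```javascript
// File pattern and "Move to container" expression from the example above.
const filePattern = /^sample_(.+)_(?<study>.+)\.tsv$/;
// Single quotes keep ${...} as literal text rather than a JS template.
const moveToContainer = '/studies/${study}/@pipeline/import/${now:date}';

// Hypothetical helper: returns the resolved path, or null if there is no match.
function resolveContainer(fileName, now) {
  const match = filePattern.exec(fileName);
  if (match === null) {
    return null; // Files that do not match are left alone.
  }
  return moveToContainer.replace(/\$\{([^}]+)\}/g, (whole, key) => {
    if (key === 'now:date') {
      // Assumption: date rendered as yyyy-MM-dd (UTC in this sketch).
      return now.toISOString().slice(0, 10);
    }
    // Substitute named capture groups; leave unknown names untouched.
    return key in match.groups ? match.groups[key] : whole;
  });
}

const when = new Date(Date.UTC(2017, 10, 7)); // 7 November 2017
console.log(resolveContainer('sample_2017-09-06_study20.tsv', when));
```

With the fixed date used here, the sketch prints /studies/study20/@pipeline/import/2017-11-07, matching the resolved path shown above.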

Example: Capture the Dataset Name

This File Watcher matches .tsv and .xls files whose names begin with "StudyA_", for example "StudyA_LabResults.tsv". Matching files are moved to the StudyA folder and their data is imported there. The <name> capture group determines the name of the dataset, so that "StudyA_LabResults.tsv" becomes the dataset "LabResults".

Field	Value
Name	Load StudyA
Description	Moves and imports datasets to StudyA
Type	Pipeline File Watcher
Pipeline Task	Import/reload study datasets using data file.
Location	.
File Pattern	StudyA_(?<name>.+)\.(?:tsv|xls)
Move to container	StudyA
Move to subdirectory	imported

Example: Capture both the Folder Destination and the Dataset Name

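A single File Watcher can distribute files like the following to different study folders, using the <study> capture group for the destination folder and the <name> capture group for the dataset name: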

StudyA_Demographics.tsv
StudyB_Demographics.tsv
StudyA_LabResults.tsv
StudyB_LabResults.tsv

Field	Value
Name	Load datasets
Location	watched
File Pattern	(?<study>.+)_(?<name>.+)\.tsv
Move to container	${study}
Move to subdirectory	imported
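
As a rough check of how the two capture groups split each file name, the pattern can be run in JavaScript. The helper routeFile is hypothetical, and the pattern is anchored on the assumption that the whole file name must match.

```javascript
// File pattern from the configuration above, anchored to the whole name.
const routingPattern = /^(?<study>.+)_(?<name>.+)\.tsv$/;

const incomingFiles = [
  'StudyA_Demographics.tsv',
  'StudyB_Demographics.tsv',
  'StudyA_LabResults.tsv',
  'StudyB_LabResults.tsv',
];

// Hypothetical helper: maps a file name to its container and dataset name.
function routeFile(fileName) {
  const match = routingPattern.exec(fileName);
  if (match === null) {
    return null; // Not a file this watcher would act on.
  }
  return { container: match.groups.study, dataset: match.groups.name };
}

for (const file of incomingFiles) {
  console.log(file, routeFile(file));
}
```

Because .+ is greedy, a file name containing more than one underscore is split at the last underscore, so everything before it becomes the <study> value.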

Example: Using the Parameter Function

The Parameter Function is a JavaScript function which is executed during the move. In the example below, the username is selected programmatically:

// Use the first element of the source path as the user name, if present.
var userName = sourcePath.getNameCount() > 0 ? sourcePath.getName(0) : null;
var ret = {'pipeline, username': userName };
ret;
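
To experiment with this logic outside the server, it can be wrapped in a function and given a stand-in for sourcePath. In the sketch below, mockPath and parameterFunction are hypothetical names; the mock exposes only the two methods the example calls, and it assumes a relative path whose first element is a per-user directory, which the example above does not state.

```javascript
// Hypothetical stand-in for the server-supplied sourcePath object,
// exposing only getNameCount() and getName(index).
function mockPath(pathString) {
  const parts = pathString.split('/').filter((part) => part.length > 0);
  return {
    getNameCount: () => parts.length,
    getName: (index) => parts[index],
  };
}

// Same logic as the Parameter Function above, wrapped for local testing.
function parameterFunction(sourcePath) {
  var userName = sourcePath.getNameCount() > 0 ? sourcePath.getName(0) : null;
  return { 'pipeline, username': userName };
}

// Assumed layout: <user directory>/<file name>.
console.log(parameterFunction(mockPath('alice/results.tsv')));
// An empty path has no elements, so the user name falls back to null.
console.log(parameterFunction(mockPath('')));
```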