This topic includes some examples of using File Watchers.
Example: Dataset Creation and Import
Suppose you want to create a set of datasets based on Excel and TSV files, and load data into those datasets. To set this up, do the following:
- In the File web part, create a directory named 'watched'. (It is important that you do this before saving the File Watcher configuration.)
- Prepare your Excel/TSV files to match the expectations of your study, especially the timepoint style (date or visit), the ParticipantId column name, and the time column name. The file name should not include any numbers, only letters.
- Upload the file into the study's File Repository.
- Create a trigger to Import/reload study datasets using data file, with these settings:
  - Location to Watch: Enter 'watched'.
  - File Pattern: Leave blank. The default file pattern will be used, which is (^\D*)\.(?:tsv|txt|xls|xlsx). Note that this file pattern will not match file names that include numbers.
- When the trigger is enabled, datasets will be created and loaded in your study.
Field | Value |
---|---|
Name | Load MyStudy |
Description | Imports datasets to MyStudy |
Type | Pipeline File Watcher |
Pipeline Task | Import/reload study datasets using data file. |
Location to Watch | watched |
File Pattern | |
Move to container | |
Move to subdirectory | |
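To check in advance which file names the default pattern would pick up, the pattern can be tried outside the server. The following is a minimal JavaScript sketch, not LabKey code: the helper name `matchesDefaultPattern` is hypothetical, and it assumes the pattern has to match the entire file name.

```javascript
// Default File Watcher pattern quoted in the steps above.
const DEFAULT_PATTERN = String.raw`(^\D*)\.(?:tsv|txt|xls|xlsx)`;

// Hypothetical helper: wraps the pattern in ^(?:...)$ on the ASSUMPTION
// that the whole file name has to match.
function matchesDefaultPattern(fileName) {
  const fullMatch = new RegExp('^(?:' + DEFAULT_PATTERN + ')$');
  return fullMatch.test(fileName);
}

for (const name of ['Demographics.xlsx', 'LabResults.tsv', 'Lab2017.tsv', 'notes.csv']) {
  console.log(name, matchesDefaultPattern(name));
}
```

Under that assumption, the two names made up only of letters match, while 'Lab2017.tsv' (contains digits) and 'notes.csv' (extension not in the list) do not.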
Example: Name Capture Group for Substitution
You can use a named capture group to turn part of the file name into a token you can use for substitutions, such as the name of a target dataset or container. In this example, we use "study", but you could equally use "folder" if the folder you wanted was not a study. Other substitution tokens can be created with any name you choose, following the same method.
Consider a set of data with original filenames matching a format like this:
sample_2017-09-06_study20.tsv
sample_2017-09-06_study21.tsv
sample_2017-09-06_study22.tsv
An example filePattern regular expression that would match such filenames and "capture" the "study20", "study21", and "study22" portions would be:
sample_(.+)_(?<study>.+)\.tsv
Files that match the pattern are acted upon, such as being moved and/or imported to tables in the server. Nothing happens to files that do not match the pattern.
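The capture can be tried out locally before configuring the trigger. This is a minimal JavaScript sketch rather than anything the server runs: `captureStudy` is a hypothetical helper, and the `^` and `$` anchors are added on the assumption that the pattern is applied to the whole file name.

```javascript
// File pattern from the example above, with the named group "study".
// ASSUMPTION: the pattern is applied to the whole file name, hence ^ and $.
const filePattern = /^sample_(.+)_(?<study>.+)\.tsv$/;

// Hypothetical helper: returns the captured "study" value, or null when
// the file name does not match.
function captureStudy(fileName) {
  const match = filePattern.exec(fileName);
  return match ? match.groups.study : null;
}

console.log(captureStudy('sample_2017-09-06_study20.tsv')); // study20
console.log(captureStudy('sample_2017-09-06_study21.tsv')); // study21
console.log(captureStudy('readme.txt'));                    // null
```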
If the regular expression contains named capture groups, such as the "(?<study>.+)" portion in the example above, then the corresponding values (in this example "study20", "study21", and "study22") can be substituted into other property expressions. For instance, if these strings corresponded to the names of study subfolders under a "/Studies" project, then a Move to container setting of:
/Studies/${study}/@pipeline/import/${now:date}
would resolve into:
/Studies/study20/@pipeline/import/2017-11-07 (or similar)
This substitution allows the administrator to determine the destination folder based on the file name, ensuring that the data is uploaded to the correct location.
Field | Value |
---|---|
Name | Load Samples into "study" folder |
Description | Moves and imports to the folder named |
Location | . |
File Pattern | sample_(.+)_(?<study>.+)\.tsv |
Move to container | /Studies/${study}/@pipeline/import/${now:date} |
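To make the substitution step concrete, here is a rough JavaScript sketch of how a Move to container setting could be resolved once the "study" value has been captured. It is an illustration only, not the server's implementation: `resolveSetting` is a hypothetical helper, and the yyyy-MM-dd format for `${now:date}` is assumed from the resolved path shown earlier.

```javascript
// Hypothetical helper: replaces ${token} placeholders in a setting string.
// "tokens" holds values captured from the file name; ${now:date} is filled
// from "now", formatted as yyyy-MM-dd (ASSUMED from the example output).
function resolveSetting(template, tokens, now) {
  return template.replace(/\$\{([^}]+)\}/g, (placeholder, key) => {
    if (key === 'now:date') {
      return now.toISOString().slice(0, 10); // UTC calendar date
    }
    // Leave unknown placeholders untouched instead of guessing.
    return Object.prototype.hasOwnProperty.call(tokens, key) ? tokens[key] : placeholder;
  });
}

const setting = '/Studies/${study}/@pipeline/import/${now:date}';
const resolved = resolveSetting(setting, { study: 'study20' }, new Date(Date.UTC(2017, 10, 7)));
console.log(resolved); // /Studies/study20/@pipeline/import/2017-11-07
```

The date is passed in as an argument so the result is reproducible; a real run would use the current date.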
Example: Capture the Dataset Name
This File Watcher matches .tsv and .xls files whose names begin with "StudyA_", for example "StudyA_LabResults.tsv". Files are moved, and the data imported, to the StudyA folder. The <name> capture group determines the name of the dataset, so that "StudyA_LabResults.tsv" becomes the dataset "LabResults".
Field | Value |
---|---|
Name | Load StudyA |
Description | Moves and imports datasets to StudyA |
Pipeline Task | Import/reload study datasets using data file. |
Location | . |
File Pattern | StudyA_(?<name>.+)\.(?:tsv|xls) |
Move to container | StudyA |
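The mapping from file name to dataset name can be previewed with a short JavaScript sketch. `datasetNameFor` is a hypothetical helper, and the anchors reflect the assumption that the pattern must match the whole file name.

```javascript
// File pattern from the table above. ASSUMPTION: it must match the whole
// file name, hence the added ^ and $ anchors.
const studyAPattern = /^StudyA_(?<name>.+)\.(?:tsv|xls)$/;

// Hypothetical helper: the dataset name a file would map to, or null when
// the file name does not match the pattern.
function datasetNameFor(fileName) {
  const match = studyAPattern.exec(fileName);
  return match ? match.groups.name : null;
}

console.log(datasetNameFor('StudyA_LabResults.tsv'));   // LabResults
console.log(datasetNameFor('StudyA_Demographics.xls')); // Demographics
console.log(datasetNameFor('StudyB_LabResults.tsv'));   // null (wrong prefix)
```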
Example: Capture both the Folder Destination and the Dataset Name
To distribute files like the following to different study folders, capture both the folder name and the dataset name from the file name:
StudyA_Demographics.tsv
StudyB_Demographics.tsv
StudyA_LabResults.tsv
StudyB_LabResults.tsv
Field | Value |
---|---|
Name | Load datasets to named study folders |
Location | watched |
Pipeline Task | Import/reload study datasets using data file. |
File Pattern | (?<folder>.+)_(?<name>.+)\.tsv |
Move to container | ${folder} |
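As a sanity check on how the four files above would be split, the pattern can be run over them in a small JavaScript sketch. `planRouting` is a hypothetical helper that only groups names; it does not move or import anything, and the anchors assume a whole-name match.

```javascript
// File pattern from the table above, anchored on the ASSUMPTION that the
// whole file name must match.
const routingPattern = /^(?<folder>.+)_(?<name>.+)\.tsv$/;

// Hypothetical helper: groups dataset names by destination folder.
function planRouting(fileNames) {
  const plan = {};
  for (const fileName of fileNames) {
    const match = routingPattern.exec(fileName);
    if (!match) continue; // non-matching files are skipped
    const { folder, name } = match.groups;
    (plan[folder] = plan[folder] || []).push(name);
  }
  return plan;
}

const plan = planRouting([
  'StudyA_Demographics.tsv',
  'StudyB_Demographics.tsv',
  'StudyA_LabResults.tsv',
  'StudyB_LabResults.tsv',
]);
console.log(plan);
```

Because the first group is greedy, a name with an extra underscore, such as 'StudyA_Lab_Results.tsv', is split at the last underscore, giving folder 'StudyA_Lab' and name 'Results'. Keeping underscores out of dataset names avoids that surprise.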
Example: Using the Parameter Function
The Parameter Function is a JavaScript function that is executed during the move. In the example below, the username is selected programmatically:
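// Use the first element of the source path as the user name, or null if the path is empty.
var userName = sourcePath.getNameCount() > 0 ? sourcePath.getName(0) : null;
// Build the parameter map and leave it as the script's final expression.
var ret = {'pipeline, username': userName };
ret;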
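To see what this function produces, it can be exercised outside the server with a stand-in for sourcePath. The sketch below is only an approximation: `fakeSourcePath` is a hypothetical mock that mimics the two methods the function calls, the sample path is made up, and the real object supplied by the File Watcher may behave differently.

```javascript
// Hypothetical stand-in for the sourcePath object, which the server supplies
// in a real File Watcher. It mimics only the two methods the function uses.
function fakeSourcePath(relativePath) {
  const parts = relativePath.split('/').filter((part) => part.length > 0);
  return {
    getNameCount: () => parts.length,
    getName: (index) => parts[index],
  };
}

// Same logic as the Parameter Function above, wrapped in a function so it
// can be called with different paths.
function parameterFunction(sourcePath) {
  var userName = sourcePath.getNameCount() > 0 ? sourcePath.getName(0) : null;
  return { 'pipeline, username': userName };
}

console.log(parameterFunction(fakeSourcePath('jsmith/results/run1.tsv')));
console.log(parameterFunction(fakeSourcePath('')));
```

With the made-up path, the first path element ('jsmith') becomes the value of the 'pipeline, username' parameter; with an empty path the value is null.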