This topic includes some examples of using File Watchers.
Example: Dataset Creation and Import
Suppose you want to create a set of datasets based on Excel and TSV files, and load data into those datasets. To set this up, do the following:
- In the File web part create a directory named 'watched'. (It is important that you do this before saving the File Watcher configuration.)
- Prepare your Excel/TSV files to match the expectations of your study, especially the time point style (date or visit), the ParticipantId column name, and the time column name. The file name should contain only letters, not numbers.
- Upload the file into the study's File Repository.
- Create a trigger to Import/reload study datasets using data file.
- Location to Watch: enter 'watched'.
- File Pattern: Leave blank. The default file pattern will be used: (^\D*)\.(?:tsv|txt|xls|xlsx). Note that this pattern will not match file names that include numbers.
- When the trigger is enabled, datasets will be created and loaded in your study.
Field | Value |
---|---|
Name | Load MyStudy |
Description | Imports datasets to MyStudy |
Type | Pipeline File Watcher |
Pipeline Task | Import/reload study datasets using data file. |
Location to Watch | watched |
File Pattern | |
Move to container | |
Move to subdirectory | |
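To see which file names the default pattern would accept, the check can be sketched in JavaScript. This is only an illustration run outside the server: the helper name `matchesWholeName` and the sample file names are hypothetical, and the sketch assumes the pattern has to match the whole file name.

```javascript
// Default File Watcher pattern quoted in the steps above. \D* matches only
// non-digit characters, so a digit before the extension prevents a match.
const DEFAULT_PATTERN = String.raw`(^\D*)\.(?:tsv|txt|xls|xlsx)`;

// Hypothetical helper: true when the pattern matches the entire file name.
// Whole-name matching is an assumption of this sketch.
function matchesWholeName(pattern, fileName) {
  return new RegExp("^(?:" + pattern + ")$").test(fileName);
}

// Made-up file names used only for illustration.
const candidates = ["Demographics.tsv", "LabResults.xlsx", "Visit2.tsv", "notes.pdf"];
for (const name of candidates) {
  console.log(name, matchesWholeName(DEFAULT_PATTERN, name));
}
```

In this sketch the first two names are accepted, while "Visit2.tsv" is rejected because of the digit and "notes.pdf" because of the extension.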
Example: Named Capture Group <study>
Consider a set of data with original filenames matching a format like this:
sample_2017-09-06_study20.tsv
sample_2017-09-06_study21.tsv
sample_2017-09-06_study22.tsv
An example filePattern regular expression that would capture such filenames would be:
sample_(.+)_(?<study>.+)\.tsv
Files that match the pattern are acted upon, for example by being moved and/or imported into tables on the server. Nothing happens to files that do not match the pattern.
If the regular expression contains named capturing groups, such as the "(?<study>.+)" portion in the example above, then the corresponding value (in this example "study20") can be substituted into other property expressions. For instance, a Move to container setting of:
/studies/${study}/@pipeline/import/${now:date}
would resolve into:
/studies/study20/@pipeline/import/2017-11-07 (or similar)
This substitution allows the administrator to determine the destination folder from the file name, ensuring that the data is uploaded to the correct location.
Field | Value |
---|---|
Name | Load StudyA |
Description | Moves and imports datasets to StudyA |
Type | Pipeline File Watcher |
Pipeline Task | Import/reload study datasets using data file. |
Location | . |
File Pattern | sample_(.+)_(?<study>.+)\.tsv |
Move to container | /studies/${study}/@pipeline/import/${now:date} |
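The capture and substitution described above can be sketched in JavaScript. The helper `resolveTemplate` is hypothetical and only imitates the substitution for illustration; the yyyy-MM-dd format for ${now:date} is assumed from the resolved path shown above, and the date is passed in explicitly so the result is reproducible.

```javascript
// File pattern from the example above.
const SAMPLE_PATTERN = /sample_(.+)_(?<study>.+)\.tsv/;

// Hypothetical helper: replaces ${name} tokens with named capture group
// values, and ${now:date} with the supplied date as yyyy-MM-dd (UTC).
function resolveTemplate(template, groups, now) {
  const today = now.toISOString().slice(0, 10);
  return template.replace(/\$\{([^}]+)\}/g, (token, key) => {
    if (key === "now:date") return today;
    // Tokens with no matching capture group are left untouched.
    return key in groups ? groups[key] : token;
  });
}

const sampleMatch = SAMPLE_PATTERN.exec("sample_2017-09-06_study20.tsv");
const target = resolveTemplate(
  "/studies/${study}/@pipeline/import/${now:date}",
  sampleMatch.groups,
  new Date(Date.UTC(2017, 10, 7)) // 7 November 2017
);
console.log(target); // /studies/study20/@pipeline/import/2017-11-07
```

The unnamed group `(.+)` captures the date portion of the file name ("2017-09-06") but, having no name, is not available for substitution in this sketch.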
Example: Capture the Dataset Name
This File Watcher matches .tsv/.xls files whose names begin with "StudyA_", for example "StudyA_LabResults.tsv". Files are moved, and the data imported, to the StudyA folder. The <name> capture group determines the name of the dataset, so that "StudyA_LabResults.tsv" becomes the dataset "LabResults".
Field | Value |
---|---|
Name | Load StudyA |
Description | Moves and imports datasets to StudyA |
Type | Pipeline File Watcher |
Pipeline Task | Import/reload study datasets using data file. |
Location | . |
File Pattern | StudyA_(?<name>.+)\.(?:tsv|xls) |
Move to container | StudyA |
Move to subdirectory | imported |
Example: Capture both the Folder Destination and the Dataset Name
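To distribute files like the following to different study folders, capture both the study and the dataset name from the file name, as in the configuration that follows the list: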
StudyA_Demographics.tsv
StudyB_Demographics.tsv
StudyA_LabResults.tsv
StudyB_LabResults.tsv
Field | Value |
---|---|
Name | Load datasets |
Location | watched |
File Pattern | (?<study>.+)_(?<name>.+)\.tsv |
Move to container | ${study} |
Move to subdirectory | imported |
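How the two capture groups split each file name can be sketched in JavaScript. The helper `routeFile` and its return shape are hypothetical; the sketch only shows what the groups capture, not how the server performs the move or the import.

```javascript
// File pattern from the table above, with two named capture groups.
const ROUTING_PATTERN = /(?<study>.+)_(?<name>.+)\.tsv/;

// Hypothetical helper: returns the container and dataset name that the
// capture groups yield, or null when the file name does not match.
function routeFile(fileName) {
  const m = ROUTING_PATTERN.exec(fileName);
  return m ? { container: m.groups.study, dataset: m.groups.name } : null;
}

const incoming = [
  "StudyA_Demographics.tsv",
  "StudyB_Demographics.tsv",
  "StudyA_LabResults.tsv",
  "StudyB_LabResults.tsv",
];
for (const fileName of incoming) {
  console.log(fileName, "->", routeFile(fileName));
}
```

Because both groups use a greedy `.+`, a name with more than one underscore splits at the last underscore in this sketch: the hypothetical "StudyA_Lab_Results.tsv" would give the study "StudyA_Lab" and the name "Results".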
Example: Using the Parameter Function
The Parameter Function is a JavaScript function which is executed during the move. In the example below, the username is selected programmatically:
// Use the first element of the source path as the user name, if there is one.
var userName = sourcePath.getNameCount() > 0 ? sourcePath.getName(0) : null;
var ret = {'pipeline, username': userName };
ret;
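To try the logic of the Parameter Function above outside the server, a small stand-in for `sourcePath` can be used. Everything in this sketch other than the two method names and the 'pipeline, username' key is hypothetical: `fakePath` only mimics the `getNameCount()` and `getName()` calls the function makes, and the sample path is made up.

```javascript
// Hypothetical stand-in for sourcePath: exposes only the two methods the
// Parameter Function calls, splitting a relative path on "/".
function fakePath(relativePath) {
  const parts = relativePath.split("/").filter((p) => p.length > 0);
  return {
    getNameCount: () => parts.length,
    getName: (i) => parts[i],
  };
}

// Same logic as the Parameter Function above, wrapped in a function so it
// can be called with different paths.
function userParameters(sourcePath) {
  const userName = sourcePath.getNameCount() > 0 ? sourcePath.getName(0) : null;
  return { "pipeline, username": userName };
}

console.log(userParameters(fakePath("alice/results.tsv")));
console.log(userParameters(fakePath("")));
```

With this stand-in, a file dropped under a per-user directory such as "alice/" yields the user name "alice", while a path with no elements yields null.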