A wide range of scenarios can be addressed using transform scripts. For example:
Any scripting language that can be invoked via the command line and has the ability to read/write files is supported for transformation scripts, including:
Before you can run scripts, you must configure the necessary scripting engine on your server. If you are missing the necessary engine, or the desired engine does not have the script file extension you are using, you'll get an error message similar to one of these:
Script engine for the extension '.pl' has not been registered.
A script engine implementation was not found for the specified QC script (my_transformation_script.py). Check configurations in the Admin Console.
In order to upload transform scripts and attach them to an assay design, the user must have either the Platform Developer or Site Administrator role. Once an authorized user has added a script, it will be run any time data is imported using that design.
Users who can edit assay designs but are not Platform Developers or Site Administrators will be able to edit other aspects of the design, but will not see the transformation script options.
Transformation and validation scripts invoked during data import follow this Script Execution Sequence:
1. A user imports assay result data and supplies run and batch properties.
2. The server uses that input to create:
5. If transformed data is available in the specified output location, the server uses it for subsequent steps; otherwise, the original data is used.
6. If multiple transform scripts are specified, the server invokes the other scripts in the order in which they are defined, passing sequentially transformed output as input to the next script.
7. Field-level validators and quality-control checks that are included in the assay definition, including range and regular expression validation, are performed on the post-transformation data.
8. If no errors have occurred, the run is loaded into the database.
Each assay design can be associated with one or more validation or transform scripts which are run in the order listed in the assay design.
For a walkthrough of configuring and using a transform script that has already been developed for your assay type, follow this topic:
An example workflow for how to create an assay transform script in perl can be found in this topic:
To use a transform script in an assay design, edit the design and click Add Script next to the Transform Scripts field. Note that you must have the Platform Developer or Site Administrator role to see or use this option.
When you add a transformation script using the assay designer, the script will be uploaded to a @scripts subdirectory of the file root, parallel to where other @files are stored. This separate location helps protect scripts from being modified or removed by unauthorized users, as only Platform Developers and Site Administrators will be able to access them.
Remove scripts from the design by selecting Remove path from the menu. Note that this does not remove the file itself, just removes the path from the assay design. You can also use Copy path to obtain the path for this script in order to apply it to another assay design.
To manage the actual script files, click Manage Script Files to open the @scripts location.
Here you can select the script files themselves and delete them.
You can customize a Files web part to show the @scripts location.
When you upload a transformation script to the assay designer, it is placed in the @scripts subdirectory of the local file root. The path is determined for you and displayed in the assay designer. This location is only visible to Site Administrators and users with the Platform Developer role, making it a secure place to store script files.
If for some reason you have scripts located elsewhere on your system, or when you are creating a new design using the same transform script(s), you can specify the absolute path to the script instead of uploading it.
Use Copy path from the menu in an existing assay design's transform script section, or find the absolute path of a script elsewhere in the File Repository.
In the file path, LabKey Server accepts either backslashes (the default Windows format) or forward slashes.
Example path to script:
/labkey/labkey/files/MyProject/MyAssayFolder/@scripts/MyTransformScript.R
When working on your own developer workstation, you can put the script file wherever you like, but using the assay designer interface to place it in the @scripts location is more secure and also makes it easier to deploy to a production server. These options also make iterative development against a remote server easier, since you can use a WebDAV-enabled file editor to directly edit the same script file that the server is calling.
Within the script, you can use the built-in substitution token "${srcDirectory}", which is automatically replaced with the directory where the script file is located.
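As a minimal sketch of how this token might be used, the Python snippet below builds the path of a companion file stored next to the script. Only the "${srcDirectory}" token comes from the server; the helper name sibling_path and the file name plate_layout.tsv are hypothetical and chosen purely for illustration.

```python
import os

# The server substitutes this token with the script's directory before the
# script runs. Until then it is just a literal placeholder string.
SRC_DIRECTORY = "${srcDirectory}"


def sibling_path(filename, src_dir=SRC_DIRECTORY):
    """Return the path of a file stored alongside the transform script.

    sibling_path is a hypothetical helper, not part of LabKey Server.
    """
    return os.path.join(src_dir, filename)
```

A call such as sibling_path("plate_layout.tsv") would then point at a lookup file kept in the same directory as the script.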
The primary mechanism for communication between the LabKey Assay framework and the transform script is the Run Properties file. The ${runInfo} substitution token tells the script code where to find this file. An R script, for example, should contain a line like:
run.props = labkey.transform.readRunPropertiesFile("${runInfo}");
The run properties file contains three categories of properties:
1. Batch and run properties as defined by the user when creating an assay instance. These properties are of the format: <property name> <property value> <java data type>
for example,
gDarkStdDev 1.98223 java.lang.Double
An example Run Properties file to examine: runProperties.tsv
When the transform script is called, these properties will contain any values that the user has typed into the "Batch Properties" and "Run Properties" sections of the import form. The transform script can assign or modify these properties based on calculations or by reading them from the raw data file from the instrument. The script must then write the modified properties file to the location specified by the transformedRunPropertiesFile property.
2. Context properties of the assay such as assayName, runComments, and containerPath. These are recorded in the same format as the user-defined batch and run properties, but they cannot be overwritten by the script.
3. Paths to input and output files. These are absolute paths that the script reads from or writes to. They are in a <property name> <property value> format without property types. The paths currently used are:
C:\labkey\files\transforms\@files\scripts\TransformAndValidationFiles\AssayId_22\42\runDataFile.tsv
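Because all three categories live in the same file, a script will often begin by loading it into a lookup table. The following Python sketch shows one way this could be done. It assumes the columns are tab-separated, as the .tsv extension suggests; read_run_properties is a hypothetical helper name, not a LabKey function.

```python
import csv


def read_run_properties(path):
    """Load a run properties file into a dict keyed by property name.

    Each value is the list of remaining columns on that row, for example
    [value, java type] for a batch or run property, or [path] for a file
    path entry. Assumes tab-separated columns (an assumption, see above).
    """
    props = {}
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        for row in reader:
            if row and row[0]:  # ignore blank lines
                props[row[0]] = row[1:]
    return props
```

With the example row shown earlier, read_run_properties(path)["gDarkStdDev"] would hold the value and type columns for that property, and the path properties could be looked up by name in the same way.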
From the runProperties.tsv, the transform script developer has two choices of the file to use as input to transform:
However, even when the "runDataFile" is successfully parsed, the script could still choose to read from and act upon the raw "runDataUploadedFile" if desired for any reason. For instance, if the original file is already in TSV format, the script could use either version.
If the data file cannot be preprocessed into a TSV, then the script developer must work with the originally uploaded "runDataUploadedFile" and provide the parsing and preprocessing into a TSV format. For instance, if the data includes a header "above" the actual data table, the script would need to skip that header and read the data into a TSV.
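To illustrate that last case, here is a rough Python sketch that drops a preamble and writes the remaining table as TSV. It assumes the raw file is comma-separated and that the table's header row can be recognized by the name of its first column; the function name strip_preamble_to_tsv and the marker argument are hypothetical.

```python
import csv


def strip_preamble_to_tsv(path_in, path_out, header_marker):
    """Copy the data table from path_in to path_out as TSV.

    Rows before the one whose first cell equals header_marker are treated
    as instrument preamble and dropped. Returns the number of data rows
    written, not counting the header row.
    """
    written = 0
    in_table = False
    with open(path_in, newline="", encoding="utf-8") as src, \
            open(path_out, "w", newline="", encoding="utf-8") as dst:
        writer = csv.writer(dst, delimiter="\t", lineterminator="\n")
        for row in csv.reader(src):
            if not in_table:
                # Still in the preamble: look for the table's header row.
                if row and row[0].strip() == header_marker:
                    in_table = True
                    writer.writerow(row)
                continue
            if row:  # skip blank lines inside the table
                writer.writerow(row)
                written += 1
    return written
```

A real instrument file may need a different delimiter or a different way of spotting the header, so treat the marker test as a placeholder for whatever rule fits your data.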
A Python example that loads the original imported "raw" results file...
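# "${runInfo}" is replaced by the server with the run properties file path.
filePathRunProperties = "${runInfo}"
with open(filePathRunProperties, "r") as fileRunProperties:
    for l in fileRunProperties:
        row = l.split()
        if not row:
            continue  # skip blank lines
        if row[0] == "runDataUploadedFile":
            filePathIn = row[1]  # the original uploaded file
        if row[0] == "runDataFile":
            filePathOut = row[3]  # where the script writes its output TSV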
… and one that loads the inferred TSV file:
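# "${runInfo}" is replaced by the server with the run properties file path.
filePathRunProperties = "${runInfo}"
with open(filePathRunProperties, "r") as fileRunProperties:
    for l in fileRunProperties:
        row = l.split()
        if row and row[0] == "runDataFile":
            filePathIn = row[1]  # the TSV inferred by the server
            filePathOut = row[3]  # where the script writes its output TSV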
Note that the path of the "runDataFile" .tsv is included in the runProperties.tsv file regardless of whether the preprocessing is successful; if preprocessing failed, the file at that path will simply be missing. You can catch this scenario by saving script data for debugging. The "runDataFile" property also has two more columns, the last of which (row[3] in the examples above) is the full path to the "output" TSV file to use.
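One way to handle the missing-file case inside the script itself is to test for the file before reading it. The Python sketch below is only an illustration: choose_input_file is a hypothetical helper, and it expects the two paths already read from the run properties file.

```python
import os


def choose_input_file(run_data_file, run_data_uploaded_file):
    """Pick the file a transform script should read.

    Prefers the preprocessed TSV (runDataFile) when it actually exists on
    disk, and otherwise falls back to the original upload
    (runDataUploadedFile). Returns a (path, is_preprocessed) tuple.
    """
    if run_data_file and os.path.isfile(run_data_file):
        return run_data_file, True
    return run_data_uploaded_file, False
```

The second element of the result tells the caller whether it still needs to parse the raw upload itself.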
Information on run properties can be passed to a transform script in two ways. You can put a substitution token into your script to identify the run properties file, or you can configure your scripting engine to pass the file path as a command line argument. See Transformation Script Substitution Syntax for a list of available substitution tokens.
For example, using perl:
Option #1: Put a substitution token (${runInfo}) into your script and the server will replace it with the path to the run properties file. Here's a snippet of a perl script that uses this method:
# Open the run properties file. Run or upload set properties are not used by
# this script. We are only interested in the file paths for the run data and
# the error file.
open my $reportProps, '<', '${runInfo}' or die "Cannot open run properties file: $!";
Option #2: Configure your scripting engine definition so that the file path is passed as a command line argument:
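On the script side, the path then arrives as an ordinary argument rather than through a token. A Python equivalent might look like the sketch below. It assumes the engine is configured to pass the run properties path as the first argument after the script name; run_properties_path is a hypothetical helper.

```python
import sys


def run_properties_path(argv=None):
    """Return the run properties path passed on the command line.

    Assumes the path is the first argument after the script name.
    """
    argv = sys.argv if argv is None else argv
    if len(argv) < 2:
        raise SystemExit("usage: script.py <run properties file>")
    return argv[1]
```

Exiting with a usage message when the argument is absent makes a misconfigured engine definition easier to spot than a later file-not-found error.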
Subscribers to premium editions of LabKey Server can learn more with the example code in these topics: