This topic covers the configuration of scripting engines on LabKey Server.
Usage
Scripting engines can be used on LabKey Server in several ways:
- R, Java, Perl, or Python scripts can perform data validation or transformation during assay data upload (see: Transform Scripts).
- R scripts can provide advanced data analysis and visualizations for any type of data grid displayed on LabKey Server. For information on using R, see: R Reports. For information on configuring R beyond the instructions below, see: Install and Set Up R.
- R, Python, Perl, SAS, and other languages can be invoked as part of a data processing pipeline job. For details, see Script Pipeline: Running Scripts in Sequence.
Add an R Scripting Engine
- Select (Admin) > Site > Admin Console.
- Under Configuration, click Views and Scripting.
- Select Add > New R Engine from the drop-down menu.
- If an engine has already been added and needs to be edited, double-click the engine, or select it and then click Edit.
- Enter the configuration information in the popup dialog box. See below for field details.
- Click Submit to save your changes and add the new engine.
- Click Done when finished adding scripting engines.
R configuration fields:
- Name: Choose a name for this engine, which will appear on the list.
- Language: Choose the language of the engine. Example: "R".
- File extensions: These extensions will be associated with this scripting engine.
- Do not include the . in the extensions you list, and separate multiple extensions with commas.
- Example: For R, choose "R,r" to associate the R engine with both uppercase (.R) and lowercase (.r) extensions.
- Program Path: Specify the absolute path of the scripting engine instance on your LabKey Server, including the program itself. The R program is named "R.exe" on Windows and simply "R" on Linux and OSX machines.
- Program Command: This is the command used by LabKey Server to execute scripts created in an R view.
- Example: For R, you typically use the default command: CMD BATCH --slave. Both stdout and stderr messages are captured when using this configuration. The default command is sufficient for most cases and rarely needs to be modified.
- Another possible option is capture.output(source("%s")). This will only capture stdout messages, not stderr. You may use cat() instead of message() to write messages to stdout.
- Output File Name: If the console output is written to a file, the name should be specified here. The substitution syntax ${scriptName} will be replaced with the name (minus the extension) of the script being executed.
- If you are working with assay data, an alternative way to capture debugging information is to enable "Save Script Data" in your assay design: for details see Transform Scripts.
- Site Default: Check this box if you want this configuration to be the site default.
- Sandboxed: Check this box if you want to mark this configuration as sandboxed.
- Use pandoc and rmarkdown: Enable if you have rmarkdown and pandoc installed. If enabled, Markdown v2 will be used to render knitr R reports; if not enabled, Markdown v1 will be used. See R Reports with knitr.
- Enabled: Check this box to enable the engine.
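The File extensions and Output File Name rules above can be illustrated with a small Python sketch. It is only a model of the documented behavior, not LabKey Server's implementation: the function names are hypothetical, and the template value "${scriptName}.Rout" is an example chosen for illustration.

```python
from pathlib import PurePath


def parse_extensions(field: str) -> set:
    """Split a comma-separated File extensions value such as "R,r" (no dots)."""
    return {ext.strip() for ext in field.split(",") if ext.strip()}


def engine_handles(script: str, extensions: set) -> bool:
    """True if the script's extension, without the dot, is in the configured set."""
    return PurePath(script).suffix.lstrip(".") in extensions


def output_file_name(template: str, script: str) -> str:
    """Replace ${scriptName} with the script's name minus its extension."""
    return template.replace("${scriptName}", PurePath(script).stem)


if __name__ == "__main__":
    extensions = parse_extensions("R,r")
    print(engine_handles("analysis.R", extensions))              # True
    print(engine_handles("notes.txt", extensions))               # False
    print(output_file_name("${scriptName}.Rout", "analysis.R"))  # analysis.Rout
```

Listing both "R" and "r" matters in this model because the comparison is case-sensitive, which mirrors the advice to register both the uppercase and lowercase extensions.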
Multiple R Scripting Engine Configurations
More than one R scripting engine can be defined on a server site. For example, you might want to use different versions of R in different projects or different sets of R packages in different folders. You can also use different R engines inside the same folder, one to handle pipeline jobs and another to handle reports.
Use a unique name for each configuration you define.
You can mark one of your R engines as the site default, which will be used if you don't specify otherwise in a given context. If you do override the site default in a container, then this configuration will be used in any child containers, unless you specify otherwise.
In each folder where you will use an R scripting engine, you can either:
- Use the site default R engine, which requires no intervention on your part.
- Or use the R engine configuration used by the parent project or folder.
- Or use alternate engines for the current folder. If you choose this option, you must further specify which engine to use for pipeline jobs and which to use for report rendering.
- To select an R configuration in a folder, navigate to the folder and select (Admin) > Folder > Management.
- Click the R Config tab.
- You will see the set of Available R Configurations.
- Options:
- Use parent R Configuration: (Default) The configuration used in the parent container is shown with a radio button. This will be the site default unless it was already overridden in the parent.
- Use folder level R configuration: To use a different configuration in this container, select this radio button.
- Reports: The R engine you select here will be used to render reports in this container. All R configurations defined on the admin console will be shown here.
- Pipeline Jobs: The R engine you select here will be used to run pipeline jobs and transform scripts. All R configurations defined on the admin console will be shown here.
- Click Save.
- For example, you can use different R engines to render reports and to run pipeline jobs.
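The inheritance rules described in this section can be sketched as a lookup that walks up the container tree. This is a simplified model under stated assumptions, not LabKey Server code: the Container class, the resolve_engine function, and the engine names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Container:
    name: str
    parent: Optional["Container"] = None
    # Folder-level overrides; None models "Use parent R Configuration".
    reports_engine: Optional[str] = None
    pipeline_engine: Optional[str] = None


def resolve_engine(container: Container, usage: str, site_default: str) -> str:
    """Return the nearest override for 'reports' or 'pipeline', else the site default."""
    if usage not in ("reports", "pipeline"):
        raise ValueError(f"unknown usage: {usage}")
    node: Optional[Container] = container
    while node is not None:
        engine = node.reports_engine if usage == "reports" else node.pipeline_engine
        if engine is not None:
            return engine
        node = node.parent
    return site_default


if __name__ == "__main__":
    project = Container("Project", reports_engine="R reports", pipeline_engine="R pipeline")
    child = Container("Child", parent=project)      # inherits from the project
    elsewhere = Container("Elsewhere")              # no override anywhere above it
    print(resolve_engine(child, "reports", "R site default"))       # R reports
    print(resolve_engine(elsewhere, "pipeline", "R site default"))  # R site default
```

Here a folder without its own configuration picks up whatever its parent uses, and only falls back to the site default when no ancestor has overridden it.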
Sandbox an R Engine
When you define an R engine configuration, you have the option to select whether it is Sandboxed. Sandboxing is a software management strategy that isolates applications from critical system resources. It provides an extra layer of security to prevent harm from malware or other applications. By sandboxing an R engine configuration, you can grant the ability to edit R reports to a wider group of people.
An administrator should only mark a configuration as sandboxed when they are confident that their R configuration has been contained (using Docker or another mechanism) and does not expose the native file system directly. By checking the box, the administrator asserts that they have done the appropriate diligence to ensure the R installation is safely isolated from a malicious user. LabKey trusts this assertion and does not verify it programmatically.
The sandbox designation controls which security roles are required for users to create or edit scripts.
- If the box is checked, this engine is sandboxed. A user with either the "Trusted Analyst" or "Platform Developer" role will be able to create new R reports and/or update existing ones using this configuration.
- If the box is unchecked, the non-sandboxed engine is considered riskier by the server, so users must have the "Platform Developer" role to create and update using it.
Learn about these security roles in this topic:
Developer Roles
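The role requirement described above can be expressed as a short check. This sketch only encodes the two rules stated in this topic; the function names are hypothetical, and it does not model any other permissions the server may also require.

```python
def roles_allowed_to_edit(sandboxed: bool) -> set:
    """Roles that may create or update R reports for a given engine configuration."""
    if sandboxed:
        return {"Trusted Analyst", "Platform Developer"}
    return {"Platform Developer"}


def can_edit_r_report(user_roles, sandboxed: bool) -> bool:
    """True if the user holds at least one of the required roles."""
    return bool(set(user_roles) & roles_allowed_to_edit(sandboxed))


if __name__ == "__main__":
    print(can_edit_r_report(["Trusted Analyst"], sandboxed=True))   # True
    print(can_edit_r_report(["Trusted Analyst"], sandboxed=False))  # False
```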
Check R Packages and Versions
From R you can run the following command to get installed package and version information:
as.data.frame(installed.packages()[,c(1,3)])
Add a Perl Scripting Engine
To add a Perl scripting engine, follow the same process as for an R configuration.
- Select (Admin) > Site > Admin Console.
- Under Configuration, click Views and Scripting.
- Select Add > New Perl Engine.
- Enter the configuration information in the popup. See below for field details.
- Click Submit.
You can only have a single Perl engine configuration. After one is defined, the option to define a new one will be grayed out. You may edit the existing configuration to make changes as needed.
Perl configuration fields:
- Name: Perl Scripting Engine
- Language: Perl
- Language Version: Optional
- File Extensions: pl
- Program Path: Provide the path, including the name of the program. For example, "/usr/bin/perl", or on Windows "C:\labkey\apps\perl\perl.exe".
- Program Command: Leave this blank
- Output File Name: Leave this blank
- Enabled
Add a Python Scripting Engine
To add a Python engine:
- Select (Admin) > Site > Admin Console.
- Under Configuration, click Views and Scripting.
- Select Add > New External Engine.
- Enter the configuration information in the popup.
- Program Path: Specify the absolute path of the scripting engine instance on your LabKey Server, including the program itself. For example, on Windows: "C:\labkey\apps\python\python.exe".
- Program Command: Use this field only if you need to pass additional commands or arguments to Python (or to any other scripting engine). If left blank, the server uses the same default as when running Python from the command line, so in most cases you can leave it blank. If you do set a value, we recommend adding quotes around it, for example, "${runInfo}". This is especially important if your path to Python has spaces in it.
- Output File Name: If the console output is written to a file, the name should be specified here. The substitution syntax ${scriptName} will be replaced with the name (minus the extension) of the script being executed.
- If you are working with assay data, an alternative way to capture debugging information is to enable "Save Script Data" in your assay design: for details see Transform Scripts.
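To see why quoting the Program Command value matters, the sketch below substitutes a path containing a space for ${runInfo} and then splits the result the way a POSIX shell would. It assumes shell-style tokenization purely for illustration; how LabKey Server actually parses the command may differ, and the build_args function and the example path are hypothetical.

```python
import shlex


def build_args(command_template: str, run_info_path: str) -> list:
    """Substitute ${runInfo}, then split the command using POSIX shell rules."""
    return shlex.split(command_template.replace("${runInfo}", run_info_path))


if __name__ == "__main__":
    path = "/data/assay runs/runProperties.tsv"  # hypothetical path with a space
    print(build_args("${runInfo}", path))    # ['/data/assay', 'runs/runProperties.tsv']
    print(build_args('"${runInfo}"', path))  # ['/data/assay runs/runProperties.tsv']
```

Without quotes the single path is split into two arguments, so the script would receive a file name that does not exist.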
Check Python Packages and Versions
The command for listing installed Python packages and their versions varies somewhat by system, but it is generally a pip command such as pip freeze.
Your operating system may bundle a version of pip, or you may have to install it via a system package manager. If you access Python via "python3", you may run pip via "pip3". Some systems have multiple versions of pip installed, so make sure you use the version of pip associated with the Python binary that LabKey Server uses.
Running the correct version of pip returns a list of installed libraries with their versions. The output will look something like this:
black==24.3.0
certifi==2023.11.17
charset-normalizer==3.3.2
click==8.1.7
coverage==7.4.4
docutils==0.20.1
et-xmlfile==1.1.0
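If you are unsure which pip belongs to the Python binary that LabKey Server uses, one alternative is to ask that interpreter directly. The sketch below uses the standard library's importlib.metadata (Python 3.8 or later) to print packages in the same name==version format. The function name is hypothetical, and the listing may not match pip's output exactly.

```python
from importlib import metadata


def installed_packages() -> list:
    """Return sorted "name==version" strings for the interpreter running this script."""
    lines = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip distributions whose metadata has no name
            lines.add(f"{name}=={dist.version}")
    return sorted(lines, key=str.lower)


if __name__ == "__main__":
    print("\n".join(installed_packages()))
```

Save this to a file and run it with the same executable that is configured in Program Path, so that the list reflects the environment your scripts actually run in.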