Table of Contents

2022-09-24
     R Reports
       R Report Builder
       Saved R Reports
       R Reports: Access LabKey Data
       Multi-Panel R Plots
       Lattice Plots
       Participant Charts in R
       R Reports with knitr
       Input/Output Substitutions Reference
       Tutorial: Query LabKey Server from RStudio
       FAQs for LabKey R Reports

R Reports


You can leverage the full power of the R statistical programming environment to analyze and visualize datasets on LabKey Server. The results of R scripts can be displayed in LabKey reports that reflect live data updated every time the script is run. Reports may contain text, tables, or charts created using common image formats such as jpeg, png and gif. In addition, the Rlabkey package can be used to insert, update and/or delete data stored on a LabKey Server using R, provided you have sufficient permissions to do so.

An administrator must install and configure R on LabKey Server and grant access to users to create and run R scripts on live datasets. Loading of additional packages may also be necessary, as described in the installation topic. Configuration of multiple R engines on a server is possible, but within any folder only a single R engine configuration can be used.





R Report Builder


This topic describes how to build reports in the R statistical programming environment to analyze and visualize datasets on LabKey Server. The results of R scripts can be displayed in LabKey reports that reflect live data updated every time the script is run.
Permissions: Creating R Reports requires that the user have both the "Editor" role (or higher) and developer access (one of the roles "Platform Developer" or "Trusted Analyst") in the container. Learn more here: Developer Roles.

Create an R Report from a Data Grid

R reports are ordinarily associated with individual data grids. Choose the dataset of interest and further filter the grid as needed. Only the portion of the dataset visible within this data grid becomes part of the analyzed dataset.

To use the sample dataset we describe in this tutorial, please complete Tutorial: Set Up a New Study if you have not already done so. Alternatively, you may simply add the PhysicalExam.xls demo dataset to an existing study for completing the tutorial. You may also work with your own dataset, in which case steps and screencaps will differ.

  • View the "Physical Exam" dataset in a LabKey study.
  • If you want to filter the dataset and thus select a subset or rearrangement of fields, select or create a custom grid view.
  • Select (Charts/Reports) > Create R Report.

If you do not see the "Create R Report" menu, check to see that R is installed and configured on your LabKey Server. You also need to have the correct permissions to create R Reports. See Configure Scripting Engines for more information.

Create an R Report Independent of any Data Grid

R reports do not necessarily need to be associated with individual data grids. You can also create an R report that is independent of any grid:

  • Select (Admin) > Manage Views.
  • Select Add Report > R Report.

R reports associated with a grid automatically load the grid data into the object "labkey.data". R reports created independently of grids do not have access to labkey.data objects. R reports that pull data from additional tables (other than the associated grid) must use the Rlabkey API to access the other table(s). For details on using Rlabkey, see Rlabkey Package. By default, R reports not associated with a grid are listed under the Uncategorized heading in the list on the Manage Views page.

Review the R Report Builder

The R report builder opens on the Source tab. Enter the R script for execution or editing into the Script Source box. Notice the options available below the source entry panel, described below.

Report Tab

When you select the Report tab, you'll see the resulting graphics and console output for your R report. If the pipeline option is not selected, the script will be run in batch mode on the server.

Data Tab

Select the Data tab to see the data on which your R report is based. This can be a helpful resource as you write or refine your script.

Source Tab

When your script is complete and the report is satisfactory, return to the Source tab, scroll down, and click Save to save both the script and the report you generated.

A saved report will look similar to the results in the design view tab, minus the help text. Reports are saved on the LabKey Server, not on your local file system. They can be accessed through the Reports drop-down menu on the grid view of your dataset, or directly from the Data Views web part.

The script used to create a saved report becomes available to source() in future scripts. Saved scripts are listed under the “Shared Scripts” section of the LabKey R report builder.

Help Tab

This Syntax Reference list provides a quick summary of the substitution parameters for LabKey R. See Input/Output Substitutions Reference for further details.

Additional Options

On the Source Tab you can expand additional option sections. Not all options are available to all users, based on permission roles granted.

Options

  • Make this report available to all users: Enables other users to see your R report and source() its associated script if they have sufficient permissions. Only those with read privileges to the dataset can see your new report based on it.
  • Show source tab to all users: This option is available if the report itself is shared.
  • Make this report available in child folders: Make your report available in data grids in child folders where the schema and table are the same as this data grid.
  • Run this report in the background as a pipeline job: Execute your script asynchronously using LabKey’s Pipeline module. If you have a big job, running it on a background thread will allow you to continue interacting with your server during execution.
If you choose the asynchronous option, you can see the status of your R report in the pipeline. Once you save your R report, you will be returned to the original data grid. From the Reports drop-down menu, select the report you just saved. This will bring up a page that shows the status of all pending pipeline jobs. Once your report finishes processing, you can click on “COMPLETE” next to your job. On the next page you’ll see "Job Status." Click on Data to see your report.

Note that reports are always generated from live data by re-running their associated scripts. This makes it particularly important to run computationally intensive scripts as pipeline jobs when their associated reports are regenerated often.

Knitr Options

  • Select None, HTML, or Markdown processing of HTML source
  • For Markdown, you can also opt to Use advanced rmarkdown output_options.
    • Check the box to provide customized output_options to be used.
    • If unchecked, rmarkdown will use the default output format:
      html_document(keep_md=TRUE, self_contained=FALSE, fig_caption=TRUE, theme=NULL, css=NULL, smart=TRUE, highlight='default')
  • Add a semicolon-delimited list of JavaScript, CSS, or library dependencies if needed.
Report Thumbnail
  • Choose to auto-generate a default thumbnail if desired. You can later edit the thumbnail or attach a custom image. See Manage Views.
Shared Scripts
  • Once you save an R report, its associated script becomes available to execute using source("<Script Name>.R") in future scripts.
  • Check the box next to the appropriate script to make it available for execution in this script.
Study Options
  • Participant Chart: A participant chart shows measures for only one participant at a time. Select the participant chart checkbox if you would like this chart to be available for review participant-by-participant.
  • Automatically cache this report for faster reloading: Check to enable.
Click Save to save settings, or Save As to save without disturbing the original saved report.
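
As a rough illustration of the mechanism behind the Shared Scripts option, the sketch below uses base R's source() on a temporary file outside of LabKey Server. The file name helpers.R and the function celsius_to_fahrenheit are hypothetical; on LabKey Server you would instead source a saved report's script by its name, after checking its box under Shared Scripts.

```r
# Hypothetical helper script, written to a temporary file for illustration only
script_path <- file.path(tempdir(), "helpers.R");
writeLines("celsius_to_fahrenheit <- function(celsius) celsius * 9 / 5 + 32;", script_path);

# source() evaluates the file, making its functions available to this script
source(script_path);
body_temp_f <- celsius_to_fahrenheit(37);
body_temp_f;
```

Keeping reusable functions in one saved script and sourcing it from several reports avoids copying the same code into each report.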

Example

Regardless of where you have accessed the R report builder, you can create a first R report which is data independent. This sample was adapted from the R help files.

  • Paste the following into the Source tab of the R report builder.
options(echo=TRUE);
# Execute 100 Bernoulli trials;
coin_flip_results = sample(c(0,1), 100, replace = TRUE);
coin_flip_results;
mean(coin_flip_results);
  • Click the Report tab to run the source and see your results, in this case the coin flip outcomes.
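
Because sample() draws new random values on every run, the report above changes each time it is viewed. The following variation is a sketch, not part of the original sample: it fixes the random seed (the value 42 is arbitrary) and summarizes the outcomes with table() rather than listing all 100 of them.

```r
options(echo=TRUE);
# Fix the random seed so every run of the report shows the same flips
set.seed(42);
coin_flip_results <- sample(c(0, 1), 100, replace = TRUE);
# Count the 0 and 1 outcomes instead of listing all 100 results
flip_counts <- table(coin_flip_results);
flip_counts;
mean(coin_flip_results);
```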

Add or Suppress Console Output

The options covered below can be included directly in your R report. There are also options related to console output in the scripting configuration for your R engine.

Echo to Console

By default, most R commands do not generate output to the console as part of your script. To enable output to console, use the following line at the start of your scripts:

options(echo=TRUE);

Note that when the results of functions are assigned, they are also not printed to the console. To see the output of a function, assign the output to a variable, then just call the variable. For further details, please see the FAQs for LabKey R Reports.
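
The difference between assigning a result and printing it can be seen in a short sketch. The variable names here are illustrative only.

```r
options(echo=TRUE);
# Assignment alone prints nothing to the console
trial_mean <- mean(c(0, 1, 1, 0));
# Referencing the variable on its own line prints its value
trial_mean;
# Wrapping an assignment in parentheses also prints the assigned value
(trial_count <- length(c(0, 1, 1, 0)));
```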

Suppress Console Output

To suppress output to the console, hiding it from users viewing the script, first remove the echo statement shown above. You can also include sink to redirect any outputs to 'nowhere' for all or part of your script.

To suppress output, on Linux/Mac/Unix, use:

sink("/dev/null")

On Windows use:

sink("NUL")

When you want to restart output to the console within the script, use sink again with no argument:

sink()
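
If the same script may run on servers with different operating systems, one possible alternative is base R's nullfile() function, which returns the path of the null device for the current platform. This sketch assumes the R engine is version 3.6.0 or later, where nullfile() was introduced; on older versions, use the literal paths shown above.

```r
# nullfile() returns the platform's null device path,
# for example "/dev/null" on Unix-alikes (assumes R 3.6.0 or later)
sink(nullfile());
print("This line is hidden from the console");
# Calling sink() with no argument restores normal console output
sink();
print("This line is visible again");
```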





Saved R Reports


Saved R reports may be accessed from the source data grid or from the Data Views web part. This topic describes how to manage saved R reports and how they can be shared with other users (who already have access to the underlying data).

Performance Note

Once saved, reports are generated by re-running their associated scripts on live data. This ensures users always have the most current views, but it also requires computational resources each time the view is opened. If your script is computationally intensive, you can set it to run in the background so that it does not overwhelm your server when selected for viewing. Learn more in this topic: R Report Builder.

Edit an R Report Script

Open your saved R report by clicking the name in the data views web part or by selecting it from the (Charts and Reports) menu above the data grid on which it is based. This opens the R report builder interface on the Data tab. Select the Source tab to edit the script and manage sharing and other options. Click Save when finished.

Share an R Report

Saved R Reports can be kept private to the author, or shared with other users, either with all users of the folder, or individually with specific users. Under Options in the R report builder, use the Make this report available to all users checkbox to control how the report is shared.

  • If the box is checked, the report will be available to any users with "Read" access (or higher) in the folder. This access level is called "public" though that does not mean shared with the general public (unless they otherwise have "Read" access).
  • If the box is unchecked, the report is "private" to the creator, but can still be explicitly shared with other individual users who have access to the folder.
  • An otherwise "private" report that has been shared with individual users or groups has the access level "custom".
When sharing a report, you are indicating that you trust the recipient(s), and your recipients confirm that they trust you when they accept it. Sharing of R reports is audited and can be tracked in the "Study events" audit log.

Note that if a report is "public", i.e., was made available to all users, you can still use this mechanism to email a copy of it to a trusted individual, but that will not change the access level of the report overall.

  • Open an R Report, from the Data Views web part, and click (Share Report).
  • Enter the Recipients email addresses, one per line.
  • The default Message Subject and Message Body are shown. Both can be customized as needed.
  • The Message Link is shown; you can click Preview Link to see what the recipient will see.
  • Click Submit to share the report. You will be taken to the permissions page.
  • On the Report and View Permissions page, you can see which groups and users already had access to the report.
    • Note that you will not see the individuals you are sharing the report with unless the access level of it was "custom" or "private" prior to sharing it now.
  • Click Save.

Recipients will receive a notification with a link to the report, so that they may view it. If the recipient has the proper permissions, they will also be able to edit and save their own copy of the report. If the author makes the source tab visible, recipients of a shared report will be able to see the source as well as the report contents. Note that if the recipient has a different set of permissions, they may see a different set of data. Modifications that the original report owner makes to the report will be reflected in the link as viewed by the recipient.

When an R report was private but has been shared, the data browser will show access as "custom". Click custom to open the Report Permissions page, where you can see the list of groups and users with whom the report was shared.

Learn more about report permissions in this topic: Configure Permissions for Reports & Views

Delete an R Report

You can delete a saved report by first clicking the pencil icon at the top of the Data Views web part, then clicking the pencil to the left of the report name. In the popup window, click Delete. You can also multi-select R reports for deletion on the Manage Views page.

Note that deleting a report eliminates its associated script from the "Shared Scripts" list in the R report interface. Make sure that you don’t delete a script that is called (sourced) by other scripts you need.





R Reports: Access LabKey Data


Access Your Data as "labkey.data"

LabKey Server automatically reads your chosen dataset into a data frame called labkey.data using Input Substitution.

A data frame can be visualized as a list with unique row names and columns of consistent lengths. Column names are converted to all lower case, spaces or slashes are replaced with underscores, and some special characters are replaced with words (e.g., "CD4+" becomes "cd4_plus_"). You can see the column names for the built-in labkey.data frame by calling:

options(echo=TRUE);
names(labkey.data);

Just like any other data.frame, data in a column of labkey.data can be referenced by the column's name, converted to all lowercase and preceded by a $:

labkey.data$<column name>

For example, labkey.data$pulse; provides all the data in the Pulse column. Learn more about column references below.

Note that the examples in this section frequently include column names. If you are using your own data or a different version of LabKey example data, you may need to retrieve column names and edit the code examples given.
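
Since column names vary between datasets, it can help to check for the columns a script expects before using them. The sketch below is illustrative only: the small labkey.data frame is a hypothetical stand-in for the one LabKey Server supplies, and required_columns is a name invented for this example.

```r
# Hypothetical stand-in for the data frame that LabKey Server supplies
labkey.data <- data.frame(
    participantid = c("P1", "P2", "P3"),
    pulse = c(72, 80, 65)
);

# Columns this report expects; edit the list to match your own dataset
required_columns <- c("participantid", "pulse");
missing_columns <- setdiff(required_columns, names(labkey.data));
if (length(missing_columns) > 0) {
    stop("Missing columns: ", paste(missing_columns, collapse = ", "));
}

# Both forms return the same column vector
pulse_by_dollar <- labkey.data$pulse;
pulse_by_name <- labkey.data[["pulse"]];
```

Failing early with a clear message is usually easier to debug than a later error caused by a NULL column.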

Use Pre-existing R Scripts

To use a pre-existing R script with LabKey data, try the following procedure:

  • Open the R Report Builder:
    • Open the dataset of interest ("Physical Exam" for example).
    • Select (Charts/Reports) > Create R Report.
  • Paste the script into the Source tab.
  • Identify the LabKey data columns that you want to be represented by the script, and load those columns into vectors. The following loads the Diastolic Blood Pressure and Systolic Blood Pressure columns into the vectors x and y, respectively:
x <- labkey.data$diastolicbp;
y <- labkey.data$systolicbp;

png(filename="${imgout:myscatterplot}", width = 650, height = 480);
plot(x,
y,
main="Scatterplot Example",
xlab="X Axis",
ylab="Y Axis",
pch=19);
abline(lm(y~x), col="red"); # regression line (y~x)
dev.off(); # close the graphics device, as in the other examples
  • Click the Report tab to see the result:

Find Simple Means

Once you have loaded your data, you can perform statistical analyses using the functions/algorithms in R and its associated packages. For example, calculate the mean Pulse for all participants.

options(echo=TRUE);
names(labkey.data);
labkey.data$pulse;
a <- mean(labkey.data$pulse, na.rm= TRUE);
a;

Find Means for Each Participant

The following simple script finds the average values of a variety of physiological measurements for each study participant.

# Get means for each participant over multiple visits;

options(echo=TRUE);
participant_means <- aggregate(labkey.data, list(ParticipantID = labkey.data$participantid), mean, na.rm = TRUE);
participant_means;

We use na.rm as an argument to aggregate in order to calculate means even when some values in a column are NA.
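
When the whole data frame is passed to aggregate, mean is also applied to non-numeric columns, where it returns NA with a warning. One possible variation, sketched here on a small hypothetical stand-in for labkey.data, restricts the aggregation to numeric columns and shows the effect of na.rm.

```r
# Hypothetical stand-in for the data frame that LabKey Server supplies
labkey.data <- data.frame(
    participantid = c("P1", "P1", "P2", "P2"),
    pulse = c(70, NA, 80, 90),
    weight = c(60, 62, NA, 75)
);

# Keep only the numeric measurement columns
numeric_columns <- sapply(labkey.data, is.numeric);

# na.rm = TRUE is passed through aggregate() to mean(), so NA values are skipped
participant_means <- aggregate(labkey.data[, numeric_columns, drop = FALSE],
    list(ParticipantID = labkey.data$participantid), mean, na.rm = TRUE);
participant_means;
```

Without na.rm = TRUE, any participant with a missing value in a column would get NA as the mean for that column.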

Create Functions in R

This script shows an example of how functions can be created and called in LabKey R scripts. Before you can run this script, the Cairo package must be installed on your server. See Install and Set Up R for instructions.

Note that the second line of this script creates a "data" copy of the input file, but removes all participant records that contain an NA entry. NA entries are common in study datasets and can complicate display results.

library(Cairo);
data= na.omit(labkey.data);

chart <- function(data)
{
plot(data$pulse, data$pulse);
};

filter <- function(value)
{
sub <- subset(labkey.data, labkey.data$participantid == value);
#print("the number of rows for participant id: ")
#print(value)
#print("is : ")
#print(sub)
chart(sub)
}

names(labkey.data);
Cairo(file="${imgout:a}", type="png");
layout(matrix(c(1:4), 2, 2, byrow=TRUE));
strand1 <- labkey.data[,1];
for (i in strand1)
{
#print(i)
value <- i
filter(value)
};
dev.off();

Access Data in Another Dataset (Select Rows)

You can use the Rlabkey library's labkey.selectRows function to specify the data to load into an R data frame, either labkey.data itself or a frame with another name you choose.

For example, if you use the following, you will load some example fictional data from our public demonstration site that will work with the above examples.

library(Rlabkey)
labkey.data <- labkey.selectRows(
baseUrl="https://www.labkey.org",
folderPath="/Explore/Research Study",
schemaName="study",
queryName="PhysicalExam",
viewName="",
colNameOpt="rname"
)

Convert Column Names to Valid R Names

Include colNameOpt="rname" to have the selectRows call provide "R-friendly" column names. This converts column names to lower case and replaces spaces or slashes with underscores. Note that this may be different from the column name transformations in the built-in labkey.data frame. The built-in frame also substitutes words for some special characters, e.g., "CD4+" becomes "cd4_plus_", so during report development you'll want to check using names(labkey.data); to be sure your report references the expected names.

Learn more in the Rlabkey Documentation.

Select Specific Columns

Use the colSelect option to specify the set of columns you want to add to your data frame. Make sure there are no spaces between the commas and column names.

In this example, we load some fictional example data, selecting only a few columns of interest.

library(Rlabkey)
labkey.data <- labkey.selectRows(
baseUrl="https://www.labkey.org",
folderPath="/Explore/Research Study",
schemaName="study",
queryName="Demographics",
viewName="",
colSelect="ParticipantId,date,cohort,height,Language",
colFilter=NULL,
containerFilter=NULL,
colNameOpt="rname"
)

Display Lookup Target Columns

If you load the above example, and then execute: labkey.data$language; you will see all the data in the "Language" column.

Remember that in an R data frame, columns are referenced in all lowercase, regardless of casing in LabKey Server. For consistency in your selectRows call, you can also define the colSelect list in all lowercase, but it is not required.

In this case, it will return a series of integers, because "Language" is a lookup column that references a list in the same container with an incrementing integer primary key.

If you want to access a column that is not the primary key in the lookup target, such as human-readable display values in this example, use syntax like this in your selectRows:

library(Rlabkey)
labkey.data <- labkey.selectRows(
baseUrl="https://www.labkey.org",
folderPath="/Explore/Research Study",
schemaName="study",
queryName="Demographics",
viewName="",
colSelect="ParticipantId,date,cohort,height,Language,Language/LanguageName,Language/TranslatorName,Language/TranslatorPhone",
colFilter=NULL,
containerFilter=NULL,
colNameOpt="rname"
)

You can now retrieve human-readable values from within the "Language" list by converting everything to lowercase and substituting an underscore for the slash. Executing labkey.data$language_languagename; will return the list of language names.
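
The name conversion described above can be approximated in base R to predict which names to reference. This is only a sketch of the lowercase-and-underscore rule as stated in this topic, not the Rlabkey implementation; always confirm the actual names with names(labkey.data).

```r
# Column names as they appear in the colSelect list above
labkey_names <- c("ParticipantId", "Language/LanguageName", "Language/TranslatorPhone");

# Approximation only: replace spaces and slashes with underscores, then lower-case
r_names <- tolower(gsub("[ /]", "_", labkey_names));
r_names;
```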

Access URL Parameters and Data Filters

While you are developing your report, you can acquire any URL parameters as well as any filters applied on the Data tab by using labkey.url.params.

For example, if you filter the "systolicBP" column to values over 100, then use:

print(labkey.url.params)

...your report will include:

$`Dataset.systolicBP~gt`
[1] "100"
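
As the output above shows, parameter values arrive as strings. A script that wants to reuse a filter value numerically could convert it first. In this sketch the labkey.url.params list is a hypothetical stand-in built by hand to match the output above; on LabKey Server the list is supplied for you.

```r
# Hypothetical stand-in for the list that LabKey Server supplies
labkey.url.params <- list("Dataset.systolicBP~gt" = "100");

# Look the parameter up by its full name; the result is NULL if no such filter is set
raw_threshold <- labkey.url.params[["Dataset.systolicBP~gt"]];

# Convert the string to a number, falling back to NA when the filter is absent
threshold <- if (is.null(raw_threshold)) NA_real_ else as.numeric(raw_threshold);
threshold;
```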

Write Result File to File Repository

The following report, when run, creates a result file in the server's file repository. Note that fileSystemPath is an absolute file path. To get the absolute path, see Using the Files Repository.

fileSystemPath = "/labkey/labkey/MyProject/Subfolder/@files/"
filePath = paste0(fileSystemPath, "test.tsv");
write.table(labkey.data, file = filePath, append = FALSE, sep = "\t", qmethod = "double", col.names=NA);
print(paste0("Success: ", filePath));
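
To try the same pattern outside of LabKey Server, the following sketch writes a tab-separated file to R's temporary directory and reads it back. The small labkey.data frame is a hypothetical stand-in, and unlike the report above this variation uses row.names = FALSE so that the file contains only the data columns.

```r
# Hypothetical stand-in for the data frame that LabKey Server supplies
labkey.data <- data.frame(
    participantid = c("P1", "P2"),
    pulse = c(72, 80)
);

# Write to a temporary location instead of the server's file repository
filePath <- file.path(tempdir(), "test.tsv");
write.table(labkey.data, file = filePath, append = FALSE, sep = "\t",
    qmethod = "double", row.names = FALSE);

# Read the file back to confirm that it is tab-separated as intended
round_trip <- read.delim(filePath, stringsAsFactors = FALSE);
print(paste0("Success: ", filePath));
```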





Multi-Panel R Plots


The scripts on this page take the analysis techniques introduced in R Reports: Access LabKey Data one step further, still using the Physical Exam sample dataset. This page covers a few more strategies for finding means, then shows how to graph these results and display least-squares regression lines.

Find Mean Values for Each Participant

Finding the mean value for physiological measurements for each participant across all visits can be done in various ways. Here, we cover three alternative methods.

For all methods, we use "na.rm=TRUE" as an argument to aggregate in order to ignore null values when we calculate means.

Method 1: Aggregate each physiological measurement for each participant across all visits. This produces an aggregated list with two columns for participantid.

data_means <- aggregate(labkey.data, list(ParticipantID =
labkey.data$participantid), mean, na.rm = TRUE);
data_means;

Method 2: Aggregate only the pulse column and display two columns: one listing participantIDs and the other listing mean values of the pulse column for each participant.

aggregate(list(Pulse = labkey.data$pulse),
list(ParticipantID = labkey.data$participantid), mean, na.rm = TRUE);

Method 3: Again, aggregate only the pulse column, but here results are displayed as rows instead of two columns.

participantid_factor <- factor(labkey.data$participantid);
pulse_means <- tapply(labkey.data$pulse, participantid_factor,
mean, na.rm = TRUE);
pulse_means;
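
To compare the shapes of the results, the sketch below runs the aggregate and tapply approaches for the pulse column on a small hypothetical stand-in for labkey.data. Both give the same means; they differ only in the type of object returned.

```r
# Hypothetical stand-in for the data frame that LabKey Server supplies
labkey.data <- data.frame(
    participantid = c("P1", "P1", "P2", "P2"),
    pulse = c(70, NA, 80, 90)
);

# aggregate() returns a data frame with one row per participant
pulse_by_aggregate <- aggregate(list(Pulse = labkey.data$pulse),
    list(ParticipantID = labkey.data$participantid), mean, na.rm = TRUE);

# tapply() returns a vector of means named by participant
pulse_by_tapply <- tapply(labkey.data$pulse, factor(labkey.data$participantid),
    mean, na.rm = TRUE);

pulse_by_aggregate;
pulse_by_tapply;
```

The data frame form is convenient for plotting or merging with other columns, while the named vector is convenient for quick lookups by participant ID.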

Create Single Plots

Next we use R to create plots of some other physiological measurements included in our sample data.

All scripts in this section use the Cairo package. To convert these scripts to use the png() function instead, eliminate the call "library(Cairo)", change the function name "Cairo" to "png", change the "file" argument to "filename", and eliminate the type="png" argument entirely.

Scatter Plot of All Diastolic vs All Systolic Blood Pressures

This script plots diastolic vs. systolic blood pressures without regard for participantIDs. It specifies the "ylim" parameter for plot() to ensure that the axes used for this graph match the next graph's axes, easing interpretation.

library(Cairo);
Cairo(file="${imgout:diastol_v_systol_figure.png}", type="png");
plot(labkey.data$diastolicbloodpressure, labkey.data$systolicbloodpressure,
main="R Report: Diastolic vs. Systolic Pressures: All Visits",
ylab="Systolic (mm Hg)", xlab="Diastolic (mm Hg)", ylim =c(60, 200));
abline(lsfit(labkey.data$diastolicbloodpressure, labkey.data$systolicbloodpressure));
dev.off();

The generated plot, where the identity of participants is ignored, might look like this:

Scatter Plot of Mean Diastolic vs Mean Systolic Blood Pressure for Each Participant

This script plots the mean diastolic and systolic blood pressure readings for each participant across all visits. To do this, we use "data_means," the mean value for each physiological measurement we calculated earlier on a participant-by-participant basis.

data_means <- aggregate(labkey.data, list(ParticipantID = 
labkey.data$participantid), mean, na.rm = TRUE);
library(Cairo);
Cairo(file="${imgout:diastol_v_systol_means_figure.png}", type="png");
plot(data_means$diastolicbloodpressure, data_means$systolicbloodpressure,
main="R Report: Diastolic vs. Systolic Pressures: Means",
ylab="Systolic (mm Hg)", xlab="Diastolic (mm Hg)", ylim =c(60, 200));
abline(lsfit(data_means$diastolicbloodpressure, data_means$systolicbloodpressure));
dev.off();

This time, the plotted regression line for diastolic vs. systolic pressures shows a non-zero slope. Looking at our data on a participant-by-participant basis provides insights that might be obscured when looking at all measurements in aggregate.
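
The regression lines in these scripts are drawn from lsfit(), while an earlier topic used lm(). Both fit the same least-squares line, which can be checked numerically without plotting. The blood pressure values in this sketch are invented for illustration.

```r
# Invented example readings, for illustration only
diastolic <- c(70, 75, 80, 85, 90);
systolic <- c(110, 118, 121, 130, 138);

# lsfit() takes x and y directly; lm() takes a formula
ls_coefficients <- lsfit(diastolic, systolic)$coefficients;
lm_coefficients <- coef(lm(systolic ~ diastolic));

# Intercept and slope from each method
ls_coefficients;
lm_coefficients;
```

Inspecting the slope directly is a quick way to compare the all-visits fit with the per-participant means fit before reading it off a chart.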

Create Multiple Plots

There are two ways to get multiple images to appear in the report produced by a single script.

Single Plot Per Report Section

The first and simplest method of putting multiple plots in the same report places separate graphs in separate sections of your report. Use separate pairs of device on/off calls (e.g., png() and dev.off()) for each plot you want to create. You have to make sure that the ${imgout:} parameters are unique. Here's a simple example:

png(filename="${imgout:labkey1_png}");
plot(c(rep(25,100), 26:75), c(1:100, rep(1, 50)), ylab= "L", xlab="LabKey",
xlim= c(0, 100), ylim=c(0, 100), main="LabKey in R: Report Section 1");
dev.off();

png(filename="${imgout:labkey2_png}");
plot(c(rep(25,100), 26:75), c(1:100, rep(1, 50)), ylab= "L", xlab="LabKey",
xlim= c(0, 100), ylim=c(0, 100), main="LabKey in R: Report Section 2");
dev.off();

Multiple Plots Per Report Section

There are various ways to place multiple plots in a single section of a report. Two examples are given here, the first using par() and the second using layout().

Example: Four Plots in a Single Section: Using par()

This script demonstrates how to put multiple plots on one figure to create a regression panel layout. It uses standard R libraries for the arrangement of plots, and Cairo for creation of the plot image itself. It creates a single graphics file but partitions the ‘surface’ of the image into multiple sections using the mfrow and mfcol arguments to par().

library(Cairo);
data_means <- aggregate(labkey.data, list(ParticipantID =
labkey.data$participantid), mean, na.rm = TRUE);
Cairo(file="${imgout:multiplot.png}", type="png");
# 2 x 2 pictures on one plot; mfcol fills the grid column by column,
# so the panels are drawn at (1,1), (2,1), (1,2), (2,2)
op <- par(mfcol = c(2, 2));
plot(data_means$diastolicbloodpressure, data_means$weight,
xlab="Diastolic Blood Pressure (mm Hg)", ylab="Weight (kg)");
abline(lsfit(data_means$diastolicbloodpressure, data_means$weight));
plot(data_means$diastolicbloodpressure, data_means$systolicbloodpressure,
xlab="Diastolic Blood Pressure (mm Hg)",
ylab="Systolic Blood Pressure (mm Hg)");
abline(lsfit(data_means$diastolicbloodpressure, data_means$systolicbloodpressure));
plot(data_means$diastolicbloodpressure, data_means$pulse,
xlab="Diastolic Blood Pressure (mm Hg)",
ylab="Pulse Rate (Beats/Minute)");
abline(lsfit(data_means$diastolicbloodpressure, data_means$pulse));
plot(data_means$diastolicbloodpressure, data_means$temp,
xlab="Diastolic Blood Pressure (mm Hg)",
ylab="Temperature (Degrees C)");
abline(lsfit(data_means$diastolicbloodpressure, data_means$temp));
par(op); # Restore graphics parameters
dev.off();
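
The difference between the mfcol and mfrow arguments to par() is the order in which successive plots fill the grid. The sketch below shows that order using plain matrices, with no graphics device involved; the variable names are illustrative only.

```r
# Each number is the position of a plot in the drawing sequence
# par(mfcol = c(2, 2)) fills the grid column by column
mfcol_order <- matrix(1:4, nrow = 2);
# par(mfrow = c(2, 2)) fills the grid row by row
mfrow_order <- matrix(1:4, nrow = 2, byrow = TRUE);

mfcol_order;
mfrow_order;
```

With mfcol, the second plot lands below the first; with mfrow, it lands to the right of the first.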

Example: Three Plots in a Single Section: Using layout()

This script uses the standard R libraries to display multiple plots in the same section of a report. It uses the layout() command to arrange multiple plots on a single graphics surface that is displayed in one section of the script's report.

The top plot (drawn last, as panel 3 of the layout) shows systolic blood pressure and weight over time for all participants. The two lower scatter plots graph weight against systolic and diastolic blood pressure.

library(Cairo);
Cairo(file="${imgout:a}", width=900, type="png");
layout(matrix(c(3,1,3,2), nrow=2));
plot(weight ~ systolicbloodpressure, data=labkey.data);
plot(weight ~ diastolicbloodpressure, data=labkey.data);
plot(labkey.data$date, labkey.data$systolicbloodpressure, xaxt="n",
col="red", type="n", pch=1);
points(systolicbloodpressure ~ date, data=labkey.data, pch=1, bg="light blue");
points(weight ~ date, data=labkey.data, pch=2, bg="light blue");
abline(v=labkey.data$date[3]);
legend("topright", legend=c("bpsys", "weight"), pch=c(1,2));
dev.off();
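
The matrix passed to layout() determines where each plot lands: every cell holds the sequence number of the plot that occupies it. Printing the matrix used in the script above makes the arrangement easier to read.

```r
# Same matrix as in the script above; matrix() fills column by column
layout_matrix <- matrix(c(3, 1, 3, 2), nrow = 2);
layout_matrix;
# Row 1 is c(3, 3): the third plot drawn spans the full width of the top row
# Row 2 is c(1, 2): the first and second plots share the bottom row
```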





Lattice Plots


The "lattice" R package provides presentation-quality, multi-plot graphics. This page supplies a simple script to demonstrate the use of Lattice graphics in the LabKey R environment.

Before you can use the Lattice package, it must be installed on your server. You will load the lattice package at the start of every script that uses it:

library("lattice");

Display a Volcano

The Lattice Documentation on CRAN provides a Volcano script to demonstrate the power of Lattice. The script below has been modified to work on LabKey R:

library("lattice");  

p1 <- wireframe(volcano, shade = TRUE, aspect = c(61/87, 0.4),
light.source = c(10,0,10), zlab=list(rot=90, label="Up"),
ylab= "North", xlab="East", main="The Lattice Volcano");
g <- expand.grid(x = 1:10, y = 5:15, gr = 1:2);
g$z <- log((g$x^g$gr + g$y^2) * g$gr);

p2 <- wireframe(z ~ x * y, data = g, groups = gr,
scales = list(arrows = FALSE),
drape = TRUE, colorkey = TRUE,
screen = list(z = 30, x = -60));

png(filename="${imgout:a}", width=500);
print(p1);
dev.off();

png(filename="${imgout:b}", width=500);
print(p2);
dev.off();

The report produced by this script will display two graphs that look like the following:





Participant Charts in R


You can use the Participant Chart checkbox in the R Report Builder to create charts that display your R report results on a participant-by-participant basis. If you wish to create a participant chart in a test environment, install the example study and use it as a development sandbox.

Create and View Simple Participant Charts

  • In the example study, open the PhysicalExam dataset.
  • Select (Charts/Reports) > Create R Report.
  • On the Source tab, begin with a script that shows data for all participants. Paste the following in place of the default content.
png(filename="${imgout:a}", width=900);
plot(labkey.data$systolicbp, labkey.data$date);
dev.off();
  • Click the Report tab to view the scatter plot data for all participants.
  • Return to the Source tab.
  • Scroll down and click the triangle to open the Study Options section.
  • Check Participant Chart.
  • Click Save.
  • Name your report "Participant Systolic" or another name you choose.

The participant chart option subsets the data that is handed to an R script by filtering on a participant ID. You can later step through per-participant charts using this option. The labkey.data data frame may contain one or more rows of data depending on the content of the dataset you are working with. Next, reopen the R report:

  • Return to the data grid of the "PhysicalExam" dataset.
  • Select (Charts/Reports) > Participant Systolic (or the name you gave your report).
  • You will see Next Participant and Previous Participant links that let you step through charts for each participant.
  • Click Previous Participant or Next Participant to move to another participant's chart.
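
The effect of the Participant Chart option on the data a script receives can be approximated locally with subset(). This is a sketch only: the filtering is performed by LabKey Server, and the data frame all_rows and the variable current_participant are hypothetical names used for illustration.

```r
# Hypothetical full dataset, before any participant filter is applied
all_rows <- data.frame(
    participantid = c("P1", "P1", "P2"),
    systolicbp = c(120, 125, 110),
    stringsAsFactors = FALSE
);

# The script sees only the rows for the participant currently being viewed
current_participant <- "P1";
labkey.data <- subset(all_rows, participantid == current_participant);
nrow(labkey.data);
```

A script written for a participant chart should therefore work for any number of rows, since some participants may have a single visit.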

Advanced Example: Create Participant Charts Using Lattice

You can create a panel of charts for participants using the lattice package. If you select the participant chart option on the Source tab, you will be able to see each participant's panel individually when you select the report from your data grid.

The following script produces lattice graphs for each participant showing systolic blood pressure over time:

library(lattice);
png(filename="${imgout:a}", width=900);
plot.new();
xyplot(systolicbp ~ date| participantid, data=labkey.data,
type="a", scales=list(draw=FALSE));
update(trellis.last.object(),
strip = strip.custom(strip.names = FALSE, strip.levels = TRUE),
main = "Systolic over time grouped by participant",
ylab="Systolic BP", xlab="");
dev.off();

The following script produces lattice graphics for each participant showing systolic and diastolic blood pressure over time (points instead of lines):

library(lattice);
png(filename="${imgout:b}", width=900);
plot.new();

xyplot(systolicbp + diastolicbp ~ date | participantid,
data=labkey.data, type="p", scales=list(draw=FALSE));
update(trellis.last.object(),
strip = strip.custom(strip.names = FALSE, strip.levels = TRUE),
main = "Systolic & Diastolic over time grouped by participant",
ylab="Systolic/Diastolic BP", xlab="");
dev.off();

After you save these two R reports with descriptive names, you can go back and review individual graphs participant-by-participant. Use the (Reports) menu available on your data grid.





R Reports with knitr


The knitr package can be used with R in either HTML or Markdown pages to create dynamic reports. This topic will help you get started with some examples of how to interweave R and knitr.

Topics

Install R and knitr

  • If you haven't already installed R, follow these instructions: Install R.
  • Open the R graphical user interface. On Windows, a typical location would be: C:\Program Files\R\R-3.0.2\bin\i386\Rgui.exe
  • Select Packages > Install package(s).... Select a mirror site, and select the knitr package.
  • OR enter the following: install.packages('knitr', dependencies=TRUE)
    • Select a mirror site and wait for the knitr installation to complete.

Develop knitr Reports

  • Go to the dataset you wish to visualize.
  • Select (Charts/Reports) > Create R Report.
  • On the Source tab, enter your HTML or Markdown page with knitr code. (Scroll down for example pages.)
  • Specify which source to process with knitr. Under knitr Options, select HTML or Markdown.
  • Select the Report tab to see the results.

Advanced Markdown Options

If you are using rmarkdown v2 and check the box to "Use advanced rmarkdown output_options (pandoc only)", you can enter a list of param=value pairs in the box provided. Enter only the param=value pairs (the param1 and param2 lines in the following example); they will be enclosed in an "output_options=list()" call.

output_options=list(
  param1=value1,
  param2=value2
)

Supported options include those in the "html_document" output format. Learn more in the rmarkdown documentation.

Sample param=value options you can include:

  • css: Specify a custom stylesheet to use.
  • fig_width and fig_height: Control the size of figures included.
  • fig_caption: Control whether figures are captioned.
  • highlight: Specify a syntax highlighting style, such as pygments, monochrome, haddock, or default. NULL will prevent syntax highlighting.
  • theme: Specify the Bootstrap theme to apply to the page.
  • toc: Set to TRUE to include a table of contents.
  • toc_float: Set to TRUE to float the table of contents to the left.
If the box is unchecked, or if you check the box but provide no param=value pairs, rmarkdown will use the default output format:

html_document(
  keep_md=TRUE,
  self_contained=FALSE,
  fig_caption=TRUE,
  theme=NULL,
  css=NULL,
  smart=TRUE,
  highlight="default")

Note that pandoc is only supported for rmarkdown v2, and some formats supported by pandoc are not supported here.

If you notice report issues, such as graphs showing as small thumbnails, you may need to upgrade your server's version of pandoc.

R/knitr Scripts in Modules

R script knitr reports are also available as custom module reports. The script file must have either a .rhtml or .rmd extension, for HTML or markdown documents, respectively. For a file-based module, place the .rhtml/.rmd file in the same location as .r files, as shown below. For module details, see Map of Module Files.

MODULE_NAME
    reports/
        schemas/
            SCHEMA_NAME/
                QUERY_NAME/
                    MyRScript.r -- R report
                    MyRScript.rhtml -- R/knitr report
                    MyRScript.rmd -- R/knitr report

Declaring Script Dependencies

To fully utilize the report designer (called the "R Report Builder" in the LabKey user interface), you can declare JavaScript or CSS dependencies for knitr reports. This ensures that the dependencies are downloaded before R scripts are run on the "reports" tab in the designer. If these dependencies are not specified then any JavaScript in the knitr report may not run correctly in the context of the script designer. Note that reports that are run in the context of the Reports web part will still render correctly without needing to explicitly define dependencies.

Reports can either be created via the LabKey Server UI in the report designer directly or included as files in a module. Reports created in the UI are editable via the Source tab of the designer. Open Knitr Options to see a text box where a semi-colon delimited list of dependencies can be entered. Dependencies can be external (via HTTP) or local references relative to the labkeyWebapp path on the server. In addition, the name of a client library may be used. If the reference does not have a .js or .css extension then it will be assumed to be a client library (somelibrary.lib.xml). The .lib.xml extension is not required. Like local references, the path to the client library is relative to the labkeyWebapp path.

File-based reports in a module cannot be edited in the designer, although the Source tab will display them. However, you can still add a list of dependencies via the report's metadata file by including a <dependencies> section underneath the <R> element. A sample metadata file:

<?xml version="1.0" encoding="UTF-8"?>
<ReportDescriptor xmlns="http://labkey.org/query/xml">
  <label>My Knitr Report</label>
  <description>Relies on dependencies to display in the designer correctly.</description>
  <reportType>
    <R>
      <dependencies>
        <dependency path="http://external.com/jquery/jquery-1.9.0.min.js"/>
        <dependency path="knitr/local.js"/>
        <dependency path="knitr/local.css"/>
      </dependencies>
    </R>
  </reportType>
</ReportDescriptor>

The metadata file must be named <reportname>.report.xml and be placed alongside the report of the same name under (modulename/resources/reports/schemas/...).

HTML Example

To use this example:

  • Install the R package ggplot2
  • Install the Demo Study.
  • Create an R report on the dataset "Physical Exam"
  • Copy and paste the knitr code below into the Source tab of the R Report Builder.
  • Scroll down to the Knitr Options node, open the node, and select HTML.
  • Click the Report tab to see the knitr report.
<table>
<tr>
<td align='center'>
<h2>Scatter Plot: Blood Pressure</h2>
<!--begin.rcode echo=FALSE, warning=FALSE
library(ggplot2);
opts_chunk$set(fig.width=10, fig.height=6)
end.rcode-->
<!--begin.rcode blood-pressure-scatter, warning=FALSE, message=FALSE, echo=FALSE, fig.align='center'
qplot(labkey.data$diastolicbp, labkey.data$systolicbp,
main="Diastolic vs. Systolic Pressures: All Visits",
ylab="Systolic (mm Hg)", xlab="Diastolic (mm Hg)", ylim =c(60, 200), xlim=c(60,120), color=labkey.data$temp);
end.rcode-->
</td>
<td align='center'>
<h2>Scatter Plot: Body Temp vs. Body Weight</h2>
<!--begin.rcode temp-weight-scatter, warning=FALSE, message=FALSE, echo=FALSE, fig.align='center'
qplot(labkey.data$temp, labkey.data$weight,
main="Body Temp vs. Body Weight: All Visits",
xlab="Body Temp (C)", ylab="Body Weight (kg)", xlim=c(35,40), color=labkey.data$height);
end.rcode-->
</td>
</tr>
</table>

The rendered knitr report:

Markdown v2

Administrators can enable Markdown v2 when enlisting an R engine through the Views and Scripting Configuration page. When enabled, Markdown v2 will be used when rendering knitr R reports. If not enabled, Markdown v1 is used to execute the reports.

Independent installation is required of the following:

This will then enable using the Rmarkdown v2 syntax for R reports. The system does not currently perform any verification of the user's setup. If the configuration is enabled when enlisting the R engine, but the packages are not properly set up, the intended report rendering will fail.

Syntax differences are noted here: http://rmarkdown.rstudio.com/authoring_migrating_from_v1.html

Markdown v1 Example

# Scatter Plot: Blood Pressure
# The chart below shows data from all participants

```{r setup, echo=FALSE}
# set global chunk options: images will be 7x5 inches
opts_chunk$set(fig.width=7, fig.height=5)
```

```{r graphic1, echo=FALSE}
plot(labkey.data$diastolicbp, labkey.data$systolicbp,
main="Diastolic vs. Systolic Pressures: All Visits",
ylab="Systolic (mm Hg)", xlab="Diastolic (mm Hg)", ylim =c(60, 200));
abline(lsfit(labkey.data$diastolicbp, labkey.data$systolicbp));
```

Another example

# Scatter Plot: Body Temp vs. Body Weight
# The chart below shows data from all participants.

```{r graphic2, echo=FALSE}
plot(labkey.data$temp, labkey.data$weight,
main="Temp vs. Weight",
xlab="Body Temp (C)", ylab="Body Weight (kg)", xlim=c(35,40));
```





Input/Output Substitutions Reference


An R script uses input substitution parameters to generate the names of input files and to import data from a chosen data grid. It then uses output substitution parameters to either directly place image/data files in your report or to include download links to these files. Substitutions take the form of: ${param} where 'param' is the substitution. You can find the substitution syntax directly in the R Report Builder on the Help tab.

Input and Output Substitution Parameters

Valid Substitutions: 
input_data: <name>
The input dataset, a tab-delimited table. LabKey Server automatically reads your input dataset (a tab-delimited table) into the data frame called labkey.data. If you desire tighter control over the method of data upload, you can perform the data table upload yourself. The 'input_data:' prefix indicates the data file for the grid, and the <name> substitution can be set to any non-empty value:
# ${input_data:inputTsv}
labkey.data <- read.table("inputTsv", header=TRUE, sep="\t");
labkey.data
imgout: <name>
An image output file (such as jpg, png, etc.) that will be displayed as a Section of a View on LabKey Server. The 'imgout:' prefix indicates that the output file is an image and the <name> substitution identifies the unique image produced after you call dev.off(). The following script displays a .png image in a View:
# ${imgout:labkeyl.png}
png(filename="labkeyl.png")
plot(c(rep(25,100), 26:75), c(1:100, rep(1, 50)), ylab= "L", xlab="LabKey",
xlim= c(0, 100), ylim=c(0, 100), main="LabKey in R")
dev.off()
tsvout: <name>
A TSV text file that is displayed on LabKey Server as a section within a report. No downloadable file is created. For example:
# ${tsvout:tsvfile}
write.table(labkey.data, file = "tsvfile", sep = "\t",
qmethod = "double", col.names=NA)
txtout: <name>
A text file that is displayed on LabKey Server as a section within a report. No downloadable file is created. A CSV example:
# ${txtout:csvfile}
write.csv(labkey.data, file = "csvfile")
pdfout: <name>
A PDF output file that can be downloaded from LabKey Server. The 'pdfout:' prefix indicates that the expected output is a pdf file. The <name> substitution identifies the unique file produced after you call dev.off().
# ${pdfout:labkeyl.pdf}
pdf(file="labkeyl.pdf")
plot(c(rep(25,100), 26:75), c(1:100, rep(1, 50)), ylab= "L", xlab="LabKey",
xlim= c(0, 100), ylim=c(0, 100), main="LabKey in R")
dev.off()
psout: <name>
A postscript output file that can be downloaded from LabKey Server. The 'psout:' prefix indicates that the expected output is a postscript file. The <name> substitution identifies the unique file produced after you call dev.off().
# ${psout:labkeyl.eps}
postscript(file="labkeyl.eps", horizontal=FALSE, onefile=FALSE)
plot(c(rep(25,100), 26:75), c(1:100, rep(1, 50)), ylab= "L", xlab="LabKey",
xlim= c(0, 100), ylim=c(0, 100), main="LabKey in R")
dev.off()
fileout: <name>
A file output that can be downloaded from LabKey Server, and may be of any file type. For example, use fileout in the place of tsvout to allow users to download a TSV instead of seeing it within the page:
# ${fileout:tsvfile}
write.table(labkey.data, file = "tsvfile", sep = "\t",
qmethod = "double", col.names=NA)
htmlout: <name>
A text file that is displayed on LabKey Server as a section within a View. The output is different from the txtout: replacement in that no html escaping is done. This is useful when you have a report that produces html output. No downloadable file is created:
txt <- paste("<i>Click on the link to visit LabKey:</i>
<a target='blank' href='http://www.labkey.org'>LabKey</a>"
)
# ${htmlout:output}
write(txt, file="output")
svgout: <name>
An svg file that is displayed on LabKey Server as a section within a View. htmlout can be used to render svg outputs as well; however, using svgout will generate a more appropriate thumbnail image for the report. No downloadable file is created:
# ${svgout:output.svg}
svg("output.svg", width= 4, height=3)
plot(x=1:10,y=(1:10)^2, type='b')
dev.off()
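
Several output substitutions can be declared in one script, each in its own comment line. The following is a minimal sketch, assuming the input has at least one numeric column; the token names (summary_table, row_count, columns.pdf) and the fallback data frame are hypothetical and are only there so the script also runs outside LabKey Server.

```r
# Outside LabKey Server there is no labkey.data, so a small hypothetical
# stand-in is created; on the server this branch is skipped.
if (!exists("labkey.data")) {
  labkey.data <- data.frame(pulse = c(62, 70, 81), weight_kg = c(55, 72, 90));
}

# Work only with the numeric columns of the input.
numeric_data <- labkey.data[sapply(labkey.data, is.numeric)];

# ${tsvout:summary_table}
col_means <- data.frame(column = names(numeric_data),
                        mean = sapply(numeric_data, mean, na.rm = TRUE));
write.table(col_means, file = "summary_table", sep = "\t",
            qmethod = "double", row.names = FALSE);

# ${txtout:row_count}
writeLines(paste("Rows in input:", nrow(labkey.data)), con = "row_count");

# ${pdfout:columns.pdf}
pdf(file = "columns.pdf");
boxplot(numeric_data, main = "Distribution of each numeric column");
dev.off();
```

With these declarations the table and the text would appear as sections of the report, and the PDF would be offered as a download.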

Implicit Variables

Each R script contains implicit variables that are inserted before your source script. Implicit variables are R data types and may contain information that can be used by the source script.

Implicit variables: 
labkey.data
The data frame into which the input dataset is automatically read. The code to generate the data frame is:
# ${input_data:inputFileTsv} 
labkey.data <- read.table("inputFileTsv", header=TRUE, sep="\t",
quote="", comment.char="")
Learn more in R Reports: Access LabKey Data.
labkey.url.path
The path portion of the current URL, which omits the base context path, action and URL parameters. The path portion of the URL: http://localhost:8080/labkey/home/test/study-begin.view would be: /home/test/
labkey.url.base
The base portion of the current URL. The base portion of the URL: http://localhost:8080/labkey/home/test/study-begin.view would be: http://localhost:8080/labkey/
labkey.url.params
The list of parameters on the current URL and in any data filters that have been applied. The parameters are represented as a list of key / value pairs.
labkey.user.email
The email address of the current user.
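
The implicit variables can be read like any other R object. The following is a minimal sketch that writes them into a text section of the report; the fallback values, the token name context, and the example parameter are hypothetical and only let the script run outside LabKey Server, where these variables are not defined.

```r
# Hypothetical stand-ins; on the server these variables are already
# defined before the script runs, so none of these branches execute.
if (!exists("labkey.url.path"))   labkey.url.path   <- "/home/test/";
if (!exists("labkey.url.base"))   labkey.url.base   <- "http://localhost:8080/labkey/";
if (!exists("labkey.url.params")) labkey.url.params <- list(exampleKey = "exampleValue");
if (!exists("labkey.user.email")) labkey.user.email <- "user@example.com";

# Describe where and for whom the report is running.
header <- paste0("Report for ", labkey.user.email,
                 " in folder ", labkey.url.path,
                 " on ", labkey.url.base);

# Render each URL parameter as "key = value".
param_lines <- paste(names(labkey.url.params),
                     unlist(labkey.url.params), sep = " = ");

# ${txtout:context}
writeLines(c(header, param_lines), con = "context");
```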

Using Regular Expressions with Replacement Token Names

Sometimes it can be useful to have flexibility when binding token names to replacement parameters. This can be the case when a script generates file artifacts but does not know the file names in advance. Using the syntax regex() in the place of a token name (where LabKey Server controls the token name to file mapping) will result in the following actions:

  • Any script generated files not mapped to a replacement will be evaluated against the file's name using the regex.
  • If a file matches the regex, it will be assigned to the replacement and rendered accordingly.
<replacement>:regex(<expression>)
The following example will find all files generated by the script with the extension '.gct'. If any are found, they will be assigned to the replacement parameter and rendered accordingly (in this case as a download link).
#${fileout:regex(.*?(\.gct))}
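
To preview which files such an expression would pick up, the same pattern can be tried against candidate file names in R. This is only a sketch: the file names are hypothetical, and the ^ and $ anchors are added on the assumption that the whole file name has to match, which may differ from how the server applies the expression.

```r
# Hypothetical file names that a script might have produced.
generated_files <- c("results.gct", "results.txt", "matrix.gct", "notes.gct.bak");

# The expression from the example above, anchored to the whole file name
# (the anchoring is an assumption made for this sketch).
pattern <- "^.*?(\\.gct)$";

# Keep only the names that match.
matched <- generated_files[grepl(pattern, generated_files, perl = TRUE)];
print(matched);
```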

Cairo or GDD Packages

You may need to use the Cairo or GDD graphics packages in the place of jpeg() and png() if your LabKey Server runs on a "headless" Unix server. You will need to make sure that the appropriate package is installed in R and loaded by your script before calling either of these functions.

GDD() and Cairo() Examples. If you are using GDD or Cairo, you might use the following scripts instead:

library(Cairo);
Cairo(file="${imgout:labkeyl_cairo.png}", type="png");
plot(c(rep(25,100), 26:75), c(1:100, rep(1, 50)), ylab= "L", xlab="LabKey",
xlim= c(0, 100), ylim=c(0, 100), main="LabKey in R");
dev.off();

library(GDD);
GDD(file="${imgout:labkeyl_gdd.jpg}", type="jpeg");
plot(c(rep(25,100), 26:75), c(1:100, rep(1, 50)), ylab= "L", xlab="LabKey",
xlim= c(0, 100), ylim=c(0, 100), main="LabKey in R");
dev.off();
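
A script that has to run on servers with different graphics capabilities could check what is available before opening a device. The following is a minimal sketch; the helper names choose_png_device and open_png_device are hypothetical, and only the Cairo package is checked (GDD could be handled the same way).

```r
# Return the name of a usable PNG-capable device: "png" when base R reports
# PNG support on this machine, "Cairo" when the Cairo package is installed,
# otherwise "none".
choose_png_device <- function() {
  if (isTRUE(unname(capabilities("png")))) {
    return("png");
  }
  if (requireNamespace("Cairo", quietly = TRUE)) {
    return("Cairo");
  }
  "none";
}

# Open the chosen device for the given file name.
open_png_device <- function(filename) {
  device <- choose_png_device();
  if (device == "png") {
    png(filename = filename);
  } else if (device == "Cairo") {
    Cairo::Cairo(file = filename, type = "png");
  } else {
    stop("No PNG-capable graphics device is available");
  }
  invisible(device);
}
```

In a report script, a call such as open_png_device("plot.png") would take the place of png() or Cairo(), followed by the plotting commands and dev.off() as in the examples above.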

Deprecated LabKey-specific Syntax

Prior to release 18.1, file substitutions used a LabKey-specific syntax. The use of standard R syntax expected by RStudio as described above means the following syntax is deprecated and should be updated. You can also find this syntax within the R Report UI by selecting the Help tab, then Inline Syntax (Deprecated).

(Deprecated) Valid Substitutions: 
input_data
LabKey Server automatically reads your input dataset (a tab-delimited table) into the data frame called labkey.data. For tighter control over the method of data upload, or to modify the parameters of the read.table function, you can perform the data table upload yourself:
labkey.data <- read.table("${input_data}", header=TRUE);
labkey.data;
imgout: <name>
An image output file (such as jpg, png, etc.) that will be displayed as a Section of a report on LabKey Server. The 'imgout:' prefix indicates that the output file is an image and the <name> substitution identifies the unique image produced after you call dev.off(). The following script displays a .png image in a report:
png(filename="${imgout:labkeyl_png}");
plot(c(rep(25,100), 26:75), c(1:100, rep(1, 50)), ylab= "L", xlab="LabKey",
xlim= c(0, 100), ylim=c(0, 100), main="LabKey in R");
dev.off();
tsvout: <name>
A TSV text file that is displayed on LabKey Server as a section within a report. No downloadable file is created. For example:
write.table(labkey.data, file = "${tsvout:tsvfile}", sep = "\t", 
qmethod = "double");
txtout: <name>
A text file that is displayed on LabKey Server as a section within a report. No downloadable file is created. A CSV example:
write.csv(labkey.data, file = "${txtout:csvfile}");
pdfout: <name>
A PDF output file that can be downloaded from LabKey Server.
pdf(file="${pdfout:labkeyl_pdf}");
plot(c(rep(25,100), 26:75), c(1:100, rep(1, 50)), ylab= "L", xlab="LabKey",
xlim= c(0, 100), ylim=c(0, 100), main="LabKey in R");
dev.off();
psout: <name>
A postscript output file that can be downloaded from LabKey Server.
postscript(file="${psout:labkeyl_eps}", horizontal=FALSE, onefile=FALSE);
plot(c(rep(25,100), 26:75), c(1:100, rep(1, 50)), ylab= "L", xlab="LabKey",
xlim= c(0, 100), ylim=c(0, 100), main="LabKey in R");
dev.off();
fileout: <name>
A file output that can be downloaded from LabKey Server, and may be of any file type. For example, use fileout in the place of tsvout to allow users to download a TSV instead of seeing it within the page:
write.table(labkey.data, file = "${fileout:tsvfile}", sep = "\t", qmethod = "double", col.names=NA);
Another example shows how to send the output of the console to a file:
options(echo=TRUE);
sink(file = "${fileout:consoleoutput.txt}");
labkey.data;
htmlout: <name>
A text file that is displayed on LabKey Server as a section within a report. The output is different from the txtout: replacement in that no html escaping is done. This is useful when you have a report that produces html output. No downloadable file is created:
txt <- paste("<i>Click on the link to visit LabKey:</i>
<a target='blank' href='http://www.labkey.org'>LabKey</a>"
)
write(txt, file="${htmlout:output}");
svgout: <name>
An svg file that is displayed on LabKey Server as a section within a report. htmlout can be used to render svg outputs as well; however, using svgout will generate a more appropriate thumbnail image for the report. No downloadable file is created:
svg("${svgout:svg}", width= 4, height=3)
plot(x=1:10,y=(1:10)^2, type='b')
dev.off()

(Deprecated) Implicit Variables: 
labkey.data
The data frame into which the input dataset is automatically read. The code to generate the data frame is:
labkey.data <- read.table("${input_data}", header=TRUE, sep="\t",
quote="", comment.char="")

Additional Reference

Documentation and tutorials about the R language can be found at the R Project website.




Tutorial: Query LabKey Server from RStudio


This tutorial shows you how to pull data directly from LabKey Server into RStudio for analysis and visualization.

Tutorial Steps:

Install RStudio

  • If necessary, install R version 3.0.1 or later. If you already have R installed, you can skip this step.
  • If necessary, install RStudio Desktop on your local machine. If you already have RStudio installed, you can skip this step.

Install Rlabkey Package

  • Open RStudio.
  • On the Console enter the following:
install.packages("Rlabkey")
  • Follow any prompts to complete the installation.

Query Public Data

  • Open the Physical Exam data grid and select Export > Script > R > Create Script. This will auto-generate the R code that queries the Physical Exam data:
library(Rlabkey)

# Select rows into a data frame called 'labkey.data'

labkey.data <- labkey.selectRows(
baseUrl="https://www.labkey.org",
folderPath="/Explore/Research Study",
schemaName="study",
queryName="PhysicalExam",
viewName="",
colSelect="ParticipantId,ParticipantVisit/Visit,date,weight_kg,temperature_C,systolicBP,diastolicBP,pulse",
colFilter=NULL,
containerFilter=NULL,
colNameOpt="rname"
)
  • Copy this code to your clipboard.
  • Go to RStudio, paste the code into the Console tab, and press the Enter key.
  • The query results will be displayed on the 'labkey.data' tab.
  • Enter the following line into the Console to visualize the data.
plot(labkey.data$systolicbp, labkey.data$diastolicbp)
  • Note that the auto-generated code will include any filters applied to the grid. For example, if you filter Physical Exam for records where temperature is greater than 37, then the auto-generated code will include a colFilter property as below:
library(Rlabkey)

# Select rows into a data frame called 'labkey.data'

labkey.data <- labkey.selectRows(
baseUrl="https://www.labkey.org",
folderPath="/Explore/Research Study",
schemaName="study",
queryName="PhysicalExam",
viewName="",
colSelect="ParticipantId,ParticipantVisit/Visit,date,weight_kg,temperature_C,systolicBP,diastolicBP,pulse",
colFilter=makeFilter(c("temperature_C", "GREATER_THAN", "37")),
containerFilter=NULL,
colNameOpt="rname"
)
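
More than one condition can be supplied to makeFilter(), in which case rows must satisfy all of them. The following is a minimal sketch, not auto-generated code: the second condition on pulse and its threshold are illustrative assumptions, and the call is guarded so that the script still runs where the Rlabkey package is not installed.

```r
# Each condition is a column name, an operator and a value, in the same
# form as the auto-generated colFilter above.
conditions <- list(
  c("temperature_C", "GREATER_THAN", "37"),
  c("pulse", "LESS_THAN", "100")
);

# makeFilter() accepts several conditions as separate arguments.
if (requireNamespace("Rlabkey", quietly = TRUE)) {
  combined_filter <- do.call(Rlabkey::makeFilter, conditions);
  print(combined_filter);
}
```

The resulting object could then be passed as colFilter=combined_filter in the labkey.selectRows() call shown above.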

Handling Login/Authentication

  • To query non-public data, that is, data that requires a login/authentication step to access, you have three options:

Query Non-public Data

  • Once you have set up authentication for RStudio, the process of querying is the same as above:
    • Go to the desired data grid. Apply filters if necessary.
    • Auto-generate the R code using Export > Script > R > Create Script.
    • Use that code in RStudio.





FAQs for LabKey R Reports


Overview

This page aims to answer common questions about configuring and using the LabKey Server interface for creating R Reports. Remember, an administrator must install and configure R on LabKey Server before users can create and run R scripts on live datasets.

Topics:

  1. library(), help() and data() don’t work
  2. plot() doesn’t work
  3. jpeg() and png() don’t work
  4. Does my report reflect live, updated data?
  5. Output is not printed when I source() a file or use a function
  6. Scripts pasted from documentation don't work in the LabKey R report builder
  7. LabKey Server becomes very, very slow when scripts execute
  8. Does R create security risks?
  9. Any good sources for advice on R scripting?
  10. Graphics File Formats

1. library(), help() and data() don’t work

LabKey Server runs R scripts in batch mode. Thus, on Windows machines it does not display the pop-up windows you would ordinarily see in R’s interpreted/interactive mode. Some functions that produce pop-ups (e.g., library()) have alternatives that output to the console. Some functions (e.g., help() and some forms of data()) do not.

Windows Workaround #1: Use alternatives that output to the console

library(): The library() command has a console-output alternative. To see which packages your administrator has made available, use the following:

installed.packages()[,0]
Windows Workaround #2: Call the function from a native R window

help(): It’s usually easy to keep a separate, native R session open and call help() from there. This works better for some functions than others. Note that you must install and load packages before asking for help() with them. You can also use the web-based documentation available on CRAN.

data(): You can also call data() from a separate, native R session for some purposes. Calling data() from such a session can tell you which datasets are available on any packages you’ve installed and loaded in that instance of R, but not your LabKey installation.

2. plot() doesn’t work

Did you open a graphics device before calling plot()?

LabKey Server executes R scripts in batch mode. Thus, LabKey R never automatically opens an appropriate graphics device for output, as would R when running in interpreted/interactive mode. You’ll need to open the appropriate device yourself. For onscreen output that becomes part of a report, use jpeg() or png() (or their alternatives, Cairo(), GDD() and bitmap()). In order to output a graphic as a separate file, use pdf() or postscript().

Did you call dev.off() after plotting?

You need to call dev.off() when you’re done plotting to make sure the plot object gets printed to the open device.
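
The open, plot, close sequence described above can be sketched as follows. The file name device_demo.pdf is a hypothetical example; pdf() is used because it is available in every R installation, and the same pattern applies to png() or jpeg().

```r
# ${pdfout:device_demo.pdf}
# 1. Open the graphics device explicitly so that the output goes to the
#    file the report expects.
pdf(file = "device_demo.pdf");

# 2. Draw while the device is open.
plot(1:10, (1:10)^2, type = "b", main = "Device demo");

# 3. Close the device so that the plot is written out completely.
dev.off();
```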

3. jpeg() and png() don’t work

R is likely running on a headless Unix server. On a headless Unix server, R does not have access to the appropriate X11 drivers for the jpeg() and png() functions. Your admin can install a display buffer on your server to avoid this problem. Otherwise, in each script you will need to load the appropriate package to create these file formats via other functions (e.g., GDD or Cairo). See also: Determine Available Graphing Functions for help getting unstuck.

4. Does my report reflect live, updated data?

Yes. In general, LabKey always re-runs your saved script before displaying its associated report. Your script operates on live, updated data, so its plots and tables reflect fresh data.

In study folders, you can set a flag for any script that prevents the script from being re-run unless changes have occurred. This flag can save time when scripts are time-intensive or when datasets are large enough to make processing slow. When this flag is set, LabKey will only re-run the R script if:

  • The flag is cleared OR
  • The dataset associated with the script has changed OR
  • Any of the attributes associated with the script are changed (script source, options etc.)
To set the flag, check the "Automatically cache this report for faster reloading" checkbox under "Study Options" on the Source tab of the R report builder.

5. Output is not printed when I source() a file or use a function

The R FAQ explains:

When you use… functions interactively at the command line, the result is automatically printed...In source() or inside your own functions you will need an explicit print() statement.

When a command is executed as part of a file that is sourced, the command is evaluated but its results are not ordinarily printed. For example, if you call source("scriptname.R") and scriptname.R calls installed.packages()[,0], the installed.packages()[,0] command is evaluated, but its results are not ordinarily printed. The same thing would happen if you called installed.packages()[,0] from inside a function you define in your R script.

You can force sourced scripts to print the results of the functions they call. The R FAQ explains:

If you type `1+1' or `summary(glm(y~x+z, family=binomial))' at the command line the returned value is automatically printed (unless it is invisible()). In other circumstances, such as in a source()'ed file or inside a function, it isn't printed unless you specifically print it.
To print the value 1+1, use
print(1+1);
or, instead, use
source("1plus1.R", echo=TRUE);
where "1plus1.R" is a shared, saved script that includes the line "1+1".
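
The difference can be checked with a short sketch. The function names without_print and with_print are hypothetical; capture.output() is used only to collect what each call prints so that the two cases can be compared.

```r
# Inside a function, a bare expression is evaluated but not printed.
without_print <- function() {
  summary(c(1, 2, 3));
  invisible(NULL);
}

# Wrapping the expression in print() forces the output.
with_print <- function() {
  print(summary(c(1, 2, 3)));
  invisible(NULL);
}

# capture.output() collects whatever each call prints.
silent_lines  <- capture.output(without_print());
printed_lines <- capture.output(with_print());

length(silent_lines);       # 0: nothing was printed
length(printed_lines) > 0;  # TRUE: the summary was printed
```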

6. Scripts pasted from documentation don't work in the LabKey R report builder

If you receive an error like this:

Error: syntax error, unexpected SYMBOL, expecting '\n' or ';'
in "library(Cairo) labkey.data"
Execution halted
please check your script for missing line breaks. Line breaks are known to be unpredictably eliminated during cut/paste into the script builder. This issue can be eliminated by ensuring that all scripts have a ";" at the end of each line.

7. LabKey Server becomes very, very slow when scripts execute

You are probably running long, computationally intensive scripts. To avoid a slowdown, run your script in the background via the LabKey pipeline. See R Report Builder for details on how to execute scripts via the pipeline.

8. Does R create security risks?

Allowing the use of R scripts/reports on a server can be a security risk. A developer could write a script that could read or write any file stored in any SiteRoot, fileroot or pipeline root despite the LabKey security settings for that file.

A user must have at least the Author role as well as either the Platform Developer or Trusted Analyst role to write an R script or report to be used on the server.

R should not be used on a "shared server", that is, a server where users with admin/developer privileges in one project do not have permissions on other projects. Running R on the server could pose a security threat if the user attempts to access the server command line directly. The main way to execute a system command in R is via the 'system(<system call>)' method that is part of the R core package. The threat is due to the permission level of a script being run by the server possibly giving unwanted elevated permissions to the user.

Administrators should investigate sandboxing their R configuration as a software management strategy to reduce these risks. Note that the LabKey Server will not enforce or confirm such sandboxing, but offers the admin a way of announcing that an R configuration is safe.

9. Any good sources for advice on R scripting?

R Graphics Basics: Plot area, mar (margins), oma (outer margin area), mfrow, mfcol (multiple figures)

  • Provides good advice on how to make plots look spiffy
R graphics overview
  • This powerpoint provides nice visuals for explaining various graphics parameters in R
Bioconductor course materials
  • Lectures and labs cover the range - from introductory R to advanced genomic analysis

10. Graphics File Formats

If you don’t know which graphics file format to use for your plots, this link can help you narrow down your options.

.png and .gif

Graphics shared over the web do best in png when they contain regions of monotones with hard edges (e.g., typical line graphs). The .gif format also works well in such scenarios, but it is not supported in the default R installation because of patent issues. The GDD package allows you to create gifs in R.

.jpeg

Pictures with gradually varying tones (e.g., photographs) are successfully packaged in the jpeg format for use on the web.

.pdf and .ps or .eps

Use pdf or postscript when you aim to output a graph that can be accessed in isolation from your R report.