The LabKey client library for R makes it easy for R users to load live data from a LabKey Server into the R environment for analysis, provided users have permissions to read the data. It also enables R users to insert, update, and delete records stored on a LabKey Server, provided they have appropriate permissions to do so. The Rlabkey APIs use HTTP requests to communicate with a LabKey Server.
Access and Credentials
The Rlabkey library can be used from many locations, including but not limited to:
All requests to the LabKey Server using the R API are performed under the user's account profile, with all proper security enforced on the server. User credentials can be provided via a netrc file rather than directly in the running R program, making it possible to write reports and scripts that can be shared among many users without compromising security.
Learn more about authentication options in these topics:
If you wish, you can also include authentication credentials (either an API key or full email and password) directly in your R script by using labkey.setDefaults. Learn more in the Rlabkey documentation on CRAN.
Troubleshoot common issues using the information in this topic:
Troubleshoot Rlabkey
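As a rough illustration of the labkey.setDefaults option mentioned above, the sketch below stores credentials for the current R session. The server URL, API key, email, and password are placeholders rather than real values, and the Rlabkey documentation on CRAN remains the authoritative reference for the arguments.

```r
library(Rlabkey)

# Option 1: authenticate with an API key.
# Both values below are placeholders (assumptions), not real credentials.
labkey.setDefaults(
  apiKey  = "<your-api-key>",
  baseUrl = "https://labkey.example.com"
)

# Option 2: authenticate with a full email and password instead.
# labkey.setDefaults(
#   baseUrl  = "https://labkey.example.com",
#   email    = "user@example.com",
#   password = "<your-password>"
# )
```

Because this places credentials in the script itself, a script written this way should not be shared as-is; the netrc approach avoids that problem.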
Documentation
Configuration Steps
Typical configuration steps for a user of Rlabkey include:
- Install R from http://www.r-project.org/
- Install the Rlabkey package once using the following command in the R console. (You may want to change the value of repos depending on your geographical location.)
install.packages("Rlabkey", repos="http://cran.rstudio.com")
- Load the Rlabkey library at the start of every R script using the following command:
library(Rlabkey)
- Create a netrc file to set up authentication.
- Necessary if you wish to modify a password-protected LabKey Server database through the Rlabkey macros.
- Note that Rlabkey handles sessionid and authentication internally. Rlabkey passes the sessionid as an HTTP header for all API calls coming from that R session. LabKey Server treats this just as it would a valid JSESSIONID parameter or cookie coming from a browser.
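A netrc entry names the server (machine), the login, and the password. The helper below is a minimal sketch in base R that writes such an entry. The function name write_netrc_entry, the host labkey.example.com, and the credentials are all hypothetical, and the demo writes to a temporary file rather than to your real netrc file.

```r
# Write a single-host netrc entry to `path`.
# Hypothetical helper for illustration; it refuses to overwrite an existing file.
write_netrc_entry <- function(path, machine, login, password) {
  if (file.exists(path)) {
    stop("Refusing to overwrite existing file: ", path)
  }
  entry <- c(
    paste("machine", machine),
    paste("login", login),
    paste("password", password)
  )
  writeLines(entry, path)
  # Restrict the file to owner read/write (has limited effect on Windows).
  Sys.chmod(path, mode = "0600")
  invisible(path)
}

# Demo against a temporary path so no real credentials file is touched.
demo_path <- file.path(tempdir(), "netrc_demo")
unlink(demo_path)
write_netrc_entry(demo_path, "labkey.example.com", "user@example.com", "your-password")
readLines(demo_path)
```

For real use, the file typically lives in your home directory (commonly ~/.netrc on macOS and Linux), and the machine value is the server host name without the http:// or https:// prefix.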
Scenarios
The Rlabkey package supports the transfer of data between a LabKey Server and an R session.
- Retrieve data from LabKey into a data frame in R by specifying the query schema information (labkey.selectRows and getRows) or by using SQL commands (labkey.executeSql).
- Update existing data from an R session (labkey.updateRows).
- Insert new data either row by row (labkey.insertRows) or in bulk (labkey.importRows) via the TSV import API.
- Delete data from the LabKey database (labkey.deleteRows).
- Use Interactive R to discover available data via schema objects (getSchema).
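The insert, read, update, and delete operations listed above could be combined roughly as follows. This is a sketch under stated assumptions: the server URL, the folder /MyProject, and a list named Reagents with columns ReagentId (primary key), Name, and Quantity are all hypothetical. The calls are wrapped in a function so that nothing contacts a server until you call it.

```r
library(Rlabkey)

# Hypothetical connection details; replace with your own server and folder.
base_url    <- "https://labkey.example.com"
folder_path <- "/MyProject"

crud_round_trip <- function() {
  # Insert: each row of the data frame becomes one new record.
  new_rows <- data.frame(Name = "Buffer A", Quantity = 10, stringsAsFactors = FALSE)
  labkey.insertRows(base_url, folder_path, "lists", "Reagents", toInsert = new_rows)

  # Read back with SQL; colNameOpt = "fieldname" keeps field names as column names.
  current <- labkey.executeSql(
    base_url, folder_path, "lists",
    sql = "SELECT ReagentId, Name, Quantity FROM Reagents WHERE Name = 'Buffer A'",
    colNameOpt = "fieldname"
  )

  # Update: the data frame must carry the primary key so rows can be matched.
  current$Quantity <- current$Quantity + 5
  labkey.updateRows(base_url, folder_path, "lists", "Reagents", toUpdate = current)

  # Delete: only the primary key column is needed to identify the rows.
  labkey.deleteRows(base_url, folder_path, "lists", "Reagents",
                    toDelete = current["ReagentId"])
}
```

Calling crud_round_trip() requires a reachable server, a matching list, and an account with insert, update, and delete permissions.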
For example, you might use an external instance of R to do the following:
- Connect to LabKey Server.
- Use queries to show which schemas and datasets/lists/queries are available within a specific project or sub-folder.
- Create colSelect and colFilter parameters for the labkey.selectRows command on the selected schema and query.
- Retrieve a data frame of the data specified by the current url, folder, schema, and query context.
- Perform transformations on this data frame locally in your instance of R.
- Save the revised data frame back into the desired target on LabKey Server.
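The steps above can be sketched as a single script. Everything specific here is an assumption made for illustration: the server URL, the folder path, the study query PhysicalExam with columns ParticipantId and WeightKg, and the target list ExamSummary. The local transformation is a separate function so it can be tried without a server.

```r
library(Rlabkey)

# Hypothetical connection details; replace with your own.
base_url    <- "https://labkey.example.com"
folder_path <- "/MyProject/MySubfolder"

# Local transformation: add a derived column to the retrieved data frame.
add_weight_lb <- function(df) {
  df$WeightLb <- df$WeightKg * 2.20462
  df
}

run_workflow <- function() {
  # Discover which schemas and queries are available in the folder.
  schemas <- labkey.getSchemas(base_url, folder_path)
  queries <- labkey.getQueries(base_url, folder_path, "study")

  # Build the colSelect and colFilter parameters.
  wanted   <- c("ParticipantId", "WeightKg")
  positive <- makeFilter(c("WeightKg", "GREATER_THAN", "0"))

  # Retrieve the selected rows as a data frame.
  exams <- labkey.selectRows(
    base_url, folder_path, "study", "PhysicalExam",
    colSelect = wanted, colFilter = positive, colNameOpt = "fieldname"
  )

  # Transform locally, then save the result in bulk to the target list.
  summary_rows <- add_weight_lb(exams)
  labkey.importRows(base_url, folder_path, "lists", "ExamSummary",
                    toImport = summary_rows)
}
```

The target list needs fields whose names match the columns of the data frame being imported; otherwise the import is likely to reject or ignore them.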
Within the LabKey interface, the Rlabkey macros are particularly useful for accessing and manipulating datasets across folders and projects.
Run R Scripts on a Schedule
If you want to configure an R script to run locally on a schedule, consider using an R scheduler package such as:
- cronR (on Mac)
- taskscheduleR (on Windows)
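As one possible approach with cronR, the sketch below registers a daily job for a local script. The helper name schedule_daily_script, the job id, and the 7AM start time are assumptions chosen for illustration; consult the cronR documentation for the full set of options.

```r
library(cronR)

# Hypothetical helper: schedule an R script to run once a day via cron.
schedule_daily_script <- function(script_path, job_id = "rlabkey-daily-sync") {
  # Build the Rscript command line for the script (output goes to a log file).
  cmd <- cron_rscript(script_path)
  # Register the job in the current user's crontab.
  cron_add(
    command     = cmd,
    frequency   = "daily",
    at          = "7AM",
    id          = job_id,
    description = "Daily Rlabkey script"
  )
}
```

A call such as schedule_daily_script("/path/to/my_rlabkey_script.R") would add the job. On Windows, taskscheduleR provides a comparable taskscheduler_create function.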
Note that such schedulers will not work for a script defined and running within LabKey Server itself. As an alternative, consider options that will run your script during data import instead:
Premium Feature Available
Subscribers to premium editions of LabKey Server have the additional option of using ETLs to automate and schedule transformations of data. Learn more in these topics:
If you have an R report that will run in the background and does not need to directly transform data during the ETL, consider adding an R report run task as described here to "kick off" that report on the schedule of your choosing:
Learn more about premium editions
Related Topics