Table of Contents

   Data Basics
     LabKey Data Structures
     Preparing Data for Import
     Field Editor
       Field Types and Properties
       Text Choice Fields
       URL Field Property
       Conditional Formats
       String Expression Format Functions
       Date & Number Display Formats
         Date and Number Formats Reference
       Lookup Columns
       Protecting PHI Data
     Data Grids
       Data Grids: Basics
       Import Data
       Sort Data
       Filter Data
         Filtering Expressions
       Column Summary Statistics
       Customize Grid Views
       Saved Filters and Sorts
       Select Rows
       Export Data Grid
       Participant Details View
       Query Scope: Filter by Folder
     Reports and Charts
       Jupyter Reports
       Report Web Part: Display a Report or Chart
       Data Views Browser
       Query Snapshots
       Attachment Reports
       Link Reports
       Participant Reports
       Query Reports
       Manage Data Views
         Manage Study Notifications
       Manage Categories
       Manage Thumbnail Images
       Measure and Dimension Columns
       Legacy Reports
         Advanced Reports / External Reports
         Legacy Chart Views
         Deprecated: Crosstab Reports
     Visualizations
       Bar Charts
       Box Plots
       Line Plots
       Pie Charts
       Scatter Plots
       Time Charts
       Column Visualizations
       Quick Charts
       Integrate with Tableau
     Lists
       Tutorial: Lists
         Step 1: Set Up List Tutorial
         Step 2: Create a Joined Grid
         Step 3: Add a URL Property
       Create Lists
       Edit a List Design
       Populate a List
       Manage Lists
       Export/Import a List Archive
     R Reports
       R Report Builder
       Saved R Reports
       R Reports: Access LabKey Data
       Multi-Panel R Plots
       Lattice Plots
       Participant Charts in R
       R Reports with knitr
       Input/Output Substitutions Reference
       Tutorial: Query LabKey Server from RStudio
       FAQs for LabKey R Reports
     Premium RStudio Integration
       Connect to RStudio
         Set Up Docker with TLS
       Connect to RStudio Workbench
         Set Up RStudio Workbench
       Edit R Reports in RStudio
       Export Data to RStudio
       Advanced Initialization of RStudio
     SQL Queries
       LabKey SQL Tutorial
       SQL Query Browser
       Create a SQL Query
       Edit SQL Query Source
       LabKey SQL Reference
       Lookups: SQL Syntax
       LabKey SQL Utility Functions
       Query Metadata
         Query Metadata: Examples
       Edit Query Properties
       Trace Query Dependencies
       Query Web Part
       SQL Synonyms
       LabKey SQL Examples
         JOIN Queries
         Calculated Columns
         Premium Resource: Display Calculated Columns from Queries
         Pivot Queries
         Queries Across Folders
         Parameterized SQL Queries
         More LabKey SQL Examples
     External Schemas and Data Sources
       External MySQL Data Sources
       External Oracle Data Sources
       External Microsoft SQL Server Data Sources
       External PostgreSQL Data Sources
       External SAS Data Sources
       External Redshift Data Sources
       Linked Schemas and Tables
     Controlling Data Scope
     Ontology Integration
       Load Ontologies
       Concept Annotations
       Ontology Column Filtering
       Ontology Lookup
       Ontology SQL
     Data Quality Control
     Quality Control Trend Reports
       Define QC Trend Report
       Use QC Trend Reports
       QC Trend Report Guide Sets
     Search
       Search Administration
     Spotfire Integration
       Premium Resource: Embed Spotfire Visualizations
     LabKey Natural Language Processing (NLP)
       Natural Language Processing (NLP) Pipeline
       Metadata JSON Files
       Document Abstraction Workflow
       Automatic Assignment for Abstraction
       Manual Assignment for Abstraction
       Abstraction Task Lists
       Document Abstraction
       Review Document Abstraction
       Review Multiple Result Sets
       NLP Result Transfer
       Configure LabKey NLP
       Run NLP Pipeline
     Premium Resource: Bulk Editing

Data Basics


LabKey Server lets you explore, interrogate and present your biomedical research data online and interactively in a wide variety of ways. The topics in this section give you a broad overview of data basics.

Topics

An example online data grid:

A scatter plot visualization of the same data:




LabKey Data Structures


LabKey Server offers a wide variety of ways to store and organize data. Different data structure types offer specific features, which make them more or less suited for specific scenarios. You can think of these data structures as "table types", each designed to capture a different kind of research data.

This topic reviews the data structures available within LabKey Server, and offers guidance for choosing the appropriate structure for storing your data.

Where Should My Data Go?

The primary deciding factors when selecting a data structure will be the nature of the data being stored and how it will be used. Information about lab samples should likely be stored as a sample type. Information about participants/subjects/animals over time should be stored as datasets in a study folder. Less structured data may import into LabKey Server faster than highly constrained data, but integration may be more difficult. If you do not require extensive data integration or specialized tools, a more lightweight data structure, such as a list, may suit your needs.

The types of LabKey Server data structures appropriate for your work depend on the research scenarios you wish to support. As a few examples:

  • Management of Simple Tabular Data. Lists are a quick, flexible way to manage ordinary tables of data, such as lists of reagents.
  • Integration of Data by Time and Participant for Analysis. Study datasets support the collection, storage, integration, and analysis of information about participants or subjects over time.
  • Analysis of Complex Instrument Data. Assays help you to describe complex data received from instruments, generate standardized forms for data collection, and query, analyze and visualize collected data.
These structures are often used in combination. For example, a study may contain a joined view of a dataset and an assay with a lookup into a list for names of reagents used.

Universal Table Features

All LabKey data structures support the following features:

  • Interactive, Online Grids
  • Data Validation
  • Visualizations
  • SQL Queries
  • Lookup Fields

Lists

Lists are the simplest and least constrained data type. They are generic, in the sense that the server does not make any assumptions about the kind of data they contain. Lists are not entirely freeform; they are still tabular data and have primary keys, but they do not require participant IDs or time/visit information. There are many ways to visualize and integrate list data, but some specific applications will require additional constraints.

List data can be imported in bulk as part of a TSV, or as part of a folder, study, or list archive. Lists also allow row-level insert/update/delete.

Lists are scoped to a single folder, and its child workbooks (if any).
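Because lists support row-level operations, they are easy to work with programmatically. Below is a minimal sketch using the JavaScript client API (available on any LabKey page); the list name "Reagents" and its fields are hypothetical:

// Minimal sketch: insert a row into a hypothetical list named "Reagents".
LABKEY.Query.insertRows({
    schemaName: 'lists',
    queryName: 'Reagents',
    rows: [{ Name: 'Tris-HCl', Supplier: 'Acme Bio' }], // field names are assumptions
    success: function (data) {
        console.log('Inserted ' + data.rows.length + ' row(s)');
    },
    failure: function (error) {
        console.error(error.exception);
    }
});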

Assays

Assays capture data from individual experiment runs, which usually correspond to an output file from some sort of instrument. Assays have an inherent batch-run-results hierarchy. They are more structured than lists, and support a variety of specialized structures to fit specific applications. Participant IDs and time information are required.

Specific assay types are available, which correspond to particular instruments and offer defaults specific to the given instrument. The results schema can range from a single, fixed table to many interrelated tables. All assay types allow administrators to configure fields at the run and batch level. Some assay types allow further customization at other levels. For instance, the Luminex assay type allows admins to customize fields at the analyte level and the results level. There is also a general purpose assay type, which allows administrators to completely customize the set of result fields.

Usually assay data is imported from a single data file at a time, into a corresponding run. Some assay types allow for API import as well, or have customized multi-file import pathways. Assay result data may also be integrated into a study by aligning participant and time information, or by sample ID.

Assay designs are scoped to the container in which they are defined. To share assay designs among folders or subfolders, define them in the parent folder or project, or to make them available site-wide, define them in the Shared project. Run and result data can be stored in any folder in which the design is in scope.

Datasets

Clinical Datasets are designed to capture the variable characteristics of an organism over time, like blood pressure, mood, weight, and cholesterol levels. Anything you measure at multiple points in time will fit well in a Clinical Dataset.

Datasets are always part of a study. They have two required fields:

  • ParticipantId (this name may vary) - Holds the unique identifier for the study subject.
  • Date or Visit (the name may vary) - Either a calendar date or a number.
There are different types of datasets with different cardinality, also known as data row uniqueness:
  • Demographic: Zero or one row for each subject. For example, each participant has only one enrollment date.
  • "Standard"/"Clinical": Can have multiple rows per subject, but zero or one row for each subject/timepoint combination. For example, each participant has exactly one weight measurement at each visit.
  • "Extra key"/"Assay": Can have multiple rows for each subject/timepoint combination, but have an additional field providing uniqueness of the subject/timepoint/arbitrary field combination. For example, many tests might be run on each blood sample collected for each participant at each visit.
Datasets have special abilities to automatically join/lookup to other study datasets based on the key fields, and to easily create intelligent visualizations based on these sorts of relationships.

A dataset can be backed by assay data that has been copied to the study. Behind the scenes, this consists of a dataset with rows that contain the primary key (typically the participant ID) of the assay result data, which is looked up dynamically.

Non-assay datasets can be imported in bulk (as part of a TSV paste or a study import), and can also be configurable to allow row-level inserts/updates/deletes.

Datasets are typically scoped to a single study in a single folder. In some contexts, however, shared datasets can be defined at the project level and have rows associated with any of its subfolders.

Datasets have their own study security configuration, where groups are granted access to datasets separately from their permission to the folder itself. Permission to the folder is a necessary prerequisite for dataset access (i.e., have the Reader role for the folder), but is not necessarily sufficient.

A special type of dataset, the query snapshot, can be used to extract data from some other sources available in the server, and create a dataset from it. In some cases, the snapshot is automatically refreshed after edits have been made to the source of the data. Snapshots are persisted in a physical table in the database (they are not dynamically generated on demand), and as such they can help alleviate performance issues in some cases.
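Datasets, like all LabKey tables, can also be queried through the client APIs. Below is a minimal sketch using the JavaScript API, assuming a hypothetical dataset named "Demographics":

// Minimal sketch: retrieve rows from a hypothetical "Demographics" dataset,
// filtered to a single participant.
LABKEY.Query.selectRows({
    schemaName: 'study',
    queryName: 'Demographics',
    filterArray: [
        LABKEY.Filter.create('ParticipantId', 'P-100', LABKEY.Filter.Types.EQUAL)
    ],
    success: function (data) {
        console.log(data.rows);
    },
    failure: function (error) {
        console.error(error.exception);
    }
});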

Custom Queries

A custom query is effectively a non-materialized view in a standard database. It consists of LabKey SQL, which is exposed as a separate, read-only query/table. Every time the data in a custom query is used, it will be re-queried from the database.

In order to run the query, the current user must have access to the underlying tables it is querying against.

Custom queries can be created through the web interface in the schema browser, or supplied as part of a module.
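Ad hoc LabKey SQL can also be executed through the JavaScript client API without saving a custom query first. A minimal sketch, assuming a hypothetical list named "Reagents" in the lists schema:

// Minimal sketch: run LabKey SQL against the lists schema.
// The table and column names are assumptions for illustration.
LABKEY.Query.executeSql({
    schemaName: 'lists',
    sql: 'SELECT Reagents.Supplier, COUNT(*) AS ReagentCount FROM Reagents GROUP BY Reagents.Supplier',
    success: function (data) {
        console.log(data.rows);
    },
    failure: function (error) {
        console.error(error.exception);
    }
});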

Sample Types

Sample types allow administrators to create multiple sets of samples in the same folder, which each have a different set of customizable fields.

Sample types are created by pasting in a TSV of data and identifying one, two, or three fields that comprise the primary key. Subsequent updates can be made via TSV pasting (with options for how to handle samples that already exist in the set), or via row-level inserts/updates/deletes.

Sample types support the notion of one or more parent sample fields. When present, this data will be used to create an experiment run that links the parent and child samples to establish a derivation/lineage history. Samples can also have "parents" of other dataclasses, such as a "Laboratory" data class indicating where the sample was collected.

One sample type per folder can be marked as the “active” set. Its set of columns will be shown in Customize Grid when doing a lookup to a sample table. Downstream assay results can be linked to the originating sample type via a "Name" field -- for details see Samples.

Sample types are resolved based on the name. The order of searching for the matching sample type is: the current folder, the current project, and then the Shared project. See Shared Project.

DataClasses

DataClasses can be used to capture complex lineage and derivation information, for example, the derivations used in bio-engineering systems. Examples include:

  • Reagents
  • Gene Sequences
  • Proteins
  • Protein Expression Systems
  • Vectors (used to deliver Gene Sequences into a cell)
  • Constructs (= Vectors + Gene Sequences)
  • Cell Lines
You can also use dataclasses to track the physical and biological Sources of samples.

Similarities with Sample Types

A DataClass is similar to a Sample Type or a List, in that it has a custom domain. DataClasses are built on top of the exp.Data table, much like Sample Types are built on the exp.Materials table. Using the analogy syntax:

SampleType : exp.Material :: DataClass : exp.Data

Rows from the various DataClass tables are automatically added to the exp.Data table, but only the Name and Description columns are represented in exp.Data. The various custom columns in the DataClass tables are not added to exp.Data. A similar behavior occurs with the various Sample Type tables and the exp.Materials table.

Also like Sample Types, every row in a DataClass table has a unique name, scoped across the current folder. Unique names can be provided (via a Name or other ID column) or generated using a naming pattern.

For more information, see Data Classes.

Domains

A domain is a collection of fields. Lists, Datasets, SampleTypes, DataClasses, and the Assay Batch, Run, and Result tables are backed by a LabKey-internal data type known as a Domain. A Domain has:

  • a name
  • a kind (e.g. "List" or "SampleType")
  • an ordered set of fields along with their properties.
Each Domain type provides specialized handling for the domains it defines. The number of domains defined by a data type varies; for example, Assays define multiple domains (batch, run, etc.), while Lists and Datasets define only one domain each.

The fields and properties of a Domain can be edited interactively using the field editor or programmatically using the JavaScript LABKEY.Domain APIs.
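For example, below is a minimal sketch of creating a list domain with the JavaScript API; the domain kind, list name, and fields are illustrative (check the API documentation for the domain kinds your server supports):

// Minimal sketch: create an integer-keyed list as a domain of kind "IntList".
// rangeURI identifies each field's data type.
LABKEY.Domain.create({
    kind: 'IntList',
    domainDesign: {
        name: 'LookupCodes',
        description: 'Integer-keyed code list',
        fields: [
            { name: 'Code', rangeURI: 'int' },
            { name: 'Meaning', rangeURI: 'string' }
        ]
    },
    options: { keyName: 'Code' },
    success: function () { console.log('Domain created'); },
    failure: function (error) { console.error(error.exception); }
});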

Also see Modules: Domain Templates.

External Schemas

External schemas allow an administrator to expose the data in a “physical” database schema through the web interface, and programmatically via APIs. They assume that some external process has created the schemas and tables, and that the server has been configured to connect to the database, via a database connection config in the labkey.xml Tomcat deployment descriptor or its equivalent.

Administrators have the option of exposing the data as read-only, or as insert/update/delete. The server will auto-populate standard fields like Modified, ModifiedBy, Created, CreatedBy, and Container for all rows that it inserts or updates. The standard bulk import options (TSV, etc.) are supported.

External schemas are scoped to a single folder. If an exposed table has a “Container” column, it will be filtered to only show rows whose values match the EntityId of the folder.

The server can connect to a variety of external databases, including Oracle, MySQL, SAS, PostgreSQL, and SQL Server. The schemas can also be housed in the standard LabKey Server database.

The server does not support cross-database joins. Lookups (based on single-column foreign keys learned via JDBC metadata, or on XML metadata configuration) work only within a single database, regardless of whether it is the standard LabKey Server database.

Linked Schemas

Linked schemas allow you to expose data in a target folder that is backed by some other data in a different source folder. These linked schemas are always read-only.

This provides a mechanism for showing different subsets of the source data in a different folder, where the user might not have permission (or need) to see everything else available in the source folder.

The linked schema configuration, set up by an administrator, can include filters such that only a portion of the data in the source schema/table is exposed in the target.

Specimens (Deprecated)

Specimens are one way to include sample-related data as part of a study.

Beginning with release 21.3, support for specimen functionality has been moved out of the study module and is only available with Premium Editions. Instead of using specimens, store sample data in Sample Types.

Specimen data consists of multiple tables, including vials, specimens, primary type, etc. In addition to the required fields, administrators can customize the optional fields or add new ones for the specimens themselves. Specimens can be loaded in bulk as part of a study or specimen import. It is possible to enable editing of specimens directly through the web UI as well, but this is not common.

Specimens support additional workflows around the creation, review, and approval of specimen requests, coordinating cross-site collaboration over a shared specimen repository and allowing a network of researchers to request vials for additional uses.

The configuration for specimens is scoped to a single folder. Only one set of specimen configuration is supported per folder.

Learn more about using specimens in this topic: Specimen Tracking (Legacy)

Related Topics




Preparing Data for Import


This topic explains how to best prepare your data for import so you can meet any requirements set up by the target data structure.

LabKey Server provides a variety of different data structures for different uses: Assay Designs for capturing instrument data, Datasets for integrating heterogeneous clinical data, Lists for general tabular data, etc. Some of these data structures place strong constraints on the nature of the data to be imported; for example, Datasets place uniqueness constraints on the data. Other data structures, such as Lists, make few assumptions about incoming data.

Choose Column Data Types

When deciding how to import your data, consider the data type of columns to match your current and future needs. Considerations include:

  • Available Types: Review the list at the top of this topic.
  • Type Conversions are possible after data is entered, but only within certain compatibilities. See the table of available type changes here.
  • Number type notes:
    • Integer: A 4-byte signed integer that can hold values ranging -2,147,483,648 to +2,147,483,647.
    • Decimal (Floating Point): An 8-byte double precision floating point number that can hold very large and very small values. Values can range approximately 1E-307 to 1E+308 with a precision of at least 15 digits. As with most standard floating point representations, some values cannot be converted exactly and are stored as approximations. It is often helpful to set a display format on Decimal fields that specifies a fixed or maximum number of decimal places, to avoid displaying approximate values.

General Advice: Avoid Mixed Data Types in a Column

LabKey tables (Lists, Datasets, etc.) are implemented as database tables. So your data should be prepared for insertion into a database. Most importantly, each column should conform to a database data type, such as Text, Integer, Decimal, etc. Mixed data in a column will be rejected when you try to upload it.

Wrong

The following table mixes Boolean and String data in a single column.

ParticipantId | Preexisting Condition
P-100 | True, Edema
P-200 | False
P-300 | True, Anemia

Right

Split out the mixed data into separate columns:

ParticipantId | Preexisting Condition | Condition Name
P-100 | True | Edema
P-200 | False |
P-300 | True | Anemia

General Advice: Avoid Special Characters in Column Headers

Column names should avoid special characters such as !, @, #, $, etc. They should contain only letters, numbers, spaces, and underscores, and should begin with a letter or underscore. We also recommend using underscores instead of spaces.

Wrong

The following table has special characters in the column names.

Participant # | Preexisting Condition?
P-100 | True
P-200 | False
P-300 | True

Right

The following table removes the special characters and replaces spaces with underscores.

Participant_Number | Preexisting_Condition
P-100 | True
P-200 | False
P-300 | True
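A small sketch of this normalization, if you want to clean headers programmatically before import (plain JavaScript, no LabKey API required):

// Minimal sketch: normalize a column header to letters, numbers, and underscores,
// beginning with a letter or underscore.
function normalizeHeader(header) {
    var name = header
        .replace(/[^A-Za-z0-9_ ]/g, '') // strip special characters
        .trim()
        .replace(/\s+/g, '_');          // spaces -> underscores
    return /^[A-Za-z_]/.test(name) ? name : '_' + name;
}

console.log(normalizeHeader('Participant #'));          // "Participant"
console.log(normalizeHeader('Preexisting Condition?')); // "Preexisting_Condition"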

Handling Backslash Characters

If you have a TSV or CSV file that contains backslashes in any text field, the upload will likely mishandle the backslash, either substituting another value or dropping it completely. This occurs because the server treats backslashes as escape characters; for example, \n will insert a new line, \t will insert a tab, etc.

To import text fields that contain backslashes, wrap the values in quotes, either by manual edit or by changing the save settings in your editor. For example:

Test\123
should be:
"Test\123"

Note that this does not apply to importing Excel files. Excel imports are handled differently, and text data within cells is processed as if it were quoted.

Data Aliasing

Use data aliasing to work with non-conforming data, meaning the provided data has different column names or different value IDs for the same underlying thing. Examples include:

  • A lab provides assay data which uses different participant ids than those used in your study. Using different participant ids is often desirable and intentional, as it provides a layer of PHI protection for the lab and the study.
  • Excel files have different column names for the same data, for example some files have the column "Immune Rating" and others have the column "Immune Score". You can define an arbitrary number of these import aliases to map to the same column in LabKey.
  • The source files have a variety of names for the same visit id, for example, "M1", "Milestone #1", and "Visit 1".

Clinical Dataset Details

Datasets are intended to capture measurement events on some subject, like a blood pressure measurement or a viral count, at some point in time. Datasets are therefore required to have two columns:

  • a subject id
  • a time point (either in the form of a date or a number)
Also, a subject cannot have two different blood pressure readings at a given point in time, so datasets reflect this fact by having uniqueness constraints: each record in a dataset must have a unique combination of subject id plus a time point.

Wrong

The following dataset has duplicate subject id / timepoint combinations.

ParticipantId | Date | SystolicBloodPressure
P-100 | 1/1/2000 | 120
P-100 | 1/1/2000 | 105
P-100 | 2/2/2000 | 110
P-200 | 1/1/2000 | 90
P-200 | 2/2/2000 | 95

Right

The following table removes the duplicate row.

ParticipantId | Date | SystolicBloodPressure
P-100 | 1/1/2000 | 120
P-100 | 2/2/2000 | 110
P-200 | 1/1/2000 | 90
P-200 | 2/2/2000 | 95

Demographic Dataset Details

Demographic datasets have all of the constraints of clinical datasets, plus one more: a given subject identifier cannot appear twice in a demographic dataset.

Wrong

The following demographic dataset has a duplicate subject id.

ParticipantId | Date | Gender
P-100 | 1/1/2000 | M
P-100 | 1/1/2000 | M
P-200 | 1/1/2000 | F
P-300 | 1/1/2000 | F
P-400 | 1/1/2000 | M

Right

The following table removes the duplicate row.

ParticipantId | Date | Gender
P-100 | 1/1/2000 | M
P-200 | 1/1/2000 | F
P-300 | 1/1/2000 | F
P-400 | 1/1/2000 | M

Merge List and Dataset Data

To merge data for a List or Dataset, choose any bulk import method and check the box to Update data for existing rows during import. Otherwise, if the import includes data for existing rows, it will fail. This option is not supported for Lists with auto-incrementing integer keys.
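The same merge behavior is available programmatically. Below is a hedged sketch using the JavaScript API's importData; the list name and TSV content are assumptions, and insert-option support varies by data structure and server version:

// Hedged sketch: bulk import TSV text into a hypothetical list named "Reagents",
// updating existing rows instead of failing on key collisions.
LABKEY.Query.importData({
    schemaName: 'lists',
    queryName: 'Reagents',
    text: 'Name\tSupplier\nTris-HCl\tAcme Bio\n',
    insertOption: 'MERGE', // assumption: merge/update option, mirroring the UI checkbox
    success: function () { console.log('Import complete'); },
    failure: function (error) { console.error(error.exception); }
});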

Learn more about importing lists and datasets in these topics:

Import to Unrecognized Fields

When importing data, if there are unrecognized fields in your spreadsheet, meaning fields that are not included in the data structure definition, they will be ignored. In some situations, you will see a warning banner explaining that this is happening.

Date Parsing Considerations

Whether to parse user-entered or imported date values as Month-Day-Year (as typical in the U.S.) or Day-Month-Year (as typical outside the U.S.) is set at the site level. For example, 11/7/2020 could be either July 11 or November 7, while a value like 11/15/2020 is only valid as Month-Day-Year (US parsing).

If you attempt to import date values with the "wrong" format, they will be interpreted as strings, and you will see errors similar to: "Could not convert value '11/15/20' (String) for Timestamp field 'FieldName'."

If you need finer grained control of the parsing pattern, such as to include specific delimiters or other formatting, you can specify additional parsing patterns at the site, project, or folder level.

Data Import Previews

In some contexts, such as creating a list definition and populating it at the same time, or importing samples from a file into Sample Manager or LabKey Biologics, you will see a few lines "previewing" the data you are importing.

These data previews shown in the user interface do not apply field formatting from either the source spreadsheet or the destination data structure.

In particular, when you are importing Date and DateTime fields, they are always previewed in ISO format (yyyy-MM-dd hh:mm) regardless of source or destination formatting. The Excel format setting is used to infer that this is a date column, but is not carried into the previewer or the imported data.

For example, if you are looking at a spreadsheet in Excel, you may see the value with specific date formatting applied, but this is not how Excel actually stores the date value in the cell. Date values are stored as a numeric offset from a built-in start date. To see the underlying value, view the cell in Excel with 'General' formatting applied, though note that 'General'-formatted cells will not be interpreted as date fields by LabKey. LabKey uses any 'Date' formatting to determine that the field is of type 'DateTime', but all date values are then shown in ISO format in the data previewer, e.g. "2022-03-10 00:00".

After import, display format settings on LabKey Server will be used to display the value as you intend. Learn about setting display values for dates in this topic: Date & Number Display Formats

OOXML Documents Not Supported

Excel files saved in the "Strict Open XML Spreadsheet" format will generate an .xlsx file, however this format is not supported by POI, the library LabKey uses for parsing .xlsx files. When you attempt to import this format into LabKey, you will see the error:

There was a problem loading the data file. Unable to open file as an Excel document. "Strict Open XML Spreadsheet" versions of .xlsx files are not supported.

As an alternative, open such files in Excel and save as the ordinary "Excel Workbook" format.

Learn more about this lack of support in this Apache issue.

Related Topics




Field Editor


This topic covers using the Field Editor, the user interface for defining and editing the fields that represent columns in a grid of data. The set of fields, also known as the schema or "domain", describes the shape of data. Each field, or column, has a main field type and can also have various properties and settings to control display and behavior of the data in that column.

The field editor is a common tool used by many different data structures in LabKey. The method for opening the field editor varies by data structure, as can the types of field available. Specific details about fields of each data type are covered in the topic: Field Types and Properties.

Topics:

Open the Field Editor

How you open the field editor depends on the type of data structure you are editing.

Data Structure | Open for Editing Fields
List | Open the list and click Design in the grid header
Dataset | In the grid header, click Manage, then Edit Definition, then the Fields section
Assay Design | Select Manage Assay Design > Edit Assay Design, then click the relevant Fields section
Sample Type | Click the name of the type and then Edit Type
Data Class | Click the name of the class and then Edit
Study Properties | Go to the Manage tab and click Edit Additional Properties
Specimen Properties | Under Specimen Repository Settings, click Edit Specimen/Vial/Specimen Event Fields
User Properties | Go to (Admin) > Site > Site Users and click Change User Properties
Issue Definitions | Viewing the issue list, click Admin.
Query Metadata | Go to (Admin) > Go To Module > Query, select the desired schema and query, then click Edit Metadata

If you are learning to use the Field Editor and do not yet have a data structure to edit, you can get started by defining some additional study properties as shown in the walkthrough of this topic. By default, there are no additional properties predefined in a study.

  • Follow the instructions in this topic to install an example study if you don't already have a study to work with.
  • In your study, click the Manage tab.
  • Click Edit Additional Properties.

Create New Fields

To use the Field Editor to create a new set of fields and their properties, you can follow this example for defining additional study properties.

  • Open your data structure for editing.
  • To get started, decide how you want to define the first fields:
    • Import or infer fields from file: The supported formats are shown below the upload panel.
      • In the case of study properties, shown in our example here, inference is not available, so the link reads "Import fields from file" instead.
    • Manually Define Fields: Click the button to open the manual field editor.

Both options are available when the set of fields is empty. Once you have defined some fields by either method, the manual editor is the only option for adding more fields.

Import or Infer Fields from File

  • When the set of fields is empty, you can import (or infer) new fields from a file.
    • The range of file formats supported depends on the data structure. Some can infer fields from a data file; all support import from a JSON file that contains only the field definitions.
    • In the case of additional properties, shown above, JSON is the only accepted format.
    • When a JSON file is imported, all valid properties for the given fields will be applied. See below.
  • Click to select or drag and drop a file of an accepted format into the panel.
  • The fields inferred from the file will be shown in the manual field editor, where you may fine tune them or save them as is.
    • Note that if your file includes columns for reserved fields, they will not be shown as inferred. Reserved field names vary by data structure and will always be created for you.

Manually Define Fields

  • After clicking the Manually Define Fields button, the panel will change to show the manual field editor.
  • Depending on the type of data structure, and whether you started by importing some fields, you may see a first "blank" field ready to be defined.
  • If not, click Add Field to add one.
  • Give the field a Name. If you enter a field name with a space in it, you will be warned that SQL queries, R scripts, and other code are easiest to write when field names contain only a combination of letters, numbers, and underscores, and start with a letter or underscore.
  • Use the drop down menu to select the Data Type.
    • The data types available vary based on the data structure. Learn which structures support which types here.
    • Each data type can have a different collection of properties you can set.
    • Once you have saved fields, you can only make limited changes to the type.
  • Use the checkbox to require that the field have a value in every row.

Edit Fields

To edit fields, reopen the editor and make the changes you need. If you attempt to navigate away with unsaved changes you will have the opportunity to save or discard them. When you are finished making changes, click Save.

Once you have saved a field or set of fields, you can change the name and most options and other settings. However, you can only make limited changes to the type of a field. Learn about specific type changes in this section.

Rearrange Fields

To change field order, drag and drop the rows using the six-block handle on the left.

Delete Fields

Use the selection checkboxes to select the fields you want to delete. You can use the box at the top of the column to select all fields, and once any are selected you will also see Clear buttons.

Click Delete to delete the selected fields.

You will be asked to confirm the deletion and reminded that all data in a deleted field will be deleted as well. Deleting a field cannot be undone.

Remove a Single Field

To remove one field, click the icon. It is available in both the collapsed and expanded view of each field and will turn red when you hover.

If you have not yet saved the set of fields, i.e. if you are in the process of creating a domain and make a mistake, clicking the icon will immediately delete the field.

If you have already saved the set of fields, you will be asked to confirm the deletion and reminded that all data in a deleted field will be deleted as well. Removing a field cannot be undone.

Click Save when finished.

Edit Field Properties and Options

Each field has a data type and can have additional properties defined. The properties available vary based on the field type. Learn more in this topic: Field Types and Properties

To open the properties for a field, click the expand icon (it will become a handle for closing the panel). For example, the expanded panel for a text field includes the name, linking, and text-specific options described below.

Fields of all types have Name and Linking Options and Conditional Formatting and Validation Options, described below.

Most field types also have a section of Type-specific Field Options. The details of these options, as well as the specific kinds of conditional formatting and validation available for each field type, are covered in the topic: Field Types and Properties.

Name and Linking Options

All types of fields allow you to set the following properties:

  • Description: An optional text description. This will appear in the hover text for the field you define. XML schema name: description.
  • Label: Different text to display in column headers for the field. This label may contain spaces. The default label is the Field Name with camelCasing indicating separate words. For example, the field "firstName" would by default be labelled "First Name". If you wanted to show the user "Given Name" for this field instead, you would add that string in the Label field.
  • Import Aliases: Define alternate field names to be used when importing from a file to this field. This option offers additional flexibility of recognizing an arbitrary number of source data names to map to the same column of data in LabKey. Multiple aliases may be separated by spaces or commas. To define an alias that contains spaces, use double-quotes (") around it. Learn more about import aliases.
  • URL: Use this property to change the display of the field value within a data grid into a link. Multiple formats are supported, which allow ways to easily substitute and link to other locations in LabKey. The ${ } syntax may be used to substitute another field's value into the URL. Learn more about using URL Formatting Options.
  • Ontology Concept: (Premium Feature) In premium editions of LabKey Server, you can specify an ontology concept this field represents. Learn more in this topic: Concept Annotations

Conditional Formatting and Validation Options

All fields offer conditional formatting criteria. String-based fields offer regular expression validation. Number-based fields offer range expression validation. Learn about options supported for each type of field here.

Create Conditional Format Criteria

  • Click Add Format to open the conditional format editor popup.
  • Specify one or two Filter Type and Filter Value pairs.
  • Select Display Options for how to show fields that meet the formatting criteria:
    • Bold
    • Italic
    • Strikethrough
    • Text/Fill Colors: Choose from the picker (or type into the #000000 area) to specify a color to use for either or both the text and fill.
    • You will see a cell of Preview Text for a quick check of how the colors will look.
  • Add an additional format to the same field by clicking Add Formatting. A second panel will be added to the popup.
  • When you are finished defining your conditional formats, click Apply.

Create Regular Expression Validator

  • Click Add Regex to open the popup.
  • Enter the Regular Expression that this field's value will be evaluated against. All regular expressions must be compatible with Java regular expressions as implemented in the Pattern class.
  • Description: Optional description.
  • Error Message: Enter the error message to be shown to the user when the value fails this validation.
  • Check the box for Fail validation when pattern matches field value in order to reverse the validation: with this box unchecked (the default), a value must match the pattern to pass validation; with it checked, a value that matches the pattern will fail validation.
  • Name: Enter a name to identify this validator.
  • You can use Add Regex Validator to add a second condition. The first panel will close and show the validator name you gave. You can reopen that panel using the (pencil) icon.
  • Click Apply when your regex validators for this field are complete.
  • Click Save.

Create Range Expression Validator

  • Click Add Range to open the popup.
  • Enter the First Condition that this field's value will be evaluated against. Select a comparison operator and enter a value.
  • Optionally enter a Second Condition.
  • Description: Optional description.
  • Error Message: Enter the error message to be shown to the user when the value fails this validation.
  • Name: Enter a name to identify this validator.
  • You can use Add Range Validator to add a second condition. The first panel will close and show the validator name you gave. You can reopen that panel using the (pencil) icon.
  • Click Apply when your range validators for this field are complete.
  • Click Save.

Advanced Settings for Fields

Open the editing panel for any field and click Advanced Settings to access even more options:

  • Display Options: Use the checkboxes to control how and in which contexts this field will be available.
    • Show field on default view of the grid
    • Show on update form when updating a single row of data
    • Show on insert form when inserting a single row of data
    • Show on details page for a single row

  • Default Value Options: Automatically supply default values when a user is entering information. Not available for Sample Type fields or for Assay Result fields.
    • Default Type: How the default value for the field is determined. Options:
      • Last entered: (Default) If a default value is provided (see below), it will be entered and editable for the user's first use of the form. During subsequent uploads, the user will see their last entered value.
      • Editable default: An editable default value will be entered for the user. The default value will be the same for every user for every upload.
      • Fixed value: Provides a fixed default value that cannot be edited.
    • Default Value: Click Set Default Values to set default values for all the fields in this section.
      • Default values are scoped to the individual folder. In cases where field definitions are shared, you can manually edit the URL to set default values in the specific project or folder.
      • Note that setting default values will navigate you to a new page. Save any other edits to fields before leaving the field editor.

  • Miscellaneous Options:
    • PHI Level: Use the drop down to set the Protected Health Information (PHI) level of data in this field.
    • Exclude from "Participant Date Shifting" on export/publication: (Date fields only) If the option to shift/randomize participant date information is selected during study folder export or publishing of a study, do not shift the dates in this field. Use caution when protecting PHI to ensure you do not exempt fields you intend to shift.
    • Make this field available as a measure: Check the box for fields that contain data to be used for charting and other analysis. These are typically numeric results. Learn more about using Measures and Dimensions for analysis.
    • Make this field available as a dimension: (Not available for Date fields) Check the box for fields to be used as 'categories' in a chart. Dimensions define logical groupings of measures and are typically non-numerical, but may include numeric type fields. Learn more about using Measures and Dimensions for analysis.
    • Make this field a recommended variable: Check the box to indicate that this is an important variable. These variables will be displayed as recommended when creating new participant reports.
    • Track reason for missing data values: Check this box to enable the field to hold special values to indicate data that has failed review or was originally missing. Administrators can set custom Missing Value indicators at the site and folder levels. Learn more about using Missing Value Indicators.

View Fields in Summary Mode

In the upper right of the field editor, a slider marked Detail mode will be shown when you can open individual field panels to edit details. Click the slider to switch to Summary Mode.

In Summary Mode, you see a grid of fields and properties, and a simplified interface. Scroll for more columns. Instead of having to expand panels to see things like whether there is a URL or formatting associated with a given field, the summary grid makes it easier to scan and search large sets of fields at once.

You can add new fields, delete selected fields, and export fields while in summary mode. Click the slider to switch back to Detail mode to rearrange fields or edit field properties.

Export Sets of Fields (Domains)

Once you have defined a set of fields (domain) that you want to be able to save or reuse, you can export some or all of the fields by clicking (Export). If you have selected a subset of fields, only the selected fields will be exported. If you have not selected fields (or have selected all fields) all fields will be included in the export.

A Fields_*.fields.json file describing your fields as a set of key/value pairs will be downloaded. All properties that can be set for a field in the user interface will be included in the exported file contents. For example, the basic text field named "FirstField" we show above looks like:

[
  {
    "conditionalFormats": [],
    "defaultValueType": "FIXED_EDITABLE",
    "dimension": false,
    "excludeFromShifting": false,
    "hidden": false,
    "lookupContainer": null,
    "lookupQuery": null,
    "lookupSchema": null,
    "measure": false,
    "mvEnabled": false,
    "name": "FirstField",
    "propertyValidators": [],
    "recommendedVariable": false,
    "required": false,
    "scale": 4000,
    "shownInDetailsView": true,
    "shownInInsertView": true,
    "shownInUpdateView": true,
    "isPrimaryKey": false,
    "lockType": "NotLocked"
  }
]

You can save this file to import a matching set of fields elsewhere, or edit it offline to reimport an adjusted set of fields.

For example, you might use this process to create a dataset in which all fields were marked as measures. Instead of having to open each field's advanced properties in the user interface, you could find and replace the "measure" setting in the JSON file and then create the dataset with all fields set as measures.
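Below is a minimal Node.js sketch of that offline edit; the file name is hypothetical, and the "measure" property matches the export format shown above:

// Minimal sketch: mark every field in an exported fields JSON file as a measure.
const fs = require('fs');

const path = 'Fields_Demo.fields.json'; // hypothetical exported file name
const fields = JSON.parse(fs.readFileSync(path, 'utf8'));

// Set "measure": true on every field definition.
const updated = fields.map(function (field) {
    field.measure = true;
    return field;
});

fs.writeFileSync(path, JSON.stringify(updated, null, 2));
console.log('Updated ' + updated.length + ' field definition(s)');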

Note that importing fields from a JSON file is only supported when creating a new set of fields. You cannot apply property settings to existing data with this process.

Related Topics




Field Types and Properties


The set of fields that make up a data structure like a list, dataset, or assay can be edited using the Field Editor interface. You will find instructions about using the field editor, and about the properties, options, and settings common to all field types, in the topic: Field Editor

This topic outlines the field formatting and properties specific to each data type, as well as details about advanced properties for any field.

Field Types Available by Data Structure

Fields come in different types, each intended to hold a different kind of data. Once defined, there are only limited ways you can change a field's type, based on the ability to convert existing data to the new type. To change a field type, you may need to delete and recreate it, reimporting any data.

The following table shows which fields are available in each kind of table/data structure. Notice that Datasets do not support Attachment fields. For a workaround technique, see Linking Data Records to Image Files.

Field Type | Dataset | List | Sample Type | Assay Design | Data Class
Text (String) | Yes | Yes | Yes | Yes | Yes
Text Choice | Yes | Yes | Yes | Yes | Yes
Multi-Line Text | Yes | Yes | Yes | Yes | Yes
Boolean | Yes | Yes | Yes | Yes | Yes
Integer | Yes | Yes | Yes | Yes | Yes
Decimal (Floating Point) | Yes | Yes | Yes | Yes | Yes
DateTime | Yes | Yes | Yes | Yes | Yes
Flag | Yes | Yes | Yes | Yes | Yes
File | Yes | No | Yes | Yes | No
Attachment | No (workaround) | Yes | No | No | Yes
User | Yes | Yes | Yes | Yes | Yes
Subject/Participant (String) | Yes | Yes | Yes | Yes | Yes
Lookup | Yes | Yes | Yes | Yes | Yes
Sample | Yes | Yes | Yes | Yes | Yes
Ontology Lookup | Yes | Yes | Yes | Yes | Yes
Visit Date/Visit ID | No | No | Yes | No | No
Unique ID | No | No | Yes | No | No

The SMILES field type is only available in the Compounds data class when using LabKey Biologics LIMS.

Field Type Changes

Once defined, there are only limited ways you can change a field's data type, based on the ability to safely convert existing data to the new type. This table lists the changes supported (not all types are available in all data structures and contexts):

Current Type | May Be Changed To:
Text (String) | Text Choice, Multi-Line Text, Flag, Lookup, Ontology Lookup, Subject/Participant
Text Choice | Text, Multi-Line Text, Flag, Lookup, Ontology Lookup, Subject/Participant
Multi-Line Text | Text, Text Choice, Flag, Subject/Participant
Boolean | Text, Multi-Line Text
Integer | Decimal, Lookup, Text, Multi-Line Text, Sample, User
Decimal (Floating Point) | Text, Multi-Line Text
DateTime | Text, Multi-Line Text, Visit Date
Flag | Text, Text Choice, Multi-Line Text, Lookup, Ontology Lookup, Subject/Participant
File | Text, Multi-Line Text
Attachment | Text, Multi-Line Text
User | Integer, Decimal, Lookup, Text, Multi-Line Text, Sample
Subject/Participant (String) | Flag, Lookup, Text, Text Choice, Multi-Line Text, Ontology Lookup
Lookup | Text, Multi-Line Text, Integer, Sample, User
Sample | Text, Multi-Line Text, Integer, Decimal, Lookup, User
Ontology Lookup | Text, Text Choice, Multi-Line Text, Flag, Lookup, Subject/Participant
Visit Date | Text, Multi-Line Text, Date Time
Visit ID | Text, Multi-Line Text, Decimal
Unique ID | Flag, Lookup, Multi-Line Text, Ontology Lookup, Subject/Participant, Text, Text Choice

If you cannot make your desired type change using the field editor, you may need to delete and recreate the field entirely, reimporting any data.

Validators Available by Field Type

This table summarizes which formatting and validators are available for each type of field.

Field Type | Conditional Formatting | Regex Validators | Range Validators
Text (String) | Yes | Yes | No
Text Choice | Yes | No | No
Multi-Line Text | Yes | Yes | No
Boolean | Yes | No | No
Integer | Yes | No | Yes
Decimal (Floating Point) | Yes | No | Yes
DateTime | Yes | No | Yes
Flag | Yes | Yes | No
File | Yes | No | No
Attachment | Yes | No | No
User | Yes | No | Yes
Subject/Participant (String) | Yes | Yes | No
Lookup | Yes | No | Yes
Sample | Yes | No | No
Ontology Lookup | Yes | Yes | No
Visit Date/Visit ID | Yes | No | No
Unique ID | Yes | No | No

Type-Specific Properties and Options

The basic properties and validation available for different types of fields are covered in the main field editor topic. Details for specific types of fields are covered here.

Text/Multi-Line Text/Flag Options

Fields of type Text, Multi-Line Text, and Flag have the same set of properties and formats available:

When setting the Maximum Text Length, consider that when you make a field UNLIMITED, it is stored in the database as 'text' instead of the more common 'varchar(N)'. When the length of a string exceeds a certain amount, the text is stored outside the row itself; in this way, you get the illusion of infinite capacity. Because of the differences between 'text' and 'varchar' columns, these are some good general guidelines:
  • UNLIMITED is good if:
    • You need to store large text (like, a paragraph or more)
    • You don’t need to index the column
    • You will not join on the contents of this column
    • Examples: blog comments, wiki pages, etc.
  • No longer than... (i.e. not-UNLIMITED) is good if:
    • You're storing smaller strings
    • You'll want them indexed, i.e. you plan to search on string values
    • You want to do a SQL SELECT or JOIN on the text in this column
    • Examples would be usernames, filenames, etc.

Text Choice Options

A Text Choice field lets you define a set of values that will be presented to the user as a dropdown list. This is similar to a lookup field, but does not require a separate list created outside the field editor. Click Add Values to open a panel where you can add the choices for this field.

Learn more about defining and using Text Choice fields in this topic: Text Choice Fields

Note that because administrators can always see all the values offered for a Text Choice field, these fields are not a good choice for storing PHI or other sensitive information. Consider instead using a lookup to a protected list.

Boolean Options

  • Boolean Field Options: Format for Boolean Values: Use boolean formatting to specify the text to show when a value is true and false. Text can optionally be shown for null values. For example, "Yes;No;Blank" would output "Yes" if the value is true, "No" if false, and "Blank" for a null value.
  • Name and Linking Options
  • Conditional Formatting and Validation Options: Conditional formats are available.

Integer/Decimal Options

Integer and Decimal fields share similar number formatting options, shown below. When considering which type to choose, keep in mind the following behavior:

  • Integer: A 4-byte signed integer that can hold values ranging -2,147,483,648 to +2,147,483,647.
  • Decimal (Floating Point): An 8-byte double precision floating point number that can hold very large and very small values. Values can range approximately 1E-307 to 1E+308 with a precision of at least 15 digits. As with most standard floating point representations, some values cannot be converted exactly and are stored as approximations. It is often helpful to set a display format on Decimal fields that specifies a fixed or maximum number of decimal places, to avoid displaying approximate values.
Both Integer and Decimal columns have these formatting options available:

Date Time Options

File and Attachment Options

File and Attachment fields are only available in specific scenarios, and have display, thumbnail, and storage differences.


  • File
    • The File field type is only available for certain types of table, including datasets, assay designs, and sample types.
    • When a file has been uploaded into this field, it displays a link to the file; for image files, an inline thumbnail is shown.
    • The uploaded file is stored in the file repository, in the assaydata folder in the case of an assay.
    • For Standard Assays, the File field presents special behavior for image files; for details see Linking Assays with Images and Other Files.
    • When a PDF file is stored in a File field, clicking it will download it.
  • Attachment
    • The Attachment field type is similar to File, but only available for lists and data classes (including Sources for samples).
    • This type allows you to attach documents to individual records in a list or data class.
    • For instance, an image could be associated with a given row of data in an attachment field, and would show an inline thumbnail.
    • The attachment is not uploaded into the file repository, but is stored as a BLOB field in the database.
    • When a PDF file is stored in an Attachment field, clicking it will open it in a new browser tab.
    • By default, the maximum attachment size is 50MB, but this can be changed in the Admin Console using the setting Maximum file size, in bytes, to allow in database BLOBs. See Site Settings.
Learn about using File and Attachment fields in Sample Manager and LabKey Biologics in this topic:

Inline Thumbnails for Files and Attachments

When a field of type File or Attachment is an image, such as a .png or .jpg file, the cell in the data grid will display a thumbnail of the image. Hovering reveals a larger version.

When you export a grid containing these inline images to Excel, the thumbnails remain associated with the cell itself.

Bulk Import into the File Field Type

You can bulk import data into the File field type in LabKey Server, provided that the files/images are already uploaded to the File Repository. For example, suppose you already have a set of images in the File Repository.

You can load these images into a File field, if you refer to the images by their full server path in the File Repository. For example, the following shows how an assay upload might refer to these images by their full server path:

ImageName | ImageFile
10001.png | http://localhost:8080/labkey/_webdav/Tutorials/List%20Tutorial/%40files/NIMH/Images/10001.png
10002.png | http://localhost:8080/labkey/_webdav/Tutorials/List%20Tutorial/%40files/NIMH/Images/10002.png
10003.png | http://localhost:8080/labkey/_webdav/Tutorials/List%20Tutorial/%40files/NIMH/Images/10003.png
10004.png | http://localhost:8080/labkey/_webdav/Tutorials/List%20Tutorial/%40files/NIMH/Images/10004.png

On import, the Assay grid will display the image thumbnails.

User Options

Fields of this type point to registered users of the LabKey Server system, found in the table core.Users.

Subject/Participant Options

This field type is only available for Study datasets and for Sample Types that will be linked to study data. The Subject/Participant ID is a concept URI, containing metadata about the field. It is used in assay and study folders to identify the subject ID field. There is no special built-in behavior associated with this type. It is treated as a string field, without the formatting options available for text fields.

Lookup Options

You can populate a field with data via lookup into another table. This is similar to the text choice field, but offers additional flexibility and longer lists of options.

Open the details panel and select the folder, schema, and table where the data values will be found. Users adding data for this field will see a dropdown populated with that list of values. Typing ahead will scroll the list to the matching value. When the number of available values exceeds 10,000, the field will be shown as a text entry field.

Use the checkbox to control whether the user entered value will need to match an existing value in the lookup target.

  • Lookup Definition Options:
    • Select the Target Folder, Schema, and Table from which to look up the value. Once selected, the value will appear in the top row of the field description as a direct link to the looked-up table.
    • Lookup Validator: Ensure Value Exists in Lookup Target. Check the box to require that any value is present in the lookup's target table or query.
  • Name and Linking Options
  • Conditional Formatting Options: Conditional formats are available.
A lookup operates as a foreign key (<fk>) in the XML schema generated for the data structure. An example of the XML generated:
<fk>
  <fkDbSchema>lists</fkDbSchema>
  <fkTable>Languages</fkTable>
  <fkColumnName>LanguageId</fkColumnName>
</fk>

Note that lookups into lists with auto-incrementing keys may not export/import properly because the rowIds are likely to be different in every database.
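Once defined, a lookup column can be traversed with slash syntax in grids, queries, and API calls. A minimal sketch, assuming a hypothetical list whose LanguageId field is the lookup shown in the XML above, and a hypothetical LanguageName column in the Languages list:

// Minimal sketch: traverse a lookup column using slash syntax to pull
// a column from the looked-up Languages table.
LABKEY.Query.selectRows({
    schemaName: 'lists',
    queryName: 'Demographics', // hypothetical list containing the LanguageId lookup
    columns: 'Name, LanguageId/LanguageName',
    success: function (data) {
        console.log(data.rows);
    },
    failure: function (error) {
        console.error(error.exception);
    }
});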

Learn more about Lookup fields in this topic:

Sample Options

  • Sample Options: Select where to look up samples for this field.
    • Note that this lookup will only be able to reference samples in the current container.
    • You can choose All Samples to reference any sample in the container, or select a specific sample type to filter by.
    • This selection will be used to validate and link incoming data, populate lists for data entry, etc.
  • Name and Linking Options
  • Conditional Formatting Options: Conditional formats are available.

Ontology Lookup Options (Premium Feature)

When the Ontology module is loaded, the Ontology Lookup field type connects user input with preferred vocabulary lookup into loaded ontologies.

Visit Date and Visit ID Options

Integration of Sample Types with Study data is supported using the Visit Date and Visit ID field types, which provide time alignment, in conjunction with a Subject/Participant field providing participant alignment. Learn more in this topic: Link Sample Data to Study

Unique ID Options (Premium Feature)

A field of type "Unique ID" is read-only and used to house barcode values generated by LabKey for Samples. Learn more in this topic: Barcode Fields

Related Topics




Text Choice Fields


A Text Choice field lets you define a set of values that will be presented to the user as a dropdown list. This is similar to a lookup field, but does not require a separate list created outside the field editor. Using such controlled vocabularies can both simplify user entry and ensure more consistent data.

This topic covers the specifics of using the Text Choice Options for the field. Details about the linking and conditional formatting of a text choice field are provided in the shared field property documentation.

Create and Populate a Text Choice Field

Open the field editor for your data structure, and locate or add a field of type Text Choice.

Click Add Values to open a panel where you can add the choices for this field. Choices may be multi-word.

Enter each value on a new line and click Apply.

Manage Text Choices

Once a text choice field contains a set of values, the field summary panel will show at least the first few values for quick reference.

Expand the field details to see and edit this list.

  • Add Values lets you add more options.
  • Delete a value by selecting, then clicking Delete.
  • Select a value in the text choice options to edit it. For example, you might change the spelling or phrasing without needing to edit existing rows.
  • Click Apply to save your change to this value. You will see a message indicating updates to existing rows will be made.
  • Click Save for the data structure when finished editing text choices.

In-Use and Locked Text Choices

Any values that are already in use in the data will be marked with an icon indicating that they cannot be deleted; the text of the value may still be edited.

Values that are in use in any read-only data (i.e. assay data that cannot be edited) will be marked with an icon indicating that they can be neither deleted nor edited.

PHI Considerations

When a text choice field is marked as PHI, the usual export and publish study options will be applied to data in the field. However, users who are able to access the field editor will be able to see all of the text choices available, though without any row associations. If you are also using the compliance features letting you hide PHI from unauthorized users, this could create unintended visibility of values that should be obscured.

For more complete protection of protected health information, use a lookup to a protected list instead.

Use a Text Choice Field

When a user inserts into or edits the value of the Text Choice field, they can select one of the choices, or leave it blank if it is not marked as a required field. Typing ahead will narrow longer lists of choices. Entering free text is not supported. New choices can only be added within the field definition.

Change Between Text and Text Choice Field Types

Fields of the Text and Text Choice types can be switched to the other type without loss of data. Such changes have a few behaviors of note:

  • If you change a Text Choice field to Text, you will lose any values on the list of options for the field that are not in use.
  • If you change a Text field to Text Choice, all distinct values in that column will be added to the values list for the field. This option creates a handy shortcut for when you are creating new data structures including text choice fields.

Importing Text Choices: A Shortcut

If you have an existing spreadsheet of data (such as for a list or dataset) and want a given field to become a text choice field, you have two options.

  • The more difficult option is to create the data structure with the field of type Text Choice, copy and paste all the distinct values from the dataset into the drop-down options for the field, and then import your data.
  • An easier option is to first create the structure and import the data selecting Text as the type for the field you want to be a dropdown. Once imported, edit the data structure to change the type of that field to Text Choice. All distinct values from the column will be added to the list of value options for you.

Related Topics




URL Field Property


Setting the URL property of a field in a data grid turns the display value into a link to other content. The URL property setting is the target address of the link. The URL property supports different options for defining relative or absolute links, and also supports using values from other fields in constructing the URL.

Link Format Types

Several link format types for URL property are supported: local, relative, and full path links on the server as well as external links. Learn more about the context, controller, action, and path elements of LabKey URLs in this topic: LabKey URLs.

Local Links

To link to content in the current LabKey folder, use the controller and action name, but no path information.

<controller>-<action>

For example, to view a page in a local wiki:

wiki-page.view?name=${PageName}

A key advantage of this format is that the list or query containing the URL property can be moved or copied to another folder along with the target wiki page (in this case), and the link will continue to work correctly.

You can optionally prepend a / (slash) or ./ (dot-slash), but they are not necessary. You could also choose to format with the controller and action separated by a / (slash). The following format is equivalent to the above:

./wiki/page.view?name=${PageName}

Relative Links

To point to resources in subfolders of the current folder, prepend the local link format with path information, using the . (dot) for the current location:

./<subfoldername>/<controller>-<action>

For example, to link to a page in a subfolder, use:

./${SubFolder}/wiki-page.view?name=${PageName}

You can also use .. (two dots) to link to the parent folder, enabling path syntax like "../siblingfolder/resource" to refer to a sibling folder. Use caution when creating complex relative URL paths, as they make references harder to follow if resources are later moved. Consider a full path to be clearer when linking to a distant location.
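
For example, a sketch of a link to a wiki page in a sibling folder (folder name hypothetical):

../siblingfolder/wiki-page.view?name=${PageName}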

Full Path Links on the Same Server

A full path link points to a resource on the current LabKey Server, useful for:

  • linking common resources in shared team folders
  • when the URL is a WebDAV link to a file that has been uploaded to the current server
The local path begins with a / (forward slash).

For example, if a page is in a subfolder of "myresources" under the home folder, the path might be:

/home/myresources/subfolder/wiki-page.view?name=${Name}

External Links

To link to a resource on an external server or any website, include the full URL. If the external location is another LabKey Server, the same path, controller, and action conventions apply:

http://server/path/page.html?id=${Param}

Substitution Syntax

If you would like to have a data grid contain a link including an element from elsewhere in the row, the ${ } substitution syntax may be used. This syntax inserts a named field's value into the URL. For example, in a set of experimental data where one column contains a Gene Symbol, a researcher might wish to quickly compare her results with the information in The Gene Ontology. Generating a URL for the GO website with the Gene Symbol as a parameter will give the researcher an efficient way to "click through" from any row to the correct gene.

An example URL (in this case for the BRCA gene) might look like:

http://amigo.geneontology.org/amigo/search/ontology?q=brca

Since the q parameter value is the only part of the URL that changes from row to row, the researcher can set the URL property on the GeneSymbol field to use a substitution marker like this:

http://amigo.geneontology.org/amigo/search/ontology?q=${GeneSymbol}

Once defined, the researcher can simply click "BRCA" in that column to open the URL with the q parameter applied.

Multiple such substitution markers can be used in a single URL property string, and the field referenced by the markers can be any field within the query.
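
For example, a single URL property could combine two markers from the same row (server and field names hypothetical):

http://server/search?gene=${GeneSymbol}&species=${Organism}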

Substitutions are allowed in any part of the URL, either in the main path or in the query string. For example, here are two different formats for creating links to an article in Wikipedia, using a "CompanyName" field value (URLs illustrative):

  • as part of the path: http://en.wikipedia.org/wiki/${CompanyName}
  • as a parameter value: http://en.wikipedia.org/w/index.php?search=${CompanyName}

Built-in Substitution Markers

The following substitution markers are built in and available for any query/dataset. They help you determine the context of the current query.

Marker | Description | Example Value
${schemaName} | The schema where the current query lives. | study
${schemaPath} | The schema path of the current query. | assay.General.MyAssayDesign
${queryName} | The name of the current query. | Physical Exam
${dataRegionName} | The data region for the current query. | Dataset
${containerPath} | The LabKey Server folder path, starting with the project. | /home/myfolderpath
${contextPath} | The Tomcat context path. | /labkey
${selectionKey} | Unique string used by selection APIs as a key when storing or retrieving the selected items for a grid. | $study$Physical Exam$$Dataset

Link Display Text

The display text of the link created from a URL property is simply the value of the current record in the field where the URL property is defined. In the Gene Ontology example, since the URL property is defined on the Gene_Symbol field, the gene symbol serves as both the text of the link and the value of the search parameter in the link address. In many cases you may want constant display text for the link on every row. This text could indicate where the link goes, which is especially useful if you want multiple such links on each row.

In the example above, suppose the researcher wants to be able to look up the gene symbol in both Gene Ontology and EntrezGene. Rather than defining the URL Property on the Gene_Symbol field itself, it would be easier to understand if two new fields were added to the query, with the value in the fields being the same for every record, namely "[GO]" and "[Entrez]". Then set the URL property on these two new fields to

for the GOlink field:

http://amigo.geneontology.org/cgi-bin/amigo/search.cgi?search_query=${Gene_Symbol}&action=new-search

for the Entrezlink field:

http://www.ncbi.nlm.nih.gov/gene/?term=${Gene_Symbol}

The resulting query grid will look like:

Note that if the two new columns are added to the underlying list, dataset, or schema table directly, the link text values would need to be entered for every existing record. Changing the link text would also be tedious. A better approach is to wrap the list in a query that adds the two fields as constant expressions. For this example, the query might look like:

SELECT TestResults.SampleID,
       TestResults.TestRun,
       TestResults.Gene_Symbol,
       TestResults.ResultValueN,
       '[GO]' AS GOlink,
       '[Entrez]' AS Entrezlink
FROM TestResults

Then in the Edit Metadata page of the Schema Browser, set the URL properties on these query expression fields:
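
A sketch of what that metadata XML might look like (query name illustrative; the <url> element supplies the URL property for each column):

<tables xmlns="http://labkey.org/data/xml">
  <table tableName="TestResultsWithLinks" tableDbType="NOT_IN_DB">
    <columns>
      <column columnName="GOlink">
        <url>http://amigo.geneontology.org/cgi-bin/amigo/search.cgi?search_query=${Gene_Symbol}&amp;action=new-search</url>
      </column>
      <column columnName="Entrezlink">
        <url>http://www.ncbi.nlm.nih.gov/gene/?term=${Gene_Symbol}</url>
      </column>
    </columns>
  </table>
</tables>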

URL Encoding Options

You can specify the type of URL encoding for a substitution marker, in case the default behavior doesn't produce the URL you need. This flexibility makes it possible to have one column display the text while a second column contains the entire href value, or only a part of it. The fields referenced by the ${ } substitution markers might contain any sort of text, including special characters such as question marks, equal signs, and ampersands. If these values were copied straight into the link address, the resulting address could be interpreted incorrectly. To avoid this problem, LabKey Server encodes text values before copying them into the URL: characters such as ? are replaced by their character code, %3F. By default, LabKey encodes all special character values except '/' from substitution markers. If a field referenced by a substitution marker needs no encoding (perhaps because it has already been encoded) or needs different encoding rules, you can specify encoding options inside the ${ } syntax, as described in the topic String Expression Format Functions.
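
For example, to force full component encoding of a field value that might contain ampersands or other special characters, a sketch building on the earlier example:

http://amigo.geneontology.org/amigo/search/ontology?q=${GeneSymbol:encodeURIComponent}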

Links Without the URL Property

If the data field value contains an entire URL starting with an address type designator (http:, https:, etc.), then the field value is displayed as a link, with the entire value serving as both the address and the display text. This special case can be useful for queries where the query author creates the URL as an expression column. There is no control over the display text when creating URLs this way.

Linking To Other Tables

To link two tables, so that records in one table link to filtered views of the other, start with a filtered grid view of the target table, filtering on the target fields of interest. For example, the following URL filters on the fields "WellLocation" and "WellType":

/home/demo%20study/study-dataset.view?datasetId=5018&Dataset.WellLocation~eq=AA&Dataset.WellType~eq=XX

Parameterize by adding substitution markers within the filter. For example, assume that source and target tables have identical field names, "WellLocation" and "WellType":

/home/demo%20study/study-dataset.view?datasetId=5018&Dataset.WellLocation~eq=${WellLocation}&Dataset.WellType~eq=${WellType}

Finally, set the parameterized URL as the URL property of the appropriate column in the source table.

Related Topics

For an example of UI usage, see: Step 3: Add a URL Property and view an interactive example by clicking here. Hover over a link in the Department column to see the URL, click to view a list filtered to display the "technicians" in that department.

For examples of SQL metadata XML usage, see: JavaScript API Demo Summary Report and the JavaScript API Tutorial.




Conditional Formats


Conditional formats change how data is displayed depending on the value of the data. For example, if temperature goes above a certain value, you can show those values in red (or bold, italic, etc.); values below a certain level could be shown in blue.

Conditional formats are defined as properties of fields using the Field Editor.

Specify a Conditional Format

To specify a conditional format, open the field editor, and click to expand the field of interest. Under Create Conditional Format Criteria, click Add Format.

In the popup, identify the condition(s) under which you want the conditional format applied. Specifying a condition is similar to specifying a filter. You need to include a First Condition. If you specify a second one, both will be AND-ed together to determine whether a single conditional format is displayed.

Only the value that is being formatted is available for the checks. That is, you cannot use the value in column A to apply a conditional format to column B.

Next, you can specify Display Options, meaning how the field should be formatted when that condition is met.

Display options are:

  • Bold
  • Italic
  • Strikethrough
  • Colors: use dropdowns to select text (foreground) and fill (background) colors. Click a block to choose it, or type to enter a hex value or RGB values. You'll see a box of preview text on the right.

Click Apply to close the popup, then Save to save the data structure. For a dataset, click View Data to see the dataset with your formatting applied.

Multiple Conditional Formats

Multiple conditional formats are supported in a single column. Before applying the format, you can click Add Formatting to add another. Once you have saved an active format, use Edit Formats to reopen the popup and click Add Formatting to specify another conditional format. This additional condition can have a different type of display applied.

Each format you define will be in a panel within the popup and can be edited separately.

If a data cell fulfills multiple conditions, then the first condition satisfied is applied, and conditions lower on the list are ignored.

For example, suppose you have specified two conditional formats on one field:

  • If the value is 40 degrees or greater, then display in bold text.
  • If the value is 38 degrees or greater, then display in italic text.
Although the value 40 fulfills both conditions, only the first condition in the list is applied, resulting in bold display. A sorted list of values would be displayed as shown below:

41
40
39
38
37

Specify Conditional Formats as Metadata XML

Conditional formats can be specified (1) as part of a table definition and/or (2) as part of a table's metadata XML. When conditional formats are specified in both places, the metadata XML takes precedence over the table definition.

You can edit conditional formats as metadata XML source. It is important that the <filters> element is placed first within the <conditionalFormat> element. Modifiers to apply to that filter result, such as bold, italics, and colors must follow the <filters> element.

In the metadata editor, click Edit Source. The sample below shows XML that specifies that values greater than 37 in the Temp_C column should be displayed in bold.

<tables xmlns="http://labkey.org/data/xml">
  <table tableName="Physical Exam" tableDbType="NOT_IN_DB">
    <columns>
      <column columnName="Temp_C">
        <conditionalFormats>
          <conditionalFormat>
            <filters>
              <filter operator="gt" value="37"/>
            </filters>
            <bold>true</bold>
          </conditionalFormat>
        </conditionalFormats>
      </column>
    </columns>
  </table>
</tables>
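
If you need two conditions AND-ed together, as described above, a sketch might place two <filter> elements inside <filters> (thresholds illustrative):

<conditionalFormat>
  <filters>
    <filter operator="gte" value="36"/>
    <filter operator="lte" value="37"/>
  </filters>
  <italic>true</italic>
</conditionalFormat>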

Example: Conditional Formats for Human Body Temperature

In the following example, values out of the normal human body temperature range are highlighted with color if too high and shown in italics if too low. In this example, we use the Physical Exam dataset that is included with the importable example study.

  • In a grid view of the Physical Exam dataset, click Manage.
  • Click Edit Definition.
  • Select a field (in this case Temp_C), expand it, and click Add Format under "Create Conditional Format Criteria".
    • For First Condition, choose "Is Greater Than", enter 37.8.
    • From the Text Color drop down, choose red text.
    • This format option is shown above.
  • Click Add Formatting in the popup before clicking Apply.
  • For First Condition, choose "Is Less Than", enter 36.1.
  • Check the box for Italics.
  • Click Apply, then Save.
  • Click View Data to return to the data grid.

Now temperature values above 37.8 degrees are in red and those below 36.1 are displayed in italics.

When you hover over a formatted value, a pop up dialog will appear explaining the rule behind the format.

Related Topics




String Expression Format Functions


Reference

The following string formatters can be used when building URLs, or creating naming patterns for samples, sources and members of other DataClasses.

Name | Synonym | Input Type | Description | Example

General
defaultValue(string) |  | any | Use the string argument value as the replacement value if the token is not present or is the empty string. | ${field:defaultValue('missing')}
passThrough | none | any | Don't perform any formatting. | ${field:passThrough}

URL Encoding
encodeURI | uri | string | URL encode all special characters except ',/?:@&=+$#', like JavaScript encodeURI(). | ${field:encodeURI}
encodeURIComponent | uricomponent | string | URL encode all special characters, like JavaScript encodeURIComponent(). | ${field:encodeURIComponent}
htmlEncode | html | string | HTML encode. | ${field:htmlEncode}
jsString |  | string | Escape carriage return, linefeed, and <>"' characters and surround with single quotes. | ${field:jsString}
urlEncode | path | string | URL encode each path part, preserving the path separator. | ${field:urlEncode}

String
join(string) |  | collection | Combine a collection of values together, separated by the string argument. | ${field:join('/'):encodeURI}
prefix(string) |  | string, collection | Prepend the string argument if the value is non-null and non-empty. | ${field:prefix('-')}
suffix(string) |  | string, collection | Append the string argument if the value is non-null and non-empty. | ${field:suffix('-')}
trim |  | string | Remove any leading or trailing whitespace. | ${field:trim}

Date
date(string) |  | date | Format a date using a format string or one of the constants from Java's DateTimeFormatter. If no format value is provided, the default format is 'BASIC_ISO_DATE'. | ${field:date}, ${field:date('yyyy-MM-dd')}

Number
number(format) |  | number | Format a number using Java's DecimalFormat. | ${field:number('0000')}

Array
first |  | collection | Take the first value from a collection. | ${field:first:defaultValue('X')}
rest |  | collection | Drop the first item from a collection. | ${field:rest:join('_')}
last |  | collection | Drop all items from the collection except the last. | ${field:last:suffix('!')}

Examples

Function | Applied to... | Result
${Column1:defaultValue('MissingValue')} | null | MissingValue
${Array1:join('/')} | [apple, orange, pear] | apple/orange/pear
${Array1:first} | [apple, orange, pear] | apple
${Array1:first:defaultValue('X')} | [(null), orange, pear] | X



Date & Number Display Formats


LabKey Server provides flexible formatting for dates, times, and numbers, so you can control how they are displayed to users. Using formatting you can:
  • Specify how dates are displayed, for example:
    • 04/05/2016
    • May 4 2016
    • Wednesday May 4
  • Specify how times are displayed, for example
    • 01:23pm
    • 1:23pm Pacific Standard Time
    • 1:23:52.127
  • Specify how numbers are displayed, for example
    • 1.1
    • 1.10
  • Determine granularity of the date/time display, for example:
    • June 4 2016
    • June 4 2016 1pm
    • June 4 2016 1:20pm
  • Set up formats that apply to the entire site, an entire project, or a folder.
  • Override more generally prescribed formats in a particular context, for example, specify that a particular field or folder should follow a different format than the parent container.
Note that date formatting described in this topic is different from date parsing. Formatting determines how date and time data are displayed by the server. Parsing determines how the server interprets date strings.

You can customize how dates, times, and numbers are displayed on a field-by-field basis, or specify default formats at the folder, project, or site level. The server decides which format to use for a particular field by looking first at the properties for that field. If no format property is found at the field level, it checks the container tree, starting with the folder and continuing up the folder hierarchy to the site level. In detail, the decision process goes as follows:

  • The server checks to see if there is a field-level format set on the field itself. If it finds a field-specific format, it uses that format. If no format is found, it looks to the folder-level format. (To set a field-specific format, see Set Formats on a Per-Field Basis.)
  • If a folder-level format is found, it uses that format. If no folder-level format is found, it looks in the parent folder, then that parent's parent folder, etc. until the project level is reached and it looks there. (To set a folder-level default format, see Set Folder Display Formats)
  • If a project-level format is found, it uses that format. If no project-level format is found, it looks to the site-level format. (To set a project-level default format, see Set Project Display Formats.)
  • To set the site-level format, see Set Formats Globally (Site-Level). Note that if no site-level format is specified, the server will default to these formats:
    • Date field: yyyy-MM-dd
    • Date-time field: yyyy-MM-dd HH:mm
When LabKey Server is first installed, it uses these initial formatting patterns:
  • Date fields: Year-Month-Date, which is the standard ISO date format, for example 2010-01-31. The Java string format is yyyy-MM-dd
  • Date-time fields: Year-Month-Date Hour:Minute, for example 2010-01-31 9:30. The Java string format is yyyy-MM-dd HH:mm
  • For date-time fields where displaying a time value would be superfluous, the system overrides the site-wide initial format (Year-Month-Date Hour:Minute) and instead uses a date-only format (Year-Month-Date). Examples include visit dates in study datasets, and BirthDate in various patient schemas. To change the format behavior on these fields, override the query metadata -- see Query Metadata.
A standard Java format string specifies how dates, times, and numbers are displayed. For example, the format string

yyyy-MM-dd

specifies dates to be displayed as follows

2000-01-01

For details on format strings, see Date and Number Formats Reference.

Set Formats Globally (Site-Level)

An admin can set formats at the site level by managing look and feel settings.

  • Select (Admin) > Site > Admin Console.
  • Under Configuration, click Look and Feel Settings.
  • Scroll down to Customize date and number formats.
  • Enter format strings as desired for date, date-time, or number fields.
  • Click Save.

Set Project Display Formats

An admin can standardize display formats at the project level so they display consistently in the intended scope, which does not need to be consistent with site-wide settings.

  • Navigate to the target project.
  • Select (Admin) > Folder > Project Settings.
  • On the Properties tab, scroll down to Customize date and number formats.
  • Enter format strings as desired for date, date-time, or number fields.
  • Click Save.

Set Folder Display Formats

An admin can standardize display formats at the folder level so they display consistently in the intended scope, which does not need to be consistent with either project or site settings.

  • Navigate to the target folder.
  • Select (Admin) > Folder > Management.
  • Click the Formats tab.
  • Enter format strings as desired for date, date-time, or number fields.
  • Click Save.

Set Formats on a Per-Field Basis

To do this, you edit the properties of the field.

  • Open the data fields for editing; the method depends on the type. See: Field Editor.
  • Use the expand icon to open the field you want to edit.
  • Number and Date types include a Format for [Dates/Numbers] box.
  • Enter the desired format string directly, or use the shortcuts described below. Shown here, the shortcut code "Date".
  • Click Save.

Date Format Shortcuts

At the field-level, instead of providing a specific format string, you can use one of the following shortcut values to specify a standard format. A shortcut value tells the server to use the current folder's format setting (a format which may be inherited from the project or site setting).

Format Shortcut String | Description
Date | Use the folder-level format setting, specified at (Admin) > Folder > Management > Formats tab > Default display format for dates.
DateTime | Use the folder-level format setting, specified at (Admin) > Folder > Management > Formats tab > Default display format for date-times.
Time | Currently hard-coded to "HH:mm:ss"

Related Topics




Date and Number Formats Reference


The following reference accompanies the topic Date & Number Display Formats.

Date and DateTime Format Strings

Format strings used to describe dates and date-times on the LabKey platform must be compatible with the format accepted by the Java class SimpleDateFormat. For more information see the Java documentation. The following table has a partial guide to pattern symbols.

Letter | Date/Time Component | Examples
G | Era designator | AD
y | Year | 1996; 96
M | Month in year | July; Jul; 07
w | Week in year | 27
W | Week in month | 2
D | Day in year | 189
d | Day in month | 10
F | Day of week in month | 2
E | Day in week | Tuesday; Tue
a | Am/pm marker | PM
H | Hour in day (0-23) | 0
k | Hour in day (1-24) | 24
K | Hour in am/pm (0-11) | 0
h | Hour in am/pm (1-12) | 12
m | Minute in hour | 30
s | Second in minute | 33
S | Millisecond | 978
z | Time Zone | Pacific Standard Time; PST; GMT-08:00
Z | Time Zone | -0800
X | Time Zone | -08; -0800; -08:00

To control whether an internationally ambiguous date string such as 04/06/2014 should be interpreted as Day-Month-Year or Month-Day-Year, an admin can set the date parsing format at the site level.

Note that the LabKey date parser does not recognize time-only date strings. This means that you need to enter a full date string even when you wish to display time only. For example, you might enter a value of "2/2/09 4:00 PM" in order to display "04 PM" when using the format string "hh aa".

Number Format Strings

Format strings for Number (Double) fields must be compatible with the format that the java class DecimalFormat accepts. A valid DecimalFormat is a pattern specifying a prefix, numeric part, and suffix. For more information see the Java documentation. The following table has an abbreviated guide to pattern symbols:

Symbol | Location | Localized? | Meaning
0 | Number | Yes | Digit
# | Number | Yes | Digit, zero shows as absent
. | Number | Yes | Decimal separator or monetary decimal separator
- | Number | Yes | Minus sign
, | Number | Yes | Grouping separator
E | Number | Yes | Exponent for scientific notation, for example: 0.#E0

Format Shortcuts

At the field level, instead of providing a specific format string, you can use a shortcut value for commonly used formats. For details, see Date & Number Display Formats

Examples

The first group of examples below shows date-time format strings applied to the value 2008-05-17 01:45:55.127; the remaining examples show number format strings applied to the value 85.0.

Format String | Display Result
yyyy-MM-dd HH:mm | 2008-05-17 01:45
yyyy-MM-dd HH:mmaa | 2008-05-17 01:45PM
yyyy-MM-dd HH:mm:ss.SSS | 2008-05-17 01:45:55.127
MMMM dd yyyy | May 17 2008
hh:mmaa zzzz | 01:45PM Pacific Daylight Time
<no string> | 85.0
0 | 85
0000 | 0085
.00 | 85.00
000.000 | 085.000
000,000 | 085,000
-000,000 | -085,000
0.#E0 | 8.5E1
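
As a quick check of these patterns, here is a minimal Java sketch (class name hypothetical) using the SimpleDateFormat and DecimalFormat classes referenced below:

import java.text.DecimalFormat;
import java.text.SimpleDateFormat;
import java.util.Date;

public class FormatDemo {
    public static void main(String[] args) throws Exception {
        // Parse a fixed timestamp so the output matches the table above.
        Date d = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss.SSS").parse("2008-05-17 01:45:55.127");
        System.out.println(new SimpleDateFormat("yyyy-MM-dd HH:mm").format(d)); // 2008-05-17 01:45
        System.out.println(new SimpleDateFormat("MMMM dd yyyy").format(d));     // May 17 2008
        System.out.println(new DecimalFormat("000.000").format(85.0));          // 085.000
        System.out.println(new DecimalFormat("0.#E0").format(85.0));            // 8.5E1
    }
}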

Java Reference Documents

Dates: https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/text/SimpleDateFormat.html

Numbers: https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/text/DecimalFormat.html

Related Topics




Lookup Columns


By combining data from multiple tables in one grid view, you can create integrated grids and visualizations with no duplication of data. Lookup Columns give you the ability to link data from two different tables by reference. Another name for this type of connection is a "foreign key".

Once a lookup column pulls in data from another table, you can then display values from any column in that target (or "looked up") table. You can also take advantage of a lookup to simplify user data entry by constraining entered values for a column to a fixed set of values in another list.

Set Up a Lookup Field

Suppose you want to display values from the Languages list, such as the translator information, alongside other data from the Demographics dataset. You would add a lookup column to the Demographics dataset that used values from the Languages list.

To join these tables, an administrator adds a lookup column to the Demographics dataset definition using this interface:

  • Go to the dataset or list where you want to show the data from the other source. Here, Demographics.
  • Click Manage, then Edit Definition.
  • Click Fields.
  • Use the expand icon to open the field you want to be a lookup (here "language").
    • If the field you want doesn't already exist, click Add Field to add it.
  • From the dropdown under Data Type, select Lookup.
  • In the Lookup Definition Options, select the Target Folder, Schema, and Table to use. For example, the lists schema and the Languages table, as shown below.
  • Scroll down and click Save.
  • In either the expanded or collapsed view, you can click the name of the target of the lookup to open it directly.
  • The lookup column is now available for grid views and SQL queries on the Demographics table.

This can also be accomplished using a SQL query, as described in Lookups: SQL Syntax.
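
For reference, a minimal LabKey SQL sketch of the same join, assuming the Languages list has LanguageName and TranslatorName columns (names illustrative); the lookup is traversed with dot notation:

SELECT Demographics.ParticipantId,
       Demographics.Language.LanguageName,
       Demographics.Language.TranslatorName
FROM Demographics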

Create a Joined Grid View

Once you have connected tables with a lookup, you can create joined grids merging information from any of the columns of both tables. For example, a grid showing which translators are needed for each cohort would make it possible to schedule their time efficiently. Note that the original "Language" column itself need not be shown.

  • Go to the "Demographics" data grid.
  • Select (Grid Views) > Customize Grid.
  • Click the expand icon to show the fields the lookup column contains.
  • Select the fields you want to display using checkboxes.
  • Save the grid view.

Learn more about customizing grid views in this topic:

Default Display Field

For lookup fields where the target table has an integer primary key, the server will use the first text field it encounters as the default display column. For example, suppose the Language field is an integer lookup to the Languages table, as below.

In this case, the server uses Language Name as the default display field because it is the first text field it finds in the looked up table. You can see this in the details of the lookup column shown in the example above. The names "English", etc, are displayed though the lookup is to an integer key.

Display Alternate Fields

To display other fields from the looked up table, go to (Grid Views) > Customize View, expand the lookup column, and select the fields you want to display.

You can also use query metadata to achieve the same result: see Query Metadata: Examples

Validating Lookups: Enforcing Lookup Values on Import

When you are importing data into a table that includes a lookup column, you can have the system enforce the lookup values, i.e. any imported values must appear in the target table. An error will be displayed whenever you attempt to import a value that is not in the lookup's target table.

To set up enforcement:

  • Go to the field editor of the dataset or list with the lookup column.
  • Select the lookup column in the Fields section.
  • Expand it by clicking the expand icon.
  • Under Lookup Validator, check the box for Ensure Value Exists in Lookup Target.
  • Click Save.

Note that pre-existing data is not retroactively validated by turning on the lookup validator. To ensure pre-existing data conforms to the values in the lookup target table, either review entries by hand or re-import to confirm values.

Suppress Linking

By default the display value in a lookup field links to the target table. To suppress the links, and instead display the value/data as bare text, edit the XML metadata on the target table. For example, if you are looking up to MyList, add <tableUrl></tableUrl> to its metadata, as follows. This will suppress linking on any table that has a lookup into MyList.

<tables xmlns="http://labkey.org/data/xml">
  <table tableName="MyList" tableDbType="NOT_IN_DB">
    <tableUrl></tableUrl>
    <columns>
    </columns>
  </table>
</tables>

A related option is to use a SQL annotation directly in the query to not "follow" the lookup and display the column as if there were no foreign key defined on it. Depending on the display value of the lookup field, you may see a different value than if you used the above suppression of the link. Learn more here: Query Metadata.

Related Topics




Protecting PHI Data


In many research applications, Protected Health Information (PHI) is collected and available to authorized users, but must not be shared with unauthorized users. This topic covers how to mark columns (fields) as different levels of PHI and how you can use these markers to control access to information.

Administrators can mark columns as Restricted PHI, Full PHI, Limited PHI, or Not PHI. Simply marking fields with a particular PHI level does not restrict access to them. When a folder is exported or a study is published, data at or above a given PHI level can be excluded, creating a version of the data without the marked PHI.

Mark Column at PHI Level

There are four levels of PHI setting available:

  • Restricted PHI: Most protected
  • Full PHI
  • Limited PHI: Least protected
  • Not PHI (Default): Not protected

To mark a field:

  • Open the Field Editor for the data structure you are marking.
  • Click the Fields section if it is not already open.
  • Use the expand icon to open the field you want to mark.
  • Click Advanced Settings.
  • From the PHI Level dropdown, select the level at which to mark the field. Shown here, the "gender" field is being marked as "Limited PHI".
  • Click Apply.
  • Continue to mark other fields in the same data structure as needed.
  • Click Save when finished.

Now that your fields are marked, you can use this information to control export and publication of studies and folders.

Export without PHI

When you export a folder or study, you can select the level(s) of PHI to include. By default, all columns are included, so to exclude any columns, you must make a selection as follows:

  • Select (Admin) > Folder > Management.
  • Click the Export tab.
  • Select the objects you want to export.
  • Under Options, choose which levels of PHI you want to include.
  • Uncheck the Include PHI Columns box to exclude all columns marked as PHI.
  • Click Export.

The exported archive can now be shared with users who can have access to the selected level of PHI.

Publish Study without PHI

When you publish a study, you create a copy in a new folder, using a wizard to select the desired study components. On the Publish Options panel, you can select the PHI you want to include. The default is to publish with all columns.

  • In a study, click the Manage tab.
  • Click Publish Study.
  • Complete the study wizard selecting the desired components.
  • On the Publish Options panel, under Include PHI Columns, select the desired levels to include.
  • Click Finish.

The published study folder will only contain the columns at the level(s) you included.

Issues List and Limited PHI

An issue tracking list ordinarily shows all fields to all users. If you would like to have certain fields only available to users with permission to insert or update issues (Submitter, Editor, or Admins), you can set fields to "Limited PHI".

For example, a development bug tracker could have a "Limited PHI" field indicating the estimated work needed to resolve it. This field would then be hidden from readers of the list of known issues but visible to the group planning to fix them.

Learn more about customizing issue trackers in this topic: Issue Tracker: Administration

Use PHI Levels to Control UI Visibility of Data


Premium Features Available

Subscribers to the Enterprise Edition of LabKey Server can use PHI levels to control display of columns in the user interface. Learn more in this topic:


Learn more about premium editions

Note that if your data uses any text choice fields, administrators and data structure editors will be able to see all values available within the field editor, making this a poor field choice for sensitive information.

Related Topics




Data Grids


Data grids display data from various data structures as a table of columns and rows. LabKey Server provides sophisticated tools for working with data grids, including sorting, filtering, reporting, and exporting.

Take a quick tour of the basic data grid features here: Data Grid Tour.

Data Grid Topics

Related Topics




Data Grids: Basics


Data in LabKey Server is viewed in grids. The underlying data table, list, or dataset stored in the database contains more information than is shown in any given grid view. This topic will help you learn the basics of using grids in LabKey to view and work with any kind of tabular data.

Anatomy of a Data Grid

The following image shows a typical data grid.

  • Data grid title: The name of the data grid.
  • QC State filter: Within a LabKey study, when one or more dataset QC states are defined, the button bar will include a menu for filtering among them. The current status of that filter is shown above the grid.
  • Folder location: The folder location. Click to navigate to the home page of the folder.
  • Grid view indicator: Shows which view, or perspective, on the data is being shown. Custom views are created to show a joined set of tables or highlight particular columns in the data. Every dataset has a "default" view that can also be customized if desired.
  • Filter bar: When filters are applied, they appear here. Click the X to remove a filter. When more than one is present, a Clear all button is also shown.
  • Button bar: Shows the different tools that can be applied to your data. A triangle indicates a menu of options.
  • Column headers: Click a column header for a list of options.
  • Data records: Displays the data as a 2-dimensional table of rows and columns.
  • Page through data: Control how much data is shown per page and step through pages.

Column Header Options

See an interactive example grid on labkey.org.

Button Bar

The button bar tools available may vary with different types of data grid. Study datasets can provide additional functionality, such as filtering by cohort, that are not available to lists. Assays and proteomics data grids provide many additional features.

Hover over icon buttons to see a tooltip with the text name of the option. Common buttons and menus include:

  • (Filter): Click to open a panel for filtering by participant groups and cohorts.
  • (Grid Views): Pull-down menu for creating, selecting, and managing various grid views of this data.
  • (Charts/Reports): Pull-down menu for creating and selecting charts and reports.
  • (Export): Export the data grid as a spreadsheet, text, or in various scripting formats.
  • (Insert Data): Insert a new single row, or bulk import data.
  • (Delete): Delete one or more rows selected by checkbox. This option is disabled when no rows are selected.
  • Design/Manage: with appropriate permissions, change the structure and behavior of the given list ("Design") or dataset ("Manage").
  • Groups: Select which cohorts, participant groups, or treatment groups to show. Also includes options to manage cohorts and create new participant groups.
  • Print: Generate a printable version of the dataset in your browser.

Page Through Data

In the upper right corner, you will see how your grid is currently paginated. In the above example, there are 484 rows in the dataset and they are shown 100 to a "page", so the first page is rows 1-100 of 484. To step through pages, in this case 100 records at a time, click the arrow buttons. To adjust paging, hover over the row count message (here "1-100 of 484") and it will become a button. Click to open the menu:

  • Show First/Show Last: Jump to the first or last page of data.
  • Show All: Show all rows in one page.
  • Click Paging to see the currently selected number of rows per page. Notice the > caret indicating there is a submenu to open. On the paging submenu, click another option to change pagination.

Examples

The following links show different views of a single data grid:

Related Topics




Import Data


LabKey provides a variety of methods for importing data into a data grid. Depending on the type and complexity of the data, you must first identify the type of data structure in which your data will be stored. Each data structure has a different specific process for designing a schema and then importing data. The general import process for data is similar among many data structures. Specific import instructions for many types are available here:
Note: When data is imported as a TSV, CSV, or text file, it is parsed using UTF-8 character encoding.



Sort Data


This page explains how to sort data in a table or grid. The basic sort operations are only applied for the current viewer and session; they are not persisted by default, but can be saved as part of a custom grid view.

Sort Data in a Grid

To sort data in a grid view, click on the header for a column name. Choose either:

  • Sort Ascending: Show the rows with lowest values (or first values alphabetically) for this column at the top of the grid.
  • Sort Descending: Show the rows with the highest values at the top.

Once you have sorted your dataset using a particular column, a triangle icon will appear in the column header: pointing up for an ascending sort, or down for a descending sort.

You can sort a grid view using multiple columns at a time as shown above. Sorts are applied in the order they are added, so the most recent sort will show as the primary way the grid is sorted. For example, to sort by date, with same-date results sorted by temperature, you would first sort on temperature, then on date.

Note: By default, LabKey sorting is case-sensitive. If your LabKey installation is running against Microsoft SQL Server, however, sorting is case-insensitive. If you're not sure which database your LabKey installation is running against, ask your system administrator.

Clear Sorts

To remove a sort on an individual column, click the column caption and select Clear Sort.

Advanced: Understand Sorting URLs

The sort specifications are included on the page URL. You can modify the URL directly to change the sorted columns, the order in which they are sorted, and the direction of the sort. For example, the following URL sorts the Physical Exam grid first by ascending ParticipantId, and then by descending Temp_C:
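
/home/demo%20study/study-dataset.view?datasetId=5004&Dataset.sort=ParticipantId%2C-Temp_C

(The folder path and dataset ID above are illustrative; the sort columns are passed in the data region's sort parameter.)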

Note that the minus ('-') sign in front of the Temp_C column indicates that the sort on that column is performed in descending order. No sign is required for an ascending sort, but it is acceptable to explicitly specify the plus ('+') sign in the URL.

The %2C hexadecimal code that separates the column names represents the URL encoding symbol for a comma.

Related Topics




Filter Data


You can filter data displayed in a grid to reduce the amount of data shown, or to exclude data that you do not wish to see. By default, these filters only apply for the current viewer and session, but filters may be saved as part of a grid view if desired.

Filter Column Values

  • Click on a column name and select Filter.

Filter by Value

In many cases, the filter popup will open on the Choose Values tab by default. Here, you can directly select one or more individual values using checkboxes. Click on a label to select only a single value, add or remove additional values by clicking on checkboxes.

This option is not the default in a few circumstances:

Filtering Expressions

Filtering expressions available in dropdowns vary by datatype and context. Possible filters include, but are not limited to:

  • Presence or absence of a value for that row
  • Equality or inequality
  • Comparison operators
  • Membership or lack of membership in a named semicolon separated set
  • Starts with and contains operators for strings
  • Between (inclusive) or Not Between (exclusive) two comma separated values
For a full listing, see Filtering Expressions.

  • Switch to the Choose Filters tab, if available.
  • Specify a filtering expression (such as "Is Greater Than"), and value (such as "57") and click OK.

You may add a second filter if desired - the second filter is applied as an AND with the first. Both conditions must be true for a row to be included in the filtered results.

Once you have filtered on a column, the filter icon appears in the column header. Current filters are listed above the grid, and can be removed by simply clicking the X in the filter panel.

When there are multiple active filters, you can remove them individually or use the link to Clear All that will be shown.

Notes:
  • Leading spaces on strings are not stripped. For example, consider a list filter like Between (inclusive) which takes two comma-separated terms. If you enter range values as "first, second", rows with the value "second" (without the leading space) will be excluded. Enter "first,second" to include such rows.
  • By default, LabKey filtering is case-sensitive. However, if your LabKey installation is running against Microsoft SQL Server, filtering is case-insensitive.

Filter Value Variables

Using filter value variables can help you use context-sensitive information in a filter. For example, use the variable "~me~" (including the tildes) on columns showing user names from the core.Users table to filter based on the current logged in user.

Additional filter value variables are available that will not work in the grid header menu, but will work in the grid view customizer or in the URL. For example, relative dates can be specified using filter values like "-7d" (7 days ago) or "5d" (5 days from now) in a saved named grid view. Learn more here: Saved Filters and Sorts
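
For example, a sketch of a relative-date filter in a URL (folder path, dataset ID, and column name illustrative):

/home/demo%20study/study-dataset.view?datasetId=5003&Dataset.date~dategte=-7d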

Persistent Filters

Some filters on some types of data are persistent (or "sticky") and will remain applied on subsequent views of the same data. For example, some types of assays have persistent filters for convenience; these are listed in the active filter bar above the grid.

Use Faceted Filtering

When applying multiple filters to a data grid, the options shown as available in the filter popup will respect prior filters. For example, if you first filter our sample demographics dataset by "Country" and select only "Uganda", then if you open a second filter on "Primary Language" you will see only "French" and "English" as options - our sample data includes no patients from Uganda who speak German or Spanish. The purpose is to simplify the process of filtering by presenting only valid filter choices. This also helps you avoid unintentionally empty results.

Understand Filter URLs

Filtering specifications can be included on the page URL. A few examples follow.

This URL filters our example "Physical Exam" dataset to show only rows where weight is greater than 80kg. The column name, the filter operator, and the criterion value are all specified as URL parameters. The dataset is specified by ID, "5003" in this case:
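
/home/demo%20study/study-dataset.view?datasetId=5003&Dataset.weight_kg~gt=80

(The folder path and column name here are illustrative.)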

Multiple filters on different columns can be combined, and filters also support selecting multiple values. In this example, we show all rows for two participants with a specific data entry date:
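
/home/demo%20study/study-dataset.view?datasetId=5003&Dataset.ParticipantId~in=PT-101%3BPT-102&Dataset.date~eq=2008-05-17

(Participant IDs and date are illustrative; %3B is the URL encoding of the semicolon separating the two values.)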

To specify that a grid should be displayed using the user's last filter settings, set the .lastFilter URL parameter to true, as shown:
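
/home/demo%20study/study-dataset.view?datasetId=5003&.lastFilter=true

(The folder path and dataset ID are illustrative.)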

Study: Filter by Participant Group

Within a study dataset, you may also filter a data grid by participant group. Click the (Filter) icon above the grid to open the filter panel. Select checkboxes in this panel to further filter your data. Note that filters are cumulatively applied and listed in the active filter bar above the data grid.

Related Topics




Filtering Expressions


Filtering expressions available for columns or when searching for subjects of interest will vary by datatype of the column, and not all expressions are relevant or available in all contexts. In the following tables, the "Arguments" column indicates how many data values, if any, should be provided for comparison with the data being filtered.

Expression | Arguments | Example Usage | Description
Has Any Value |  |  | Returns all values, including null
Is Blank |  |  | Returns blank values
Is Not Blank |  |  | Returns non-blank values
Equals | 1 |  | Returns values matching the value provided
Does Not Equal | 1 |  | Returns non-matching values
Is Greater Than | 1 |  | Returns values greater than the provided value
Is Less Than | 1 |  | Returns values less than the provided value
Is Greater Than or Equal To | 1 |  | Returns values greater than or equal to the provided value
Is Less Than or Equal To | 1 |  | Returns values less than or equal to the provided value
Contains | 1 |  | Returns values containing the provided value
Does Not Contain | 1 |  | Returns values not containing the provided value
Starts With | 1 |  | Returns values which start with the provided value
Does Not Start With | 1 |  | Returns values which do not start with the provided value
Between, Inclusive | 2, comma separated | -4,4 | Returns values between or matching two comma separated values provided
Not Between, Exclusive | 2, comma separated | -4,4 | Returns values which are not between and do not match two comma separated values provided
Equals One Of | 1 or more, semicolon separated | a;b;c | Returns values matching any one of a semicolon separated list
Does Not Equal Any Of | 1 or more, semicolon separated | a;b;c | Returns values not matching a semicolon separated list
Contains One Of | 1 or more, semicolon separated | a;b;c | Returns values which contain any one of a semicolon separated list
Does Not Contain Any Of | 1 or more, semicolon separated | a;b;c | Returns values which do not contain any of a semicolon separated list
Is In Subtree | 1 |  | (Premium Feature) Returns values that are in the ontology hierarchy 'below' the selected concept
Is Not In Subtree | 1 |  | (Premium Feature) Returns values that are not in the ontology hierarchy 'below' the selected concept

Boolean Filtering Expressions

Expressions available for data of type boolean (true/false values):

  • Has Any Value
  • Is Blank
  • Is Not Blank
  • Equals
  • Does Not Equal

Date Filtering Expressions

Date and DateTime data can be filtered with the following expressions:

  • Has Any Value
  • Is Blank
  • Is Not Blank
  • Equals
  • Does Not Equal
  • Is Greater Than
  • Is Less Than
  • Is Greater Than or Equal To
  • Is Less Than or Equal To

Numeric Filtering Expressions

Expressions available for data of any numeric type, including integers and double-precision numbers:

  • Has Any Value
  • Is Blank
  • Is Not Blank
  • Equals
  • Does Not Equal
  • Is Greater Than
  • Is Less Than
  • Is Greater Than or Equal To
  • Is Less Than or Equal To
  • Between, Inclusive
  • Not Between, Exclusive
  • Equals One Of
  • Does Not Equal Any Of

String Filtering Expressions

String type data, including text and multi-line text data, can be filtered using the following expressions:

  • Has Any Value
  • Is Blank
  • Is Not Blank
  • Equals
  • Does Not Equal
  • Is Greater Than
  • Is Less Than
  • Is Greater Than or Equal To
  • Is Less Than or Equal To
  • Contains
  • Does Not Contain
  • Starts With
  • Does Not Start With
  • Between, Inclusive
  • Not Between, Exclusive
  • Equals One Of
  • Does Not Equal Any Of
  • Contains One Of
  • Does Not Contain Any Of

Related Topics




Column Summary Statistics


Summary statistics for a column of data can be displayed directly on the grid, giving at-a-glance information like count, min, max, etc. of the values in that column.

Premium Feature — An enhanced set of additional summary statistics is available in all Premium Editions of LabKey Server. Learn more or contact LabKey.

Add Summary Statistics to a Column

  • Click a column header, then select Summary Statistics.
  • The popup will list all available statistics for the given column, including their values for the selected column.
  • Check the box for all statistics you would like to display.
  • Click Apply. The statistics will be shown at the bottom of the column.

Display Multiple Statistics

Multiple summary statistics can be shown at one time for a column, and each column can have its own set. Here is a compound set of statistics on another dataset:

Statistics Available

The list of statistics available varies based on the edition of LabKey Server you are running and on the column datatype. Not all functions are available for all column types; only meaningful aggregates are offered. For instance, boolean columns show only the count fields, and date columns do not include sums or means. Calculations ignore blank values, but note that values of 0 or "unknown" are not blank values.

All calculations use the current grid view and any filters you have applied. Remember that the grid view may be shown across several pages. Column summary statistics are for the dataset as a whole, not just the current page being viewed. The number of digits displayed is governed by the number format set for the container, which defaults to rounding to the thousandths place.

Summary statistics available in the Community edition include:

  • Count (non-blank): The number of values in the column that are not blank, i.e. the total number of rows for which there is data available.
  • Sum: The sum of the values in the column.
  • Mean: The mean, or average, value of the column.
  • Minimum: The lowest value.
  • Maximum: The highest value.
Additional summary statistics available in Premium editions of LabKey Server include:
  • Count (blank): The number of blank values.
  • Count (distinct): The number of distinct values.
  • Median: Orders the values in the column, then finds the midpoint. When there are an even number of values, the two values at the midpoint are averaged to obtain a single value.
  • Median Absolute Deviation (MAD): The median of the set of absolute deviations of each value from the median.
  • Standard Deviation (of mean): For each value, take the difference between the value and the mean, then square it. Average the squared deviations to find the variance. The standard deviation is the square root of the variance.
  • Standard Error (of mean): The standard deviation divided by the square root of the number of values.
  • Quartiles:
    • Lower (Q1) is the midpoint between the minimum value and the median value.
    • Upper (Q3) is the midpoint between the median value and the maximum value. Both Q1 and Q3 are shown.
    • Interquartile Range: The difference between Q3 and Q1 in the ordered list (Q3 - Q1); see the worked example below.
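
To make these definitions concrete, here is a small worked example using Python's standard statistics module. The values are hypothetical; note that the variance description above averages the squared deviations over n (population variance), so pstdev is used here, though a sample (n - 1) calculation is also common:

import statistics

values = [3, 5, 7, 9]  # hypothetical, sorted column values (blanks excluded)
n = len(values)        # Count (non-blank) -> 4

mean = statistics.mean(values)      # Mean -> 6.0
median = statistics.median(values)  # even count: the two midpoints average to 6.0

# Median Absolute Deviation: median of |value - median| -> 2.0
mad = statistics.median([abs(v - median) for v in values])

stdev = statistics.pstdev(values)  # square root of variance 5 -> ~2.236
stderr = stdev / n ** 0.5          # standard deviation / sqrt(n) -> ~1.118

q1 = statistics.median(values[: n // 2])  # lower-half midpoint -> 4.0
q3 = statistics.median(values[n // 2 :])  # upper-half midpoint -> 8.0
iqr = q3 - q1                             # Interquartile Range -> 4.0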

Save Summary Statistics with Grid View

Once you have added summary statistics to a grid view, you can use (Grid Views) > Customize Grid to save the grid with these statistics displayed. For example, this grid in our example study shows a set of statistics on several columns (scroll to the bottom):

Related Topics




Customize Grid Views


A grid view is a way to see tabular data in the user interface. Lists, datasets, assay results, and queries can all be represented in a grid view. The default set of columns displayed is not always what you need. This topic explains how to create custom grid views in the user interface, showing selected columns in the order you wish. You can edit the default grid view, and also save as many named grid views as needed.

Editors, administrators, and users granted "Shared View Editor" access can create and share customized grid views with other users.

Customize a Grid View

To customize a grid, open it, then select (Grid Views) > Customize Grid.

The tabs let you control the Columns displayed, as well as any saved Filters and Sorts.

You can close the grid view customizer by clicking the X in the upper right.

Adjust Columns Displayed

On the Columns tab of the grid view customizer, control the fields included in your grid and how they are displayed in columns:

  • Available Fields: Lists the fields available in the given data structure.
  • Selected Fields: Shows the list of fields currently selected for display in the grid.
  • Delete: Deletes the current grid view. You cannot delete the default grid view.
  • Revert: Returns the grid to its state before you customized it.
  • View Grid: Click to preview your changes.
  • Save: Click to save your changes as the default view or as a new named grid.
Actions you can perform here are detailed below.

Add Columns

  • To add a column to the grid, check the box for it in the Available Fields pane.
  • The field will be added to the Selected Fields pane.
  • This is shown for the "Hemoglobin" field in the above image.

Notice the checkbox for Show hidden fields. Hidden fields might contain metadata about the grid displayed or interconnections with other data. Learn more about common hidden fields for lists and datasets in this topic:

Expand Nodes to Find More Fields

When the underlying table includes elements that reference other tables, they will be represented as an expandable node in the Available Fields panel. For example:

  • Fields defined to look up values in other tables, such as the built-in "Created By" field shown above, which can be expanded to show more information from the Users table.
  • In a study, there is built-in alignment around participant and visit information; see the "Participant Visit" node.
  • In a study, all the other datasets are automatically joined through a special field named "Datasets". Expand it to see all other datasets that can be joined to this one. Combining datasets in this way is the equivalent of a SQL SELECT query with one or more inner joins.
  • Click the expand and collapse buttons to reveal or hide columns from other datasets and lookups.
  • Greyed out items cannot be displayed themselves, but can be expanded to find fields to display.

Reorder Columns

To reorder the columns, drag and drop the fields in the Selected Fields pane. Columns will appear left to right as they are listed top to bottom. Changing the display order for a grid view does not change the underlying data table.

Change Column Display Name

Hover over any field in the Selected Fields panel to see a popup with more information about the key and datatype of that field, as well as a description if one has been added.

To change the column display title, click the (Edit title) icon. Enter the new title and click OK. Note that this does not change the underlying field name, only how it is displayed in the grid.

Remove Columns

To remove a column, hover over the field in the Selected Fields pane, and click (Remove column) as shown in the image above.

Removing a column from a grid view does not remove the field from the dataset or delete any data.

You can also remove a column directly from the grid (without opening the view customizer) by clicking the column header and selecting Remove Column. After doing so, you will see the "grid view has been modified" banner and be able to revert, edit, or save this change.

Save Grid Views

When you are satisfied with your grid, click Save. You can save your version as the default grid view, or as a new named grid view.

  • Click Save.

In the popup, select:

  • Grid View Name:
    • "Default grid view for this page": Save as the grid view named "Default".
    • "Named": Select this option and enter a title for your new grid view. This name cannot be "Default" and if you use the same name as an existing grid view, your new version will overwrite the previous grid by that name.
  • Shared: By default a customized grid is private to you. If you have the "Editor" role or "Shared View Editor" role (or higher) in the current folder, you can make a grid available to all users by checking the box Make this grid view available to all users.
  • Inherit: If present, check the box to make this grid view available in child folders.
In this example, we named the grid "My Custom Grid View" and it was added to the (Grid Views) pulldown menu.

Saved grid views appear on the (Grid Views) pulldown menu. On this menu, the "Default" grid is always first, then any grids saved for the current user (shown with a lock icon), then all the shared grid views. If a customized version of the "Default" grid view has been saved but not shared by the current user, it will also show a lock icon, but can still be edited (or shared) by that user.

When the list of saved grid views is long, a Filter box is added. Type to narrow the list, making it easier to find the grid view you want.

In a study, named grid views will be shown in the Data Views web part when "Queries" are included. Learn more in this topic: Data Views Browser. Saved grid views can also be accessed programmatically, as sketched below.
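
Here is a minimal sketch using the labkey Python package, with hypothetical server and folder names; "My Custom Grid View" is the example grid saved above, retrieved by name:

from labkey.api_wrapper import APIWrapper

# Hypothetical server and folder; the view must be shared or owned by this user.
api = APIWrapper("labkey.example.com", "MyProject/MyStudy", use_ssl=True)
result = api.query.select_rows(
    schema_name="study",
    query_name="Demographics",
    view_name="My Custom Grid View",
)
print(len(result["rows"]), "rows in the saved grid view")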

Save Filtered and Sorted Grids

You can further refine your saved grid views by including saved filters and sorts with the column configuration you select. You can define filters and sorts directly in the view customizer, or get started from the user interface using column menu filters and sorts.

Learn about saving filters and sorts with custom grid views in this topic:

Note: When saving or modifying grid views of assay data, be aware that run filters may have been saved with your grid view. For example, if you customize the default grid view while viewing a single run of data, then import a new run, you may not see the new data because of the previous filtering to the other single run. To resolve this issue:
  • Select (Grid Views) > Customize Grid.
  • Click the Filters tab.
  • Clear the Row ID filter by clicking the 'X' on its line.
  • Click Save, confirm Default grid view for this page is selected.
  • Click Save again.

Include Column Visualizations and Summary Statistics

When you save a default or named grid, any Column Visualizations or Summary Statistics in the current view of the grid will be saved as well. This lets you include quick graphical and statistical information when a user first views the data.

For example, this grid in our example study includes both column visualizations and summary statistics:

Reset the Default Grid View

Every data structure has a grid view named "Default" but it does not have to be the default shown to users who view the structure.

  • To set the default view to another named grid view, select (Grid Views) > Set Default.
  • The current default view will be shown in bold.
  • Click Select for the grid you prefer from the list available. The newly selected one will now be bold.
  • Click Done.

Revert to the Original Default Grid View

  • To revert any customizations to the default grid view, open it using (Grid Views) > Default.
  • Select (Grid Views) > Customize Grid.
  • Click the Revert button.

Views Web Part

To list all the customized views in your folder, an administrator can add a Views web part:

  • Enter > Page Admin Mode
  • In the lower left, select Views from the Select Web Part menu.
  • Click Add.
  • The web part will show saved grid views, reports, and charts sorted by categories you assign. Here we see the new grid view we just created.

Inheritance: Making Custom Grids Available in Child Folders

In some cases, listed in the table below, custom grids can be "inherited" or made available in child folders. This generally applies to cases like queries that can be run in different containers, not to data structures that are scoped to a single folder. Check the box to Make this grid view available in child folders.

The following table types support grid view inheritance:

Table Type       Supports Inheritance into Child Folders?
Query            Yes (Also see Edit Query Properties)
Linked Queries   Yes (Also see Edit Query Properties)
Lists            No
Datasets         No
Query Snapshots  No
Assay Data       Yes
Sample Types     Yes
Data Classes     Yes

Troubleshooting

In a study, why can't I customize my grid to show a particular field from another dataset?

Background: To customize your grid view of a dataset by adding columns from another dataset, it must be possible to join the two datasets. The columns used for a dataset's key influence how this dataset can be joined to other tables. Certain datasets have more than one key column (in other words, a "compound key"). In a study, there are three types of datasets, distinguished by how you set Data Row Uniqueness when you create the dataset:

  • Demographic datasets use only the Participant column as a key. This means that only one row can be associated with each participant in such a dataset.
  • Clinical (or standard) datasets use Participant/Visit (or Timepoint) pairs as a compound key. This means that there can be many rows per participant, but only one per participant at each visit or date.
  • Datasets with an additional key field, such as assay or sample data linked into the study. In these compound-key datasets, each participant can have multiple rows associated with each individual visit; these are uniquely differentiated by another key column, such as a RowID.
Consequences: When customizing the grid for a table, you cannot join in columns from a table with more key columns. For example, if you are looking at a demographics dataset in a study, you cannot join to a clinical dataset because the clinical dataset can have multiple rows per participant, i.e. has more columns in its key. There isn't a unique mapping from a participant in the 'originating' dataset to a specific row of data in the dataset with rows for 'participant/visit' pairs. However, from a clinical dataset, you can join either to demographic or other clinical datasets.

Guidance: To create a grid view combining columns from datasets of different 'key-levels', start with the dataset with more columns in the key. Then select a column from the table with fewer columns in the key. There can be a unique mapping from the compound key to the simpler one - some columns will have repeated values for several rows, but rows will be unique.

How do I recover from a broken view?

It is possible to get into a state where you have a "broken" view, and the interface can't return you to the editor to correct the problem. You may see a message like:

View has errors
org.postgresql.util.PSQLException: … details of error

Suggestions:

  • It may work to log out and back in again, particularly if the broken view has not been saved.
  • Depending on your permissions, you may be able to access the view manager by URL to delete the broken view. The Folder Administrator role (or higher) is required. Navigate to the container and edit the URL to replace "/project-begin.view?" with:
    /query-manageViews.view?
    This will open a troubleshooting page where admins can delete or edit customized views. On this page you will see grid views that have been edited and saved, including a line with no "View Name" if the Default grid has been edited and saved. If you delete this row, your grid will 'revert' any edits and restore the original Default grid.
  • If you are still having issues with the Default view on a grid, try accessing a URL like the following, replacing <my_schema> and <my_table> with the appropriate values:
    /query-executeQuery.view?schemaName=<my_schema>&query.queryName=<my_table>

Related Topics




Saved Filters and Sorts


When you are looking at a data grid, you can sort and filter the data as you wish, but those sorts and filters only persist for your current session on that page. Using the .lastFilter parameter on the URL can preserve the last filter, but otherwise these sorts and filters are temporary.

To create a persistent filter or sort, you can save it as part of a custom grid view. Users with the "Editor" or "Shared View Editor" role (or higher) can share saved grids with other users, including the saved filters and sorts.

Learn the basics of customizing and saving grid views in this topic:

Define a Saved Sort

  • Navigate to the grid you'd like to modify.
  • Select (Grid Views) > Customize Grid
  • Click the Sort tab.
  • Check a box in the Available Fields panel to add a sort on that field.
  • In the Selected Sorts panel, specify whether the sort order should be ascending or descending for each sort applied.
  • Click Save.
  • You may save as a new named grid view or as the default.
  • If you have sufficient permissions, you will also have the option to make it available to all users.

You can also create a saved sort by first sorting your grid directly using the column headers, then opening the grid customizer panel to convert the local sort to a saved one.

  • In the grid view with the saved sort applied above, sort on a second column; in this example we chose 'Lymphs'.
  • Open (Grid Views) > Customize Grid.
  • Click the Sort tab. Note that it shows (2), meaning two sorts are now defined. Until you save the grid view with this additional sort included, it will remain temporary, as sorts usually are.
  • Drag and drop if you want to change the order in which sorts are applied.
  • Remember to Save your grid to save the additional sort.

Define a Saved Filter

The process for defining saved filters is very similar. You can filter locally first or directly define saved filters.

  • An important advantage of using the saved filters interface is that when filtering locally, you are limited to two filters on a given column. Saved filters may include any number of separate filtering expressions for a given column, which are all ANDed together.
  • Another advantage is that there are some additional filtering expressions available here that are not available in the column header filter dialog.
    • For example, you can filter by relative dates using -#d syntax (# days ago) or #d (# days from now) using the grid customizer but not using the column header; see the example after this list.
  • Select (Grid Views) > Customize Grid.
  • Click the Filter tab.
  • In the left panel, check boxes for the column(s) on which you want to filter.
  • Drag and drop filters in the right panel to change the filtering order.
  • In the right panel, specify filtering expressions for each selected column. Use pulldowns and value entry fields to set expressions; add more using the (Add) buttons.
  • Use the (Delete) buttons to remove individual filtering expressions.
  • Save the grid; select whether to make it available to other users.
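
As an illustration of the relative-date syntax mentioned above, a saved filter showing only rows from the last seven days corresponds to a URL filter parameter like the following; the schema, query, and column names are hypothetical:

    /query-executeQuery.view?schemaName=study&query.queryName=Demographics&query.Date~dategte=-7d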

Apply Grid Filter

When viewing a data grid, you can enable and disable all saved filters and sorts using the Apply Grid Filter checkbox in the (Grid Views) menu. All defined filters and sorts are applied at once using this checkbox - you cannot pick and choose which to apply. If this menu option is not available, no saved filters or sorts have been defined.

Note that clearing the saved filters and sorts by unchecking this box does not change how they are saved; it only clears them for the current user and session.

Interactions Among Filters and Sorts

Users can still perform their own sorting and filtering as usual when looking at a grid that already has a saved sort or filter applied.

  • Sorting: Sorting a grid view while you are looking at it overrides any saved sort order. In other words, the saved sort can control how the data is first presented to the user, but the user can re-sort any way they wish.
  • Filtering: Filtering a grid view which has one or more saved filters results in combining the sets of filters with an AND. That is, new local filters are applied to the already-filtered data. This can produce unexpected results for the user in cases where the saved filter(s) exclude data that they are expecting to see. Note that these saved filters are not listed in the filters bar above the data grid, but they can all be disabled by unchecking the (Grid Views) > Apply Grid Filter checkbox.

Related Topics




Select Rows


When you work with a grid of data, such as a list or dataset, you often need to select one or more rows. For example, you may wish to visualize a subset of data or select particular rows from an assay to link into a study. Large data grids are often viewed as multiple pages, adding selection options.

Topics on this page:

Select Rows on the Current Page of Data

  • To select any single row, click the checkbox at the left side of the row.
  • To unselect the row, uncheck the same checkbox.
  • The box at the top of the checkbox column allows you to select or unselect all rows on the current page at once.

Select Rows on Multiple Pages

Using the previous and next page buttons in the top right of the grid, you can page forward and back in your data and select as many rows as you like, singly or by page, using the same checkbox selection methods as on a single page. The selection message will update showing the total tally of selected rows.

To change the number of rows per page, click the row count message ("1 - 100 of 677" in the above screenshot) to open a menu. Select Paging and make another selection for rows per page. See Page Through Data for more.

Selection Buttons

Selecting the box at the top of the checkbox column also adds a bar above your grid which indicates the number of rows selected on the current page and additional selection buttons.

  • Select All Selectable Rows: Select all rows in the dataset, regardless of pagination.
  • Select None: Unselect all currently selected rows.
  • Show All: Show all rows as one "page" to simplify sorting and selection.
  • Show Selected: Show only the rows that are selected in a single page grid.
  • Show Unselected: Show only the rows that are not selected in a single page grid.
Using the Show Selected option can be helpful in keeping track of selections in large datasets, and is also needed for some actions that apply only to selected rows on the current page.

Related Topics




Export Data Grid


LabKey provides a variety of methods for exporting the rows of a data grid. You can export into formats that can be consumed by external applications (e.g., Excel) or into scripts that can regenerate the data grid. You can also choose whether to export the entire set of data or only selected rows. Your choice of export format determines whether you get a static snapshot of the data, or a dynamic reflection that updates as the data changes. The Excel and TSV formats supply static snapshots, while scripts allow you to display dynamically updated data.

Premium Feature Available

Subscribers to the Enterprise Edition of LabKey Server can export data with an e-Signature as described in this topic:


Learn more about premium editions

(Export) Menu

  • Click the (Export) button above any grid view to open the export panel.
  • Use tabs to choose between Excel, Text and Script exports, each of which carries a number of appropriate options for that type.

After selecting your options, described below, and clicking the Export button, you will briefly see visual feedback that the export is in progress:

Export Column Headers

Both Excel and Text exports allow you to choose whether Column Headers are exported with the data, and if so, what format is used. Options:

  • None: Export the data table with no column headers.
  • Caption: (Default) Include a column header row using the currently displayed column captions as headers.
  • Field Key: Use the column name with FieldKey encoding. While less display friendly, these keys are unambiguous and canonical and will ensure clean export and import of data into the same dataset.

Export Selected Rows

If you select one or more rows using the checkboxes on the left, you will activate the Export Selected Rows checkbox in either Excel or Text export mode. When selected, your exported Excel file will only include the selected rows. Uncheck the box to export all rows. For additional information about selecting rows, see Select Rows.

Filter Data Before Export

Another way to export a subset of data records is to filter the grid view before you export it.

  • Filter Data. Clicking a column header in a grid will open a dialog box that lets you filter and exclude certain types of data.
  • Create or select a Custom Grid View. Custom Grids let you store a selected subset as a named grid view.
  • View Data From One Visit. You can use the Study Navigator to view the grid of data records for a particular visit for a particular dataset. From the Study Navigator, click on the number at the intersection of the desired visit column and dataset row.

Export to Excel

When you export your data grid to Excel, you can use features within that software to access, sort and present the data as required. If your data grid includes inline images they will be exported in the cell in which they appear in the grid.

Export to Text

Select the Text tab to export the data grid in a text format. Select tab, comma, colon, or semicolon from the Separator pulldown and single or double from the Quote pulldown. The extension of your exported text file will correspond to the separator you have selected, e.g. "tsv" for tab separators.

LabKey Server uses the UTF-8 character encoding when exporting text files.
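
For example, a tab-separated export can be read back with Python's standard csv module; the file name here is hypothetical:

import csv

# Read a grid exported as UTF-8 TSV; each row becomes a dict keyed by column header.
with open("Demographics.tsv", encoding="utf-8", newline="") as f:
    for row in csv.DictReader(f, delimiter="\t"):
        print(row)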

Export to Script

You can export the current grid to script code that can be used to access the data from any of the supported client libraries. See Export Data Grid as a Script.

The option to generate a Stable URL for the grid is also included on the (Export) > Script tab.

Related Topics





Participant Details View


The default dataset grid displays data for all participants. To view data for an individual participant, click on the participantID in the first column of the grid. The Participant Details View can also be customized to show graphical or other custom views of the data for a single study subject.

Participant Details View

The participant details view lists all of the datasets that contain data for the current participant, as shown in the image below.

  • Previous/Next Participant: Page through participants. Note that this option is only provided when you navigate from a dataset listing other participants. Viewing a single participant from the Participants tab does not include these options.
  • Expand/collapse buttons: Expand or collapse the details for each dataset.

Add Charts

Expand dataset details by clicking the expand button or the name of the dataset of interest. Click Add Chart to add a visualization to the participant view details. The dropdown will show the charts defined for this dataset.

After you select a chart from the dropdown, click the Submit button that will appear.

Once you create a chart for one participant in a participant view, the same chart is displayed for every participant, with that participant's data.

You can add multiple charts per dataset, or different charts for each dataset. To define new charts to use in participant views, use the plot editor.

Notes:

1. Charts are displayed at a standard size in the default participant details view; custom height and width, if specified in the chart definition, are overridden.

2. Time charts displaying data by participant group can be included in a participant details view; however, the data is filtered to show only data for the individual participant. Disregard the legend showing group names for these trend lines.

Customize Participant Details View

You can alter the HTML used to create the default participant details page and save alternative ways to display the data using the Customize View link.

Click Save to refresh and see the preview below your script. Click Save and Finish when finished.

Example 1: Align Column of Values

In the standard participant view, the column of values for expanded demographics datasets will be centered over the grid of visits for the clinical datasets. In some cases, this will cause the demographic field values to be outside the visible page. To change where they appear, you can add the following to the standard participant details view (under the closing </script> tag):

<style>
.labkey-data-region tr td:first-child {width: 300px}
</style>
Use a wider "width" value if you have very long field names in the first column.

Example 2: Tabbed Participant View

You can see another example of a custom participant view in the online example study. Click the tabs to see the datasets in each category; step through to see data for other participants:


Premium Resource Available

Subscribers to premium editions of LabKey Server can use the example code in this topic to create their own variation of the tabbed example linked above:


Learn more about premium editions

Related Topics




Query Scope: Filter by Folder


This topic is under construction for the 22.11 (November 2022) release of LabKey Server. For current documentation of this feature, click here.

For certain LabKey queries, including assay designs, issue trackers, and survey designs, but not including study datasets, the scope of the query can be set with a folder filter to include:

  • all data on the site
  • all data for a project
  • only data located in particular folders
For example, the query itself can be defined at the project level. Data it queries may be located within individual subfolders. Scope is controlled using the "Filter by Folder" option on the (Grid Views) menu in the web part.

Note that there are many mechanisms for controlling data scope in LabKey Server. To learn about other methods, see Controlling Data Scope.

This allows you to organize your data in folders that are convenient to you at the time of data collection (e.g., folders for individual labs or lab technicians). Then you can perform analyses independently of the folder-based organization of your data. You can analyze data across all folders, or just a branch of your folder tree.

You can set the scope through either the (Grid Views) menu or through the client API. In all cases, LabKey security settings remain in force, so users only see data in folders they are authorized to see.

Folder Filters in Grid Interface

To filter by folder through the user interface:

  • Select (Grid Views) > Filter by Folder.
  • Choose one of the options:
    • Current folder
    • Current folder and subfolders
    • All folders (on the site)

Some types of grid may have different selections available. For example, Sample Types and Data Classes offer the following options to support cross-folder lineage lookups:

  • Current folder
  • Current folder, project, and Shared project

Folder Filters in the JavaScript API

The LabKey API provides developers with even finer-grained control over the scope.

The containerFilter config property, available on many methods such as LABKEY.Query.selectRows, provides fine-grained control over which folders are accessed through the query.

For example, executeSQL allows you to use the containerFilter parameter to run custom queries across data from multiple folders at once. A query might (for example) show the count of NAb runs available in each lab’s subfolder if folders are organized by lab.

Possible values for the containerFilter are listed below; a usage sketch follows the list:

  • allFolders
  • current
  • currentAndFirstChildren
  • currentAndParents
  • currentAndSubfolders
  • currentPlusProject
  • currentPlusProjectAndShared
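
The same capability is exposed in the other client libraries. For example, here is a minimal sketch using the labkey Python package; the server, schema, and query names are hypothetical, and the container filter value capitalization may differ slightly between libraries:

from labkey.api_wrapper import APIWrapper

# Hypothetical server and project.
api = APIWrapper("labkey.example.com", "MyProject", use_ssl=True)

# Query across the current folder and all of its subfolders. Security still
# applies: only folders the user is authorized to read contribute rows.
result = api.query.select_rows(
    schema_name="study",
    query_name="Demographics",
    container_filter="CurrentAndSubfolders",
)
print(len(result["rows"]), "rows across folders")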

Folder Filters with the JDBC Driver

The LabKey JDBC Driver also supports the containerFilter parameter for scoping queries by folder. Learn more in this topic:

Container Filter in LabKey SQL

Within a LabKey SQL FROM clause, you can include a containerFilter to control scope of an individual table within the query. Learn more in this topic:

Related Topics




Reports and Charts


The right visualization can communicate information about your data efficiently. You can create different types of reports and charts to view, analyze, and display data using a range of tools.

Built-in Chart and Report Types

Basic visualization types available to non-administrative users from the (Charts) > Create Chart menu and directly from data grid column headers are described in this topic:

Additional visualization and report types:

Display Visualizations

Reports and visualizations can be displayed and managed as part of a folder, project or study.

Manage Visualizations

External Visualization Tools (Premium Integrations)

LabKey Server can serve as the data store for external reporting and visualization tools such as RStudio, Tableau, Access, Excel, etc. See the following topics for details.

Related Topics




Jupyter Reports


Premium Feature — Available in the Professional and Enterprise Editions of LabKey Server. Learn more or contact LabKey.

This topic describes how to import reports from Jupyter Notebooks and use them with live data in LabKey Server. A Docker image is configured that essentially applies the command 'jupyter nbconvert' to generate output files from a Jupyter notebook file. When configured correctly, a Jupyter Report scripting engine can be added and authorized users will see a Jupyter Reports option on the menu of data grids.

Once integrated, users can access the Jupyter report builder to import .ipynb files exported from Jupyter Notebooks. These reports can then be displayed and shared in LabKey.

Set Up Docker Image

Install Docker Desktop for your operating system. For a development system you can choose "Expose Docker Daemon on tcp://localhost:2375 without TLS". Learn about installing Docker in the first sections of this topic:

Next, you'll clone the docker-rstudio project and build the python/nbconvert image. Note that this is not a module and should not be cloned into your LabKey enlistment. In Docker Desktop, confirm that you can see labkey/nbconvert in the list of available Images.

Configure Docker Host on LabKey Server

First, confirm that your LabKey Server includes the docker module. Select (Admin) > Site > Admin Console and check the Module Information tab. Contact your Account Manager if you don't see Docker on the list.

Next, switch to the Settings tab of the Admin Console. Under Premium Features, click Docker Host Settings. Enable the Docker Host and provide the port and other settings as described in this topic:

Click Test Saved Host to confirm the configuration.

Add Docker Report Engine

Still within the Admin Console, under Configuration, click Views and Scripting. Select Add > Docker Report Engine. In the popup, you'll see the default options and can adjust them if needed.

  • Language: Python
  • File extension: ipynb (don't include the . before the extension)
  • Docker Image Name: labkey/nbconvert
  • Enabled: checked
Click Submit to save.

You can now see the Jupyter Reports menu option in the Data Views web part and on the grid menu where you created it.

Obtain Report .ipynb from Jupyter Notebooks

Within Jupyter Notebooks, you can now open and author your report. You can use the LabKey Python API to access data and craft the report you want. One way to get started is to export the desired data in Python script form that you can use to obtain that data via the API.

Use this script as the basis for your Jupyter Notebook report.

When ready, you'll "export" the report by simply saving it as an .ipynb file. To save to a location where you can easily find it for importing to LabKey, choose 'Save As' and specify the desired location.

Create Jupyter Report on LabKey Server

From any data grid, or from the Data Views web part, you can add a new Jupyter Report. You'll see a popup asking if you want to:

Import From File

Click Import From File in the popup and browse to select the .ipynb file to open. You'll see your report text on the Source tab of the Jupyter Report Builder.

Click Save to save this report on LabKey. Enter a report name and click OK. Now your report will be accessible from the menu.

Start with Blank Report

Click Start with Blank Report in the popup to open the report builder with boilerplate starter content.

Jupyter Report Builder

The Report Builder offers tabs for:

  • Report: See the report.
  • Data: View the data on which the report is based.
  • Source: Script source, including Save and Cancel for the report. Options are also available here:
    • Make this report available to all users
    • Show source tab to all users
    • Make this report available in child folders
  • Help: Details about the report configuration options available.

Help Tab - Report Config Properties

When a Jupyter Report is executed, a config file is generated and populated with properties that may be useful to report authors in their script code. The file is written in JSON to report_config.json. A helper utility, ReportConfig.py, is included in the nbconvert image. The class contains functions that will parse the generated file and return configured properties to your script. An example of the code you could use in your script:

from ReportConfig import get_report_api_wrapper, get_report_data, get_report_parameters

# Print the data and parameter properties parsed from report_config.json.
print(get_report_data())
print(get_report_parameters())

This is an example of a configuration file and the properties that are included.

{
  "baseUrl": "http://localhost:8080",
  "contextPath": "/labkey",
  "scriptName": "myReport.ipynb",
  "containerPath": "/my studies/demo",
  "parameters": [
    ["pageId", "study.DATA_ANALYSIS"],
    ["reportType", "ReportService.ipynbReport"],
    ["redirectUrl", "/labkey/my%20studies/demo/project-begin.view?pageId=study.DATA_ANALYSIS"],
    ["reportId", "DB:155"]
  ],
  "version": 1
}
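
If you prefer not to use the helper class, the same properties can be read directly with the Python standard library; a minimal sketch:

import json

# Load the generated config; "parameters" is a list of [name, value] pairs,
# as shown in the example above.
with open("report_config.json") as f:
    config = json.load(f)

print(config["containerPath"])
for name, value in config["parameters"]:
    print(name, "=", value)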

Export Jupyter Report from LabKey Server

The Jupyter report in LabKey can also be exported as an .ipynb file for use elsewhere. Open the report, choose the Source tab, and under Jupyter Report Options, click Export.

Related Topics




Report Web Part: Display a Report or Chart


Displaying a report or chart alongside other content helps you highlight visualizations of important results. There are a number of ways to do this, including:

Display a Single Report

To display a report on a page:

  • Enter > Page Admin Mode.
  • Click Add Web Part in the lower left, select Report, and click Add.
  • On the Customize Report page, enter the following parameters:
    • Web Part Title: This is the title that will be displayed in the web part.
    • Report or Chart: Select the report or chart to display.
    • Show Tabs: Some reports may be rendered with multiple tabs showing.
    • Visible Report Sections: Some reports contain multiple sections, such as: images, text, console output. If a list is offered, you can select which section(s) to display by selecting them. If you are displaying an R Report, the sections are identified by the section names from the source script.
  • Click Submit.

In this example, the new web part will look like this:

Change Report Web Part Settings

You can reopen the Customize Report page later to change the name or how it appears.

  • Select Customize from the (triangle) menu.
  • Click Submit.

Options available (not applicable to all reports):

  • Show Tabs: Some reports may be rendered with multiple tabs showing. Select this option to only show the primary view.
  • Visible Report Sections: Some reports contain multiple sections such as: images, text, console output. For these you can select which section(s) to display by selecting them from the list.

Related Topics




Data Views Browser


The Data Views web part displays a catalog of available reports and custom named grid views. This provides a convenient dashboard for selecting among the available ways to view data in a given folder or project. In a study, the Data Views web part also includes datasets. It is shown on the Clinical and Assay Data tab by default, but can be added to other tabs or pages as needed.

By default, the Data Views web part lists all the custom grid views, reports, etc. you have permission to read. If you would like to view only the subset of items you created yourself, click the Mine checkbox in the upper right.

Add and Manage Content

Users with sufficient permission can add new content and make some changes using the menu in the corner of the web part. Administrators have additional options, detailed below.

  • Add Report: Click for a submenu of report types to add.
  • Add Chart: Use the plot editor to add new visualizations.
  • Manage Views: Manage reports and custom grids, including the option to delete multiple items at once. Your permissions will determine your options on this page.
  • Manage Notifications: Subscribe to receive notifications of report and dataset changes.

Toggle Edit Mode

Users with Editor (or higher) permissions can edit metadata about items in the data browser.

  • Click the pencil icon in the web part border to toggle edit mode. Individual pencil icons show which items have metadata you can edit here.
  • When active, click the pencil icon for the desired item.
  • Edit Properties, such as status, author, visibility to others, etc.
  • If you want to move the item to a different section of the web part, select a different Category.
  • If there are associated thumbnails and mini-icons, you can customize them from the Images tab. See Manage Thumbnail Images for more information.
  • Click Save.

Notice that there are three dates associated with reports: the creation date, the date the report itself was last modified, and the date the content of the report was last modified.

Data Views Web Part Options

An administrator adds the Data Views web part to a page:

  • Enter > Page Admin Mode.
  • Select Data Views from the <Select Web Part> pulldown in the lower left.
  • Click Add.

The menu in the upper corner of the web part gives admins a few additional options:

  • Manage Datasets: Create and manage study datasets.
  • Manage Queries: Open the query schema browser.
  • Customize: Customize this web part.
  • Permissions: Control what permissions a user must have to see this web part.
  • Move Up/Down: Change the sequence of web parts on the page. (Only available when an admin is in page admin mode).
  • Remove From Page: No longer show this web part - note that the underlying data is not affected by removing the web part. (Only available when an admin is in page admin mode).
  • Hide Frame: Remove the header and border of the web part. (Only available when an admin is in page admin mode).

Customize the Data Views Browser

Administrators can also customize display parameters within the web part.

Select Customize from the triangle pulldown menu to open the customize panel:

You can change the following:

  • Name: the heading of the web part (the default is "Data Views").
  • Display Height: adjust the size of the web part. Options:
    • Default (dynamic): by default, the data views browser is dynamically sized to fit the number of items displayed, up to a maximum of 700px.
    • Custom: enter the desired web part height. Must be between 200 and 3000 pixels.
  • Sort: select an option:
    • By Display Order: (Default) The order items are returned from the database.
    • Alphabetical: Alphabetize items within categories; categories are explicitly ordered.
  • View Types: Check or uncheck boxes to control which items will be displayed. Details are below.
  • Visible Columns: Check and uncheck boxes to control which columns appear in the web part.
  • Manage Categories: Click to define and use categories and subcategories for grouping.
  • To close the Customize box, select Save or Cancel.

Show/Hide View Types

In the Customize panel, you can check or uncheck the View Types to control what is displayed.

  • Datasets: All study datasets, unless the Visibility property is set to Hidden.
  • Queries: Customized named grid views on any dataset.
    • Note that SQL queries defined in the UI or included in modules are not included when this option is checked.
    • A named grid view created on a Hidden dataset will still show up in the data views browser when "queries" are shown.
    • Named grid views are categorized with the dataset they were created from.
    • You can also create a custom XML-based query view in a file-based module using a .qview.xml file. To show it in the Data Views web part, use the showInDataViews property and enable the module in the study where you want to use it.
  • Queries (inherited): This category will show grid views defined in a parent container with the inherited box checked.
  • Reports: All reports in the container.
    • To show a query in the data views web part, you can create a Query Report for it.
    • It is good practice to name such a report the same as the query and include details in the description field for later retrieval.

Related Topics




Query Snapshots


A query snapshot captures a data query at a moment in time. The data in the snapshot remains fixed even if the original source data is updated, until/unless it is refreshed manually or automatically. You can refresh the resulting query snapshot manually or set up a refresh schedule. If you choose automatic refresh, the system will listen for changes to the original data, and will update the snapshot within the interval of time you select.

Snapshotting data in this fashion is only available for:

  • Study datasets
  • Linked datasets from assays and sample types
  • User-defined SQL queries
Queries exposed by linked schemas are not available for snapshotting.

Create a Query Snapshot

  • Go to the query, grid, or dataset you wish to snapshot.
  • Select (Charts/Reports) > Create Query Snapshot.
  • Name the snapshot (the default name appends the word "Snapshot" to the name of the grid you are viewing).
  • Specify Manual or Automatic Refresh. For automatic refresh, select the frequency from the dropdown. When data changes, the snapshot will be updated within the selected interval of time. Options:
    • 30 seconds
    • 1 minute
    • 5 minutes
    • 10 minutes
    • 30 minutes
    • 1 hour
    • 2 hours
  • If you want to edit the properties or fields in the snapshot, click Edit Dataset Definition to use the Dataset Designer before creating your snapshot.
    • Be sure to save (or cancel) your changes in the dataset designer to return to the snapshot creation UI.
  • Click Create Snapshot.

View a Query Snapshot

Once a query snapshot has been created it is available in the data browser and at (Admin) > Manage Study > Manage Datasets.

Edit a Query Snapshot

The fields for a query snapshot, as well as the refresh policy and frequency, can be edited starting from the grid view.

  • Select (Grid Views) > Edit Snapshot.
  • You can see the name and query source here, but cannot edit them.
  • If you want to edit the fields in the snapshot (such as to change a column label), click Edit Dataset Definition to use the Dataset Designer. You cannot change the snapshot name itself. Be sure to save (or cancel) your changes in the dataset designer to return to the snapshot editor.
  • Change the Snapshot Refresh settings as needed.
  • Click Update Snapshot to manually refresh the data now.
  • Click Save to save changes to refresh settings.
  • Click Done to exit the edit interface.

The Edit Snapshot interface also provides more options:

  • Delete Snapshot
  • Show History: See the history of this snapshot.
  • Edit Dataset Definition: Edit fields and their properties for this snapshot. Note that you must either save or cancel your changes in the dataset designer in order to return to the snapshot editor.

Related Topics




Attachment Reports


Attachment reports enable you to upload and attach stand-alone documents, such as PDF, Word, or Excel files. This gives you a flexible way to interconnect your information.

You can create a report or visualization using a statistical or reporting tool outside of LabKey, then upload the report directly from your local machine, or point to a file elsewhere on the server.

Add an Attachment Report

To upload an attachment report, follow these steps:

  • Create the desired report and save it to your local machine.
  • Open the (triangle) menu in the Data Views web part.
  • Select Add Report > Attachment Report.
  • Provide the name, date, etc. for the report.
  • Upload the report from your local machine (or point to a document already on the server).

Once the file is uploaded it will be shown in the data browser. If you specify that it is to be shared, other users can view and download it.

If the report was saved in the external application with an embedded JPEG thumbnail, LabKey Server can in some cases extract that and use it as a preview in the user interface. See Manage Thumbnail Images for more information.

Related Topics




Link Reports


Add links to external resources using a Link Report. This flexible option allows you to easily connect your data to other resources.

Create a Link Report

  • Open the (triangle) menu in the Data Views web part.
  • Select Add Report > Link Report.
  • Complete the form. Link to an external or internal resource. For example, link to an external website or to a page within the same LabKey Server.

Related Topics




Participant Reports


With a LabKey study, creating a participant report lets you show data for one or more individual participants for selected measures. Measures from different datasets in the study can be combined in a single report. A filter panel lets you dynamically change which participants appear in the report.

Create a Participant Report

  • In a study, select (Admin) > Manage Views.
  • Choose Add Report > Participant Report.
  • Click Choose Measures, select one or more measures, then click Select.
  • Enter a Report Name and optional Report Description.
  • Select whether this report will be Viewable By all readers or only yourself.
  • When you first create a report, you will be in "edit mode" and can change your set of chosen measures. Below the selection panel, you will see partial results.
    • Toggle the edit panel by clicking the (pencil) icon at the top of the report to see more results; you may reopen it at any time to further edit or save the report.
  • Click the Filter Report (chevron) button to open the filter panel to refine which participants appear in the report.
  • Select desired filters using the radio buttons and checkboxes. You may hide the filter panel with the collapse button, or if you click the X to close it entirely, a Filter Report link will appear on the report menu bar.
  • Click the Transpose button to flip the columns and rows in the generated tables, so that columns are displayed as rows and vice versa, as shown below.
  • When you have created the report you want, click Save.
  • Your new report will appear in the Data Views.

Export to Excel File

  • Select Export > To Excel.

Add the Participant Report as a Web Part

  • Enter > Page Admin Mode.
  • Select Report from the <Select Web Part> pulldown and click Add.
  • Name the web part, and select the participant report you created above.
  • Click Submit.
  • Click Exit Admin Mode.

Related Topics




Query Reports


Query reports let you package a query or grid view as a report. This can enable sharing the report with a different audience or presenting the query as of a specific data cut date. A user needs the "Author" role or higher to create a query report. The report also requires that the target query already exists.

Create a Query Report

  • Select (Admin) > Manage Views.
  • Select Add Report > Query Report.
Complete the form, providing:
  • Name (Required): The report name.
    • Consider naming the report for the query (or grid view) it is "reporting" to make it clearer to the user.
  • Author: Select from all project users listed in the dropdown.
  • Status: Choose one of "None, Draft, Final, Locked, Unlocked".
  • Data Cut Date: Specify a date if desired.
  • Category: If you are creating a report in a study, you can select an existing category.
  • Description: Optional text description.
    • Consider including details like the origin of the report (schema/query/view) so that you can easily search for them later.
  • Shared: Check the box if you want to share this report with other users.
  • Schema (Required): The schema containing the desired query. This choice will populate the Query dropdown.
  • Query (Required): Select a query from those in the selected schema. This will populate the View dropdown.
  • View: If there are multiple views defined on the selected query, you'll be able to choose one here.
  • Click Save when finished.

You can customize the thumbnail and mini-icon displayed with your Query Report. Learn more here.

Use and Share Query Reports

Your report is now available for display in a web part or wiki, or sharing with others. Query Reports can be assigned report-specific permissions in a study following the instructions in this topic.

In a study, your report will appear in the Data Views web part under the selected category (or as "Uncategorized"). If you want to hide a Query Report from this web part, you can edit the view metadata to set visibility to "Hidden".

Manage Query Reports

Metadata including the name, visibility, data cut date, category and description of a Query Report can all be edited through the manage views interface.

To see the details of the schema, query, and view used to create the report, you can view the Report Debug Information page. Navigate to the report itself; you will see a URL like the following, where ### is the id number for your report:

/reports-renderQueryReport.view?reportId=db%3A###

Edit the URL to replace renderQueryReport.view with reportInfo.view, like so:

/reports-reportInfo.view?reportId=db%3A###

Related Topics




Manage Data Views


Reports, charts, datasets, and customized data grids are all ways to view data in a folder and can be displayed in a Data Views web part. Within a study, this panel is displayed on the Clinical and Assay Data tab by default, and can be customized or displayed in other places as needed. This topic describes how to manage these "data views".

Manage Views

Select (Admin) > Manage Views in any folder. From the Data Views web part, you can also select it from the (triangle) pulldown menu.

The Manage Views page displays all the views, queries, and reports available within a folder. This page allows editing of metadata as well as deletion of multiple items in one action.

  • A row of links are provided for adding, managing, and deleting views and attributes like categories and notifications.
  • Filter by typing part of the name, category, type, author, etc. in the box above the grid.
  • By default you will see all queries and reports you can edit. If you want to view only items you created yourself, click the Mine checkbox in the upper right.
  • Hover over the name of an item on the list to see a few details, including the type, creator, status, and an optional thumbnail.
  • Click on the name to open the item.
  • Click a Details link to see more metadata details.
  • Notice the icons to the right of charts, reports, and named grids. Click to edit the metadata for the item.
  • When managing views within a study, you can click an active link in the Access column to customize permissions for the given visualization. "Public" in this column refers to the item being readable by users with at least Read access to the container, not to the public at large unless that is separately configured. For details see Configure Permissions for Reports & Views.

View Details

Hover over a row to view the source and type of a visualization, with a customizable thumbnail image.

Clicking the Details icon for a report or chart opens the Report Details page with the full list of current metadata. The details icon for a query or named view will open the view itself.

Modification Dates

There are two modification dates associated with each report, allowing you to differentiate between report property and content changes:

  • Modified: the date the report was last modified.
    • Name, description, author, category, thumbnail image, etc.
  • Content Modified: the date the content of the report was modified.
    • Underlying script, attachment, link, chart settings, etc.
The details of what constitutes content modification are report-specific:
  • Attachment Report:
    • Report type (local vs. server) changed
    • Server file path updated
    • New file attached
  • Box Plot, Scatter Plot, Time Chart:
    • Report configuration change (measure selection, grouping, display, etc.)
  • Link Report:
    • URL changed
  • Script Reports including JavaScript and R Reports:
    • Change to the report code (JavaScript, R, etc.)
  • Flow Reports (Positivity and QC):
    • Change to any of the filter values
The following report types do not change the ContentModified date after creation: Crosstab View, DataReport, External Report, Query Report, Chart Reports, Chart View.

Edit View Metadata

Click the pencil icon next to any row to edit metadata to provide additional information about when, how, and why the view or report was created. You can also customize how the item is displayed in the data views panel.

View Properties

Click the pencil icon to open a popup window for editing visualization metadata. On the Properties tab:

  • Modify the Name and Description fields.
  • Select Author, Status, and Category from pulldown lists of valid values. For more about categories, see Manage Categories.
  • Choose a Data Cut Date from the calendar.
  • Check whether to share this report with all users with access to the folder.
  • Click Save.

The Images tab is where you modify thumbnails and mini-icons used for the report.

You could also delete this visualization from the Properties tab by clicking Delete. This action is confirmed before the view is actually deleted.

View Thumbnails and Mini-icons

When a visualization is created, a default thumbnail is auto-generated and a mini-icon based on the report type is associated with it. You can see and update these on the Images tab. Learn more about using and customizing these images in this topic:

Reorder Reports and Charts

To rearrange the display order of reports and charts, click Reorder Reports and Charts. Users without administrator permissions will not see this button or be able to access this feature.

Click the heading "Reports and Charts" to toggle sorting alphabetically, ascending or descending. You can also drag and drop to arrange in any order.

When the organization is correct, click Done.

File-based reports can be moved within the dialog box, but the ordering will not actually change until you make changes to their XML.

Delete Views and Reports

Select any row by clicking an area that is not a link. You can use Shift and Ctrl to multi-select several rows at once. Then click Delete Selected. You will be prompted to confirm the list of the items that will be deleted.

Manage Study Notifications

Users can subscribe to a daily digest of changes to reports and datasets in a study. Learn more in this topic: Manage Study Notifications

Related Topics




Manage Study Notifications


If you want to receive email notifications when the reports and/or datasets in a study are updated, you can subscribe to a daily digest of changes. These notifications are similar to email notifications for messages and file changes at the folder level, but allow finer control of which changes trigger notification.

Manage Study Notifications

  • Select Manage Notifications from the (triangle) menu in the Data Views web part.
    • You can also select (Admin) > Manage Views, then click Manage Notifications.
  • Select the desired option:
    • None: (Default) Receive no notifications.
    • All changes: Your daily digest will list changes and additions to all reports and datasets in this study.
    • By category: Your daily digest will list changes and additions to reports and datasets in the subscribed categories.
    • By dataset: Your daily digest will list changes and additions to subscribed datasets. Note that reports for those datasets are not included in this option.
  • Click Save.

You will receive your first daily digest of notifications at the next system maintenance interval, typically overnight. By default, the notification includes the list of updated reports and/or datasets including links to each one.

Subscribe by Category

You can subscribe to notifications of changes to both reports and datasets by grouping them into categories of interest. If you want to allow subscription to notifications for a single report, create a singleton subcategory for it.

Select the By Category option, then click checkboxes under Subscribe for the categories and subcategories you want to receive notifications about.

Subscribe by Dataset

To subscribe only to notifications about specific datasets, select the By dataset option and use checkboxes to subscribe to the datasets of interest.

Note that reports based on these datasets are not included when you subscribe by dataset.

Notification Triggers

The following table describes which data view types and which changes trigger notifications:

 | Data Insert/Update/Delete | Design Change | Sharing Status Change | Category Change | Display Order Change
Datasets (including linked Datasets) | Yes | Yes | Yes | Yes | Yes
Query Snapshot | Yes (notification occurs when the snapshot is refreshed) | Yes | Yes | Yes | Yes
Report (R, JavaScript, etc.) | No | Yes | Yes | Yes | Yes
Chart (Bar, Box Plot, etc.) | No | Yes | Yes | Yes | Yes

Datasets:

  • Changes to both the design and the underlying data will trigger notifications.

Reports:

  • Reports must be both visible and shared to trigger notifications.
  • Only changes to the definition or design of the report, such as changes to the R or JS code, will trigger notifications.

Notifications are generated for all items when their sharing status, category, or display order is changed.

Customize Email Notification

By default, the notification includes the list of updated reports and/or datasets including links to each one. The email notification does not describe the nature of the change, only that some change has occurred.

The template for these notifications may be customized at the site-level, as described in Email Template Customization.

Related Topics




Manage Categories


In the Data Views web part, reports, visualizations, queries, and datasets can be sorted by categories and subcategories that an administrator defines. Users may also subscribe to notifications by category.

Categories can be defined from the Manage Categories page, during dataset creation or editing, or during the process of linking data into a study from an assay or sample type.

Define Categories

Manage Categories Interface

  • In your study, select (Admin) > Manage Views.
  • Click Manage Categories to open the categories pop-up.

Click New Category to add a category; click the X to delete one, and drag and drop to reorganize sections of reports and grid views.

To see subcategories, select a category in the popup. Click New Subcategory to add new ones. Drag and drop to reorder. Click Done in the category popup when finished.

Dataset Designer Interface

During creation or edit of a dataset definition, you can add new categories by typing the new category into the Category box. Click Create option "[ENTERED_NAME]" to create the category and assign the current dataset to it.

Assign Items to Categories

To assign datasets, reports, charts, etc. to categories and subcategories, click the (pencil) icon on the manage views page, or in "edit" mode of the data browser. The Category menu will list all available categories and subcategories. Make your choice and click Save.

To assign datasets to categories, use the Category field on the dataset properties page. Existing categories will be shown on the dropdown.

Hide a Category

Categories are only shown in the data browser or dataset properties designer when there are items assigned to them. Assigning all items to a different category or marking them unassigned will hide the category. You can also mark each item assigned inside of it as "hidden". See Data Views Browser.

Categorize Linked Assay and Sample Data

When you link assay or sample data into a study, a corresponding Dataset is created (or appended to if it already exists).

During the manual link to study process, or when you add auto-linking to the definition of the assay or sample type, you can Specify Linked Dataset Category.

  • If the study dataset you are linking to already exists and already has a category assignment, this setting will not override the previous categorization. You can manually change the dataset's category directly.
  • If you are creating a new linked dataset, it will be added to the category you specify.
  • If the category does not exist, it will be created.
  • Leave blank to use the default "Uncategorized" category. You can manually assign or change a category later.
Learn more about linking assays and samples to studies in these topics:

Related Topics




Manage Thumbnail Images


When a visualization is created, a default thumbnail is automatically generated and a mini-icon based on the report or chart type is associated with it. These are displayed in the data views web part. You can customize both to give your users a better visual indication of what the given report or chart contains. For example, rather than have all of your R reports show the default R logo, you could provide different mini-icons for different types of content that will be more meaningful to your users.

Attachment Reports offer the additional option to extract the thumbnail image directly from some types of documents, instead of using an auto-generated default.

View and Customize Thumbnails and Mini-icons

To view and customize images:
  • Enter Edit Mode by clicking the pencil icon in the data views browser or on the manage views page.
  • Click the pencil icon for any visualization to open the window for editing metadata.
  • Click the Images tab. The current thumbnail and mini-icon are displayed, along with the option to upload different ones from your local machine.
    • A thumbnail image will be scaled to 250 pixels high.
    • A mini-icon will be scaled to 18x18 pixels.
  • The trash can button will delete the default generated thumbnail image, replacing it with a generic image.
  • If you have customized the thumbnail, the trash can button deletes it and restores the default generated thumbnail image.
  • Click Save to save any changes you make.

You may need to refresh your browser after updating thumbnails and icons. If you later change and resave the visualization, or export and reimport it with a folder or study, the custom thumbnails and mini-icons will remain associated with it unless you explicitly change them again.

Extract Thumbnails from Documents

An Attachment Report is created by uploading an external document. Some documents can have embedded thumbnails included, and LabKey Server can in some cases extract those thumbnails to associate with the attachment report.

The external application, such as Word, Excel, or PowerPoint, must have the "Save Thumbnail" option set to save the thumbnail of the first page as an extractable jpeg image. When the Office Open XML format file (.docx, .pptx, .xlsx) for an attachment report contains such an image, LabKey Server will extract it from the uploaded file and use it as the thumbnail.

Thumbnails in documents saved in the older binary formats (.doc, .ppt, .xls), and thumbnails stored in other image formats such as EMF or WMF, will not be extracted; instead the attachment report will use the default auto-generated thumbnail image.

Related Topics




Measure and Dimension Columns


Measures are values that functions work on, generally numeric values such as instrument readings. Dimensions are qualitative properties and categories, such as cohort, that can be used in filtering or grouping those measures. LabKey Server can be configured such that only specific columns marked as data "measures" or "dimensions" are offered for charting. When this option is disabled, all numeric columns are available for charting and any column can be included in filtering.

This topic explains how to mark columns as measures and dimensions, as well as how to enable and disable the restriction.

Definitions:

  • Dimension: a column of non-numerical categories that can be included in a chart, such as for grouping into box plots or bar charts. Cohort and country are examples of dimensions.
  • Measure: A column with numerical data. Instrument readings, viral loads, and weight are all examples of measures.
Note: Text columns that include numeric values can also be marked as measures. For instance, a text column that includes a mix of integers and some entries of "<1" to represent values below the lower limit of quantitation (LLOQ) could be plotted ignoring the non-numeric entries. The server will make a best effort to convert the data to numeric values and display a message about the number of values that cannot be converted.

If your server restricts charting to only measures and dimensions, you have two options: (1) either mark the desired column as a measure/dimension or (2) turn off the restriction.

Mark a Column as a Measure or Dimension

Use Advanced Settings in the Field Editor to mark columns. The method for opening the field editor varies based on the data structure type, as detailed in the topic: Field Editor.

  • Open the field editor.
  • Open the field you want to mark using the expansion icon on the right.
  • Click Advanced Settings.
  • Check the box or boxes desired:
    • "Make this field available as a measure"
    • "Make this field available as a dimension"
  • Click Apply, then Finish to save the changes.
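
If you need to set these flags on many fields, scripting the change may be faster than the UI. Below is a minimal sketch using the JavaScript Domain API; it is not taken from this documentation, so verify the call signatures against your server's API reference. The schema and query names ('study'/'LabResults'), the field name ('Lymphs'), and the assumption that the flags are exposed as boolean measure/dimension properties on each field are all illustrative.

    // Minimal sketch: flag one dataset field as a measure via the JavaScript API.
    LABKEY.Domain.get(
        function (domain) {
            // domain.fields holds the field definitions for this dataset.
            domain.fields.forEach(function (field) {
                if (field.name === 'Lymphs') {      // hypothetical field name
                    field.measure = true;           // assumed flag name; verify in the API docs
                }
            });
            LABKEY.Domain.save(
                function () { console.log('Field flags updated.'); },
                function (error) { console.error(error); },
                domain, 'study', 'LabResults'       // hypothetical schema and dataset
            );
        },
        function (error) { console.error(error); },
        'study', 'LabResults'
    );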

Turn On/Off Restricting Charting to Measures and Dimensions

Note that you must have administrator permissions to change these settings. You can control this option at the site or project level.

  • To locate this control at the site level:
    • Select (Admin) > Site > Admin Console.
    • Under Configuration, click Look and Feel Settings.
  • To locate this control at the project level:
    • Select (Admin) > Folder > Project Settings.
  • Confirm that you are viewing the Properties tab (it opens by default).
  • Scroll down to Restrict charting columns by measure and dimension flags.
  • Check or uncheck the box as desired.
  • Click Save.

Related Topics




Legacy Reports


These are legacy reports that are no longer being actively developed.



Advanced Reports / External Reports


These are legacy reports that are no longer being actively developed.

Advanced Reports (aka External Reports)

This feature is available to administrators only.

An "Advanced Report" lets you launch a command line program to process a dataset. Advanced reports maximize extensibility; anything you can do from the command line you can do via an advanced report.

You use substitution strings (for the data file and the output file) to pass instructions to the command line. These substitution strings describe where to get data and where to put data.

Access the External Report Builder

  • First, navigate to the data grid of interest.
  • Select (Reports) > Create Advanced Report.
  • You will now see the External Report Builder page.
    • Select the Dataset/Query from the pulldown.
    • Define the Program and Arguments using substitution strings as needed.
    • Select the Output File Type (txt, tsv, jpg, gif, png).
    • Click Submit.
    • Enter a name and select the grid from which you want to access this custom report.
    • Click Save.
The code entered will be invoked as the operating system user running the LabKey Server installation. The current directory will be determined by LabKey Server.

Use Substitution Strings

The External Report Builder lets you invoke any command line to generate the report. You can use the following substitution strings in your command line to identify the data file that contains the source dataset and the report file that will be generated.

  • ${DATA_FILE}: This is the file where the data will be provided in tab-delimited format. LabKey Server will generate this file name.
  • ${REPORT_FILE}: If your process returns data in a file, it should use the file name substituted here. For text and tab-delimited data, your process may return data via stdout instead of via a file. You must specify a file extension for your report file even if the result is returned via stdout. This allows LabKey to format the result properly.
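
For example, a report that returns the source data unchanged could simply copy the input file to the report file. This is a minimal sketch for a Linux server; the program path is illustrative:

  • In the Program field, type:
/bin/cp
  • In the Arguments field, type:
${DATA_FILE} ${REPORT_FILE}
  • Select tsv as the Output File Type.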

Example

This simple example outputs the content of your dataset to standard output (using the cmd shell in Windows).

  • Open the data grid you want to use.
  • Select (Reports) > Create Advanced Report.
  • Select the Dataset/Query from the dropdown (in this example, we use the Physical Exam dataset).
  • In the Program field, type:
C:\Windows\System32\cmd.exe
  • In the Arguments field, type:
/C TYPE ${DATA_FILE}
  • Select an Output File Type (in this example, .txt)
  • Click Submit. Since we did not name a ${REPORT_FILE} in the arguments, the contents of the dataset will be printed to stdout and appear in this window.
  • Scroll all the way down, enter a name for the new custom report (TypeContents in this example).
  • Select the dataset where you would like to store this report (Physical Exam in this example).
  • Click Save.

You can reopen this report from the data browser. In this example, the generated report shows the tab-delimited contents of the Physical Exam dataset.




Legacy Chart Views


The legacy chart views shown in this topic are no longer supported. If you have older chart views of these types saved on LabKey Server, they will be converted to JavaScript charts beginning with release 19.1.x. No action is needed to migrate your older charts to the new chart types.

Note that if you were using the LABKEY.Chart API to create such charts on the fly, this API has been removed with release 19.1.x. Instead, use the LABKEY.vis API to create these types of visualization.
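
For example, a scatter plot formerly rendered with LABKEY.Chart could be recreated with LABKEY.vis along these lines. This is a minimal sketch: the target div id, the column names, and the 'rows' variable (an array of row objects, e.g. previously retrieved with LABKEY.Query.selectRows) are all illustrative.

    // Minimal LABKEY.vis sketch: render a scatter plot into a page element.
    var plot = new LABKEY.vis.Plot({
        renderTo: 'chartDiv',                 // id of a <div> on the page (illustrative)
        width: 800,
        height: 400,
        data: rows,                           // row objects previously retrieved from a query
        aes: { x: 'Date', y: 'Lymphs' },      // illustrative column names
        layers: [ new LABKEY.vis.Layer({ geom: new LABKEY.vis.Geom.Point({}) }) ]
    });
    plot.render();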

For current features supporting the creation of similar visualizations, see:

Legacy Chart View Examples

XY Scatter Plots

Time Series Plots

Participant Charts

Ordinary charts display all selected measurements for all participants on a single plot. Participant charts display participant data on a series of separate charts.

Related Topics




Deprecated: Crosstab Reports


This is a legacy report type that is no longer being actively developed.

A Crosstab Report displays a roll-up of two-dimensional data.

To create a Crosstab Report:

  • Select (Reports) > Create Crosstab Report.
  • Pick a source dataset and whether to include a particular visit or all visits.
  • Then specify the row and column of the source dataset to use, the field for which you would like to see statistics, and the statistics to compute for each row displayed.
Once a Crosstab Report is created, it can be saved and associated with a specific dataset by selecting the dataset name from the dropdown list at the bottom of the page. Once saved, the report will be available in the Reports dropdown list above the data grid.

Related Topics




Visualizations


Visualizations

You can visualize, analyze, and display data using a range of plotting and reporting tools. The right kind of image can illuminate scientific insights in your results far more easily than words and numbers alone. The topics in this section describe how to create different kinds of charts using the common plot editor.

Plot Editor

When viewing a data grid, select (Charts) > Create Chart to open the plot editor and create a new visualization.

Column Visualizations

To generate a quick visualization on a given column in a dataset, select an option from the column header:

Open Saved Visualizations

Once created and saved, visualizations will be re-generated by re-running their associated scripts on live data. You can access saved visualizations either through the (Reports or Charts) pulldown menu on the associated data grid, or directly by clicking on the name in the Data Views web part.

Related Topics




Bar Charts


A bar chart is a visualization comparing numeric measurements across categories. The relative height of each bar indicates how the measured value compares across groups, such as cohorts in a study.

Create a Bar Chart

  • Navigate to the data grid you want to visualize.
  • Select (Charts) > Create Chart to open the editor. Click Bar (it is selected by default).
  • The columns eligible for charting from your current grid view are listed.
  • Select the column of data to use for separating the data into bars and drag it to the X Axis Categories box.
  • Only the X Axis Categories field is required to create a basic bar chart. By default, the height of the bar shows the count of rows matching each value in the chosen category; in this case, the number of participants from each country.
  • To use a different metric for bar height, select another column (here "Lymphs") and drag it to the box for the Y Axis column. Notice that you can select the aggregate method to use. By default, SUM is selected and the label reads "Sum of Lymphs". Here we change to "Mean"; the Y Axis label will update automatically.
  • Click Apply.

Bar Chart Customizations

  • To remove the large number of "blank" values from your chart:
    • Click View Data.
    • Click the Country column header, then select Filter.
    • Click the checkbox for "Blank" to deselect it.
    • Click OK in the popup.
    • Click View Chart to return to the chart which is now missing the 'blank' bar.

To make a more complex grouped bar chart, we'll add data from another column.

  • Click Chart Type to reopen the creation dialog.
  • Drag a column to the Split Categories By selection box, here "Gender".
  • Click Apply to see grouped bars. The "Split" category is now shown along the X axis with a colored bar for each value in the "X Axis Categories" selection chosen earlier.
  • Further customize your visualization using the Chart Type and Chart Layout links in the upper right.
  • Chart Type reopens the creation dialog allowing you to:
    • Change the "X Axis Categories" column (hover and click the X to delete the current election).
    • Remove or change the Y Axis metric, the "Split Categories By" column, or the aggregation method.
    • You can also drag and drop columns between selection boxes to change how each is used.
    • Note that you can also click another chart type on the left to switch how you visualize the data with the same axes when practical.
    • Click Apply to update the chart with the selected changes.

Change Layout

  • Chart Layout offers the ability to change the look and feel of your chart.

There are 3 tabs:

    • General:
      • Provide a Title to show above your chart. By default, the dataset name is used; at any time you can return to this default by clicking the (refresh) icon for the field.
      • Provide a Subtitle to print under the chart title.
      • Specify the width and height.
      • You can also customize the opacity, line width, and line color for the bars.
      • Select one of three palettes for bar fill colors: Light, Dark, or Alternate. The array of colors is shown.
    • Margins (px): If the default chart margins cause axis labels to overlap, or you want to adjust them for other reasons, you can specify them explicitly in pixels. Specify any one or all of the top, bottom, left, and right margins. See the margin adjustment example in the Line Plots topic.
    • X-Axis/Y-Axis:
      • Label: Change the display labels for the axis (notice this does not change which column provides the data). Click the icon to restore the original label based on the column name.
      • For the Y-axis, the Range shown can also be specified - the default is Automatic across charts. Select Automatic Within Chart to have the range based only on this chart. You can also select Manual and specify the min and max values directly.
  • Click Apply to update the chart with the selected changes.

Save and Export Charts

  • When your chart is ready, click Save.
  • Name the chart, enter a description (optional), and choose whether to make it viewable by others. You will also see the default thumbnail which has been auto-generated, and can choose whether to use it. As with other charts, you can later attach a custom thumbnail if desired.

Once you have created a bar chart, it will appear in the Data Browser and on the (charts) menu for the source dataset. You can manage metadata about it as described in Manage Data Views.

Export Chart

Hover over the chart to reveal export option buttons in the upper right corner:

Export your completed chart by clicking an option:

  • PDF: generate a PDF file.
  • PNG: create a PNG image.
  • Script: pop up a window displaying the JavaScript for the chart which you can then copy and paste into a wiki. See Tutorial: Visualizations in JavaScript for a tutorial on this feature.

Videos

Related Topics




Box Plots


A box plot, or box-and-whisker plot, is a graphical representation of the range of variability for a measurement. The central quartiles (the 25th to 75th percentiles of the values) are shown as a box, with line extensions ('whiskers') representing the values beyond the central box. Outlying values are typically shown as individual points.

Create a Box Plot

  • Navigate to the data grid you want to visualize.
  • Select (Charts) > Create Chart to open the editor. Click Box.
  • The columns eligible for charting from your current grid view are listed.
  • Select the column to use on the Y axis and drag it to the Y Axis box.

Only the Y Axis field is required to create a basic single-box plot, but there are additional options.

  • Select another column (here "Study: Cohort") and choose how to use this column:
    • X Axis Categories: Create a plot with multiple boxes along the x-axis, one per value in the selected column.
    • Color: Display values in the plot with a different color for each column value. Useful when displaying all points or displaying outliers as points.
    • Shape: Change the shape of points based on the value in the selected column.
  • Here we make it the X-Axis Category and click Apply to see a box plot for each cohort.
  • Click View Data to see, filter, or export the underlying data.
  • Click View Chart to return. If you applied any filters, you would see them immediately reflected in the plot.

Box Plot Customizations

  • Customize your visualization using the Chart Type and Chart Layout links in the upper right.
  • Chart Type reopens the creation dialog allowing you to:
    • Change any column selection (hover and click the X to delete the current selection). You can also drag and drop columns between selection boxes to change positions.
    • Add new columns, such as to group points by color or shape. For example, choose: "Country" for Color and "Gender" for Shape.
    • Click Apply to see your changes and switch dialogs.
  • Chart Layout offers options to change the look of your chart, including these changes to make our color and shape distinctions clearer:
    • Set Show Points to All, and:
    • Check Jitter Points to spread the points out horizontally.
    • Click Apply to update the chart with the selected changes.
Here we see a plot with all data shown as points, jittered to spread them out; colors vary by country and points are shaped based on gender. Notice the legend in the upper right. You may also notice that the outline of the overall box plot has not changed from the basic fill version shown above. This enhanced chart gives additional information without losing the big picture of the basic plot.

Change Layout

  • Chart Layout offers the ability to change the look and feel of your chart.

There are 4 tabs:

  • General:
    • Provide a Title to show above your plot. By default, the dataset name is used, and you can return to this default at any time by clicking the refresh icon.
    • Provide a Subtitle to show below the title.
    • Specify the width and height.
    • Elect whether to display single points for all data, only for outliers, or not at all.
    • Check the box to jitter points.
    • You can also customize the colors, opacity, width and fill for points or lines.
    • Margins (px): If the default chart margins cause axis labels to overlap, or you want to adjust them for other reasons, you can specify them explicitly in pixels. Specify any one or all of the top, bottom, left, and right margins. See the margin adjustment example in the Line Plots topic.
  • X-Axis:
    • Label: Change the display label for the X axis (notice this does not change which column provides the data). Click the icon to restore the original label based on the column name.
  • Y-Axis:
    • Label: Change the display label for the Y axis as for the X axis.
    • Scale Type: Choose log or linear scale for the Y axis.
    • Range: Let the range be determined automatically or specify a manual range (min/max values) for the Y axis.
  • Developer: Only available to users that have the "Platform Developers" site role.
    • A developer can provide a JavaScript function that will be called when a data point in the chart is clicked.
    • Provide Source and click Enable to enable it.
    • Click the Help tab for more information on the parameters available to such a function.
    • Learn more below.
Click Apply to update the chart with the selected changes.

Save and Export Plots

  • When your chart is ready, click Save.
  • Name the plot, enter a description (optional), and choose whether to make it viewable by others. You will also see the default thumbnail which has been auto-generated; you can elect None to skip using it. As with other charts, you can later attach a custom thumbnail if desired.

Once you have created a box plot, it will appear in the Data Browser and on the (charts) menu for the source dataset. You can manage metadata about it as described in Manage Data Views.

Export Chart

Hover over the chart to reveal export option buttons in the upper right corner:

Export your completed chart by clicking an option:

  • PDF: generate a PDF file.
  • PNG: create a PNG image.
  • Script: pop up a window displaying the JavaScript for the chart which you can then copy and paste into a wiki. See Tutorial: Visualizations in JavaScript for a tutorial on this feature.

Rules Used to Render the Box Plot

The following rules are used to render the box plot. Hover over a box to see a pop-up.

  • Min/Max are the highest and lowest data points still within 1.5 times the interquartile range (IQR).
  • Q1 marks the lower quartile boundary.
  • Q2 marks the median.
  • Q3 marks the upper quartile boundary.
  • Values outside of the range are considered outliers and are rendered as dots by default. The options and grouping menus offer you control of whether and how single dots are shown.

Developer Extensions

Developers (users with the "Platform Developers" role) can extend plots that display points to run a JavaScript function when a point is clicked. For example, it might show a widget of additional information about the specific data that point represents. Supported for box plots, scatter plots, line plots, and time charts.

To use this function, open the Chart Layout editor and click the Developer tab. Provide Source in the window provided, click Enable, then click Save to close the panel.

Click the Help tab to see the following information on the parameters available to such a function.

Your code should define a single function to be called when a data point in the chart is clicked. The function will be called with the following parameters:

  • data: the set of data values for the selected data point. Example:
    {
    YAxisMeasure: {displayValue: "250", value: 250},
    XAxisMeasure: {displayValue: "0.45", value: 0.45000},
    ColorMeasure: {value: "Color Value 1"},
    PointMeasure: {value: "Point Value 1"}
    }
  • measureInfo: the schema name, query name, and measure names selected for the plot. Example:
    {
    schemaName: "study",
    queryName: "Dataset1",
    yAxis: "YAxisMeasure",
    xAxis: "XAxisMeasure",
    colorName: "ColorMeasure",
    pointName: "PointMeasure"
    }
  • clickEvent: information from the browser about the click event (i.e. target, position, etc.)
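
For example, a minimal handler using these parameters might look like the following; the function name is illustrative, and what you do with the values (the alert here is just a placeholder) is up to you:

    // Minimal sketch of a point-click handler using the documented parameters.
    function onPointClick(data, measureInfo, clickEvent) {
        // Look up the clicked point's y value via the measure name in measureInfo.
        var yValue = data[measureInfo.yAxis].value;
        alert('Clicked y = ' + yValue + ' from ' +
              measureInfo.schemaName + '.' + measureInfo.queryName);
    }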

Video

Related Topics




Line Plots


A line plot tracks the value of a measurement across a horizontal scale, typically time. It can show trends for one series alone, or relative to other individuals or groups, in a single plot.

Create a Line Plot

  • Navigate to the data grid you want to visualize. We will use the Lab Results dataset from the example study for this walkthrough.
  • Select (Charts) > Create Chart. Click Line.
  • The columns eligible for charting from your current grid view are listed.
  • Select the X Axis column by drag and drop, here "Date".
  • Select the Y Axis column by drag and drop, here "Lymphs".
  • Only the X and Y Axes are required to create a basic line plot. Other options will be explored below.
  • Click Apply to see the basic plot.

This basic line chart plots a point for every Lymph value measured on each date, as in a scatter plot, then draws a line between them. With values for all participants mixed together, this plot isn't particularly useful. Next, we might want to separate by participant to see if any patterns emerge for individuals.

You may also notice the labels for tickmarks along the X axis overlap the "Date" label. We will fix that below after making other plot changes.

Line Plot Customizations

Customize your visualization using the Chart Type and Chart Layout links in the upper right.

  • Chart Type reopens the creation dialog allowing you to:
    • Change the X or Y Axis column (hover and click the X to delete the current selection).
    • Select a Series column (optional). The series measure is used to split the data into one line per distinct value in the column.
    • Note that you can also click another chart type on the left to switch how you visualize the data with the same axes when practical.
  • For this walkthrough, drag "Participant ID" to the Series box.
  • Click Apply.

Now the plot draws series lines between values for the same subject, but is unusably dense. Let's filter to a subset of interest.

  • Click View Data to see and filter the underlying data.
  • Click the ParticipantID column header and select Filter.
    • Click the "All" checkbox in the popup to unselect all values. Then, check the boxes for the first 6 participants, 101-606.
    • Click OK.
  • Click View Chart to return. Now there are 6 lines showing values for the 6 participants, clearly divided into upward and downward trending values.

This is a fairly small set of example data, but we can use the line plot tool to check a hypothesis about cohort correlation.

  • Click Chart Type to reopen the editor.
  • Drag "Study: Cohort" to the Series box. Notice it replaces the prior selection.
  • Click Apply.

Now the line plot shows a line series of all points for the participants in each cohort. Remember that this is a filtered subset of (fictional) participants, but in this case it appears there is a correlation. You could check against the broader dataset by returning to view the data and removing the filter.

Add a Second Y Axis

To plot more data, you can add a second Y axis and display it on the right.

  • Click Chart Type to reopen the editor.
  • Drag the "CD4+" column to the Y Axis box. Notice it becomes a second panel and does not replace the prior selection (Lymphs).
  • Click the (circle arrow) to set the Y Axis Side for this measure to be on the right.
  • Click Apply.
  • You can see the trend line for each measure for each cohort in a single plot.

Change Chart Layout

The Chart Layout button offers the ability to change the look and feel of your chart.

There are four tabs:

  • General:
    • Provide a title to display on the plot. The default is the name of the source data grid.
    • Provide a subtitle to display under the title.
    • Specify a width and height.
    • Control the point size and opacity, and choose the default color used when no "Series" column is set.
    • Control the line width.
    • Hide Data Points: Check this box to display a simple line instead of showing shaped points for each value.
    • Number of Charts: Select whether to show a single chart, or a chart per measure, when multiple measures are defined.
    • Margins (px): If the default chart margins cause axis labels to overlap, or you want to adjust them for other reasons, you can specify them explicitly in pixels. Specify any one or all of the top, bottom, left, and right margins here.
  • X-Axis:
    • Label: Change the display label for the X axis (notice this does not change which column provides the data). Click the icon to restore the original label based on the column name.
  • Y-Axis:
    • Label: Change the display label for the Y axis as for the X axis.
    • Scale Type: Choose log or linear scale for the Y axis.
    • Range: For the Y-axis, the default is Automatic across charts. Select Automatic Within Chart to have the range based only on this chart. You can also select Manual and specify the min and max values directly.
  • Developer: Only available to users that have the "Platform Developers" site role.
    • A developer can provide a JavaScript function that will be called when a data point in the chart is clicked.
    • Provide Source and click Enable to enable it.
    • Click the Help tab for more information on the parameters available to such a function.
    • Learn more in this topic.

Adjust Chart Margins

When there are enough values on an axis that the values overlap the label, or if you want to adjust the margins of your chart for any reason, you can use the chart layout settings to set them. In our example, the date display is long and overlaps the label, so before publishing, we can improve the look.

  • Observe the example chart where the date displays overlap the label "Date".
  • Open the chart for editing, then click Chart Layout.
  • Scroll down and set the bottom margin to 140 in this example.
    • You can also adjust the other margins as needed.
    • Note that plot defaults and the length of labels can both vary, so the specific setting your plot will need may not be 140.
  • Click Apply.
  • Click Save to save with the revised margin settings.

Save and Export Plots

  • When your plot is finished, click Save.
  • Name the chart, enter a description (optional), and choose whether to make it viewable by others. You will also see the default thumbnail which has been auto-generated, and can choose whether to use it. As with other charts, you can later attach a custom thumbnail if desired.

Once you have saved a line plot, it will appear in the Data Browser and on the (charts) menu for the source dataset. You can manage metadata about it as described in Manage Data Views.

Export Chart

Hover over the chart to reveal export option buttons in the upper right corner:

Export your completed chart by clicking an option:

  • PDF: generate a PDF file.
  • PNG: create a PNG image.
  • Script: pop up a window displaying the JavaScript for the chart which you can then copy and paste into a wiki. See Tutorial: Visualizations in JavaScript for a tutorial on this feature.

Related Topics




Pie Charts


A pie chart shows the relative size of selected categories as different sized wedges of a circle or ring.

Create a Pie Chart

  • Navigate to the data grid you want to visualize.
  • Select (Charts) > Create Chart to open the editor. Click Pie.
  • The columns eligible for charting from your current grid view are listed.
  • Select the column to visualize and drag it to the Categories box.
  • Click Apply. The size of the pie wedges will reflect the count of rows for each unique value in the column selected.
  • Click View Data to see, filter, or export the underlying data.
  • Click View Chart to return. If you applied any filters, you would see them immediately reflected in the chart.

Pie Chart Customizations

  • Customize your visualization using the Chart Type and Chart Layout links in the upper right.
  • Chart Type reopens the creation dialog allowing you to:
    • Change the Categories column selection.
    • Note that you can also click another chart type on the left to switch how you visualize the data using the same selected columns when practical.
    • Click Apply to update the chart with the selected changes.

Change Layout

  • Chart Layout offers the ability to change the look and feel of your chart.
  • Customize any or all of the following options:
    • Provide a Title to show above your chart. By default, the dataset name is used.
    • Provide a Subtitle. By default, the categories column name is used. Note that changing this label does not change which column is used for wedge categories.
    • Specify the width and height.
    • Select a color palette. Options include Light, Dark, and Alternate. Mini squares showing the selected palette are displayed.
    • Customizing the radii of the pie chart allows you to size the graph and, if desired, include a hollow center space.
    • Elect whether to show percentages within the wedges, the display color for them, and whether to hide those annotations when wedges are narrow. The default is to hide percentages when they are under 5%.
    • Use the Gradient % slider and color to create a shaded look.
  • Click Apply to update the chart with the selected changes.

Save and Export Charts

  • When your chart is ready, click Save.
  • Name the chart, enter a description (optional), and choose whether to make it viewable by others. You will also see the default thumbnail which has been auto-generated, and can choose whether to use it. As with other charts, you can later attach a custom thumbnail if desired.

Once you have created a pie chart, it will appear in the Data Browser and on the (charts) menu for the source dataset. You can manage metadata about it as described in Manage Data Views.

Export Chart

Hover over the chart to reveal export option buttons in the upper right corner:

Export your completed chart by clicking an option:

  • PDF: generate a PDF file.
  • PNG: create a PNG image.
  • Script: pop up a window displaying the JavaScript for the chart which you can then copy and paste into a wiki. See Tutorial: Visualizations in JavaScript for a tutorial on this feature.

Videos

Related Topics




Scatter Plots


Scatter plots represent the relationship between two different numeric measurements. Each point is positioned based on its values in the columns selected for the X and Y axes.

Create a Scatter Plot

  • Navigate to the data grid you want to visualize.
  • Select (Charts) > Create Chart. Click Scatter.
  • The columns eligible for charting from your current grid view are listed.
  • Select the X Axis column by drag and drop.
  • Select the Y Axis column by drag and drop.
  • Only the X and Y Axes are required to create a basic scatter plot. Other options will be explored below.
  • Click Apply to see the basic plot.
  • Click View Data to see, filter, or export the underlying data.
  • Click View Chart to return. If you applied any filters, you would see them immediately reflected in the plot.
  • Customize your visualization using the Chart Type and Chart Layout links in the upper right.
  • Chart Type reopens the creation dialog allowing you to:
    • Change the X or Y Axis column (hover and click the X to delete the current selection).
    • Add a second Y Axis column (see below) to show more data.
    • Optionally select columns for grouping of points by color or shape.
    • Note that you can also click another chart type on the left to switch how you visualize the data with the same axes and color/shape groupings when practical.
    • Click Apply to update the chart with the selected changes.
  • Here we see the same scatter plot data, with colors varying by cohort and points shaped based on gender. Notice the key in the upper right.

Change Layout

The Chart Layout button offers the ability to change the look and feel of your chart.

There are four tabs:

  • General:
    • Provide a title to display on the plot. The default is the name of the source data grid.
    • Provide a subtitle to display under the title.
    • Specify a width and height.
    • Choose whether to jitter points.
    • Control the point size and opacity, as well as choose the default color palette. Options: Light (default), Dark, and Alternate. The array of colors is shown under the selection.
    • Number of Charts: Select either "One Chart" or "One Per Measure".
    • Group By Density: Select either "Always" or "When number of data points exceeds 10,000."
    • Grouped Data Shape: Choose either hexagons or squares.
    • Density Color Palette: Options are blue & white, heat (yellow/orange/red), or select a single color from the dropdown to show in graded levels. These palettes override the default color palette and other point options in the left column.
    • Margins (px): If the default chart margins cause axis labels to overlap, or you want to adjust them for other reasons, you can specify them explicitly in pixels. Specify any one or all of the top, bottom, left, and right margins. See the margin adjustment example in the Line Plots topic.
  • X-Axis/Y-Axis:
    • Label: Change the display labels for the axis (notice this does not change which column provides the data). Click the icon to restore the original label based on the column name.
    • Scale Type: Choose log or linear scale for each axis.
    • Range: Let the range be determined automatically or specify a manual range (min/max values) for each axis.
  • Developer: Only available to users that have the "Platform Developers" site role.
    • A developer can provide a JavaScript function that will be called when a data point in the chart is clicked.
    • Provide Source and click Enable to enable it.
    • Click the Help tab for more information on the parameters available to such a function.
    • Learn more in this topic.

Add Second Y Axis

You can add more data to a scatter plot by selecting a second Y axis column. Reopen a chart for editing, click Chart Type, then drag another column to the Y Axis field. The two selected fields will both have panels. On each you can select the side for the Y Axis using the arrow icons.

For this example, we've removed the color and shape columns to make it easier to see the two axes in the plot. Click Apply.

If you set Chart Layout > Number of Charts to "One Per Measure", you will see two separate charts, still respecting the Y Axis sides you set.

Example: Heat Map

Displaying a scatter plot as a heatmap is done by changing the layout of a chart. Very large datasets are easier to interpret as heatmaps, grouped by density (also known as point binning).

  • Click Chart Layout and change Group By Density to "Always".
  • Select Heat as the Density Color Palette and leave the default Hexagon shape selected.
  • Click Apply to update the chart with the selected changes. Shown here, the number of charts was reset to one, and only a single Y axis is included.
  • Notice that when binning is active, a warning message will appear reading: "The number of individual points exceeds XX. The data is now grouped by density which overrides some layout options." XX will be 10,000, or 1 if you selected "Always" as we did. Click Dismiss to remove that message from the plot display.

Save and Export Plots

  • When your plot is finished, click Save.
  • Name the plot, enter a description (optional), and choose whether to make it viewable by others. You will also see the default thumbnail which has been auto-generated, and can choose whether to use it. As with other charts, you can later attach a custom thumbnail if desired.

Once you have saved a scatter plot, it will appear in the Data Browser and on the (charts) menu for the source dataset. You can manage metadata about it as described in Manage Data Views.

Export Plot

Hover over the chart to reveal export option buttons in the upper right corner:

Export your completed chart by clicking an option:

  • PDF: generate a PDF file.
  • PNG: create a PNG image.
  • Script: pop up a window displaying the JavaScript for the chart which you can then copy and paste into a wiki. See Tutorial: Visualizations in JavaScript for a tutorial on this feature.

Video

Related Topics




Time Charts


Time charts provide rich time-based visualizations for datasets and are available in LabKey study folders. In a time chart, the X-axis shows a calculated time interval or visit series, while the Y-axis shows one or more numerical measures of your choice.

Note: Only properties defined as measures in the dataset definition can be plotted on time charts.
With a time chart you can:
  • Individually select which study participants, cohorts, or groups appear in the chart.
  • Refine your chart by defining data dimensions and groupings.
  • Export an image of your chart to a PDF or PNG file.
  • Export your chart to JavaScript (for developers only).
Note: In a visit-based study, visits are a way of measuring sequential data gathering. To create a time chart of visit based data, you must first create an explicit ordering of visits in your study. Time charts are not supported for continuous studies, because they contain no calculated visits/intervals.

Create a Time Chart

  • Navigate to the dataset, view, or query of interest in your study. In this example, we use the Lab Results dataset in the example study.
  • Select (Charts) > Create Chart. Click Time.
  • Whether the X-axis is date based or visit-based is determined by the study type. For a date-based study:
    • Choose the Time Interval to plot: Days, Weeks, Months, Years.
    • Select the desired Interval Start Date from the pulldown menu. All eligible date fields are listed.
  • At the top of the right panel is a drop down from which you select the desired dataset or query. Time charts are only supported for datasets/queries in the "study" schema which include columns designated as 'measures' for plotting. Queries must also include both the 'ParticipantId' and 'ParticipantVisit' columns to be listed here.
  • The list of columns designated as measures available in the selected dataset or query is shown in the Columns panel. Drag the desired selection to the Y-Axis box.
    • By default the axis will be shown on the left; click the right arrow to switch sides.
  • Click Apply.
  • The time chart will be displayed.
  • Use the checkboxes in the Filters panel on the left:
    • Click a label to select only that participant.
    • Click a checkbox to add or remove that participant from the chart.
  • Click View Data to see the underlying data.
  • Click View Chart(s) to return.

Time Chart Customizations

  • Customize your visualization using the Chart Type and Chart Layout links in the upper right.
  • Chart Type reopens the creation dialog allowing you to:
    • Change the X Axis options for time interval and start date.
    • Change the Y Axis to plot a different measure, or plot multiple measures at once. Time charts are unique in allowing cross-query plotting. You can select measures from different datasets or queries within the same study to show on the same time chart.
      • Remove the existing selection by hovering and clicking the X. Replace with another measure.
      • Add a second measure by dragging another column from the list into the Y-Axis box.
      • For each measure you can specify whether to show the Y-axis for it on the left or right.
      • Open and close information panels about time chart measures by clicking on them.
    • Click Apply to update the chart with the selected changes.

Change Layout

  • Chart Layout offers the ability to change the look and feel of your chart.

There are at least 4 tabs:

  • On the General tab:
    • Provide a Title to show above your chart. By default, the dataset name is used.
    • Specify the width and height of the plot.
    • Use the slider to customize the Line Width.
    • Check the boxes if you want to either Hide Trend Line or Hide Data Points to get the appearance you prefer. When you check either box, the other option becomes unavailable.
    • Number of Charts: Choose whether to show all data on one chart, or separate by group, or by measure.
    • Subject Selection: By default, you select participants from the filter panel. Select Participant Groups to enable charting of data by groups and cohorts using the same checkbox filter panel. Choose at least one charting option for groups:
      • Show Individual Lines: show plot lines for individual participant members of the selected groups.
      • Show Mean: plot the mean value for each participant group selected. Use the pull down to select whether to include range bars when showing mean. Options are: None, Std Dev, or Std Err.
  • On the X-Axis tab:
    • Label: Change the display label shown on the X-axis. Note that changing this text will not change the interval or range plotted. Use the Chart Type settings to change what is plotted.
    • Range: Let the range be determined automatically or specify a manual range (min/max values).
  • There will be one Y-Axis tab for each side of the plot if you have elected to use both the left and right Y-axes. For each side:
    • Label: Change the display label for this Y-axis. Note that changing this text will not change what is plotted. Click the icon to restore the original label based on the column name.
    • Scale Type: Choose log or linear scale for each axis.
    • Range: Let the range be determined automatically or specify a manual range (min/max values) for each axis.
    • For each Measure using that Y-axis:
      • Choose an Interval End Date. The pulldown menu includes eligible date columns from the source dataset or query.
      • Choose a column if you want to Divide Data Into Series by another measure.
      • When dividing data into series, choose how to display duplicate values (AVG, COUNT, MAX, MIN, or SUM).
  • Developer: Only available to users that have the "Platform Developers" site role.
    • A developer can provide a JavaScript function that will be called when a data point in the chart is clicked.
    • Provide Source and click Enable to enable it.
    • Click the Help tab for more information on the parameters available to such a function.
    • Learn more in this topic.
  • Click Apply to update the chart with the selected changes. In this example, we now plot data by participant group. Note that the filter panel now allows you to plot trends for cohorts and other groups. This example shows a plot combining trends for two measures, lymphs and viral load, for two study cohorts.

Save Chart

  • When your chart is ready, click Save.
  • Name the chart, enter a description (optional), and choose whether to make it viewable by others. You will also see the default thumbnail which has been auto-generated, and can choose whether to use it. As with other charts, you can later attach a custom thumbnail if desired.
  • Click Save.

Once you have created a time chart, it will appear in the Data Browser and on the charts menu for the source dataset.

Data Dimensions

By adding dimensions for a selected measure, you can further refine the time chart. You can group data for a measure on any column in your dataset that is defined as a "data dimension". To define a column as a data dimension:

  • Open a grid view of the dataset of interest.
  • Click Manage.
  • Click Edit Definition.
  • Click the Fields section to open it.
  • Click to expand the field of interest.
  • Click Advanced Settings.
  • Place a checkmark next to Make this field available as a dimension.
  • Click Apply.
  • Click Save.

To use the data dimension in a time chart:

  • Click View Data to return to your grid view.
  • Create a new time chart, or select one from the (Charts) menu and click Edit.
  • Click Chart Layout.
  • Select the Y-Axis tab for the side of the plot you are interested in (if both are present).
    • The pulldown menu for Divide Data Into Series By will include the dimensions you have defined.
  • Select how you would like duplicate values displayed. Options: Average, Count, Max, Min, Sum.
  • Click Apply.
  • A new section appears in the filters panel where you can select specific values of the new data dimension to further refine your chart.

Export Chart

Hover over the chart to reveal export option buttons in the upper right corner:

Export your completed chart by clicking an option:

  • PDF: generate a PDF file.
  • PNG: create a PNG image.
  • Script: pop up a window displaying the JavaScript for the chart which you can then copy and paste into a wiki. See Tutorial: Visualizations in JavaScript for a tutorial on this feature.

Related Topics




Column Visualizations


Click a column header to see a list of Column Visualizations, small plots and charts that apply to a single column. When selected, the visualization is added to the top of the data grid. Several can be added at a time, and they are included within a saved custom grid view. When you come back to the saved view, the Column Visualizations will appear again.
  • Bar Chart - Histogram displayed above the grid.
  • Box & Whisker - Distribution box displayed above the grid.
  • Pie Chart - Pie chart displayed above the grid.

Visualizations are always 'live', reflecting updates to the underlying data and any filters added to the data grid.

To remove a chart, hover over the chart and click the 'X' in the upper right corner.

Available visualization types are determined by the column's data type, as well as whether the column is a Measure and/or a Dimension.

  • The box plot option is shown for any column marked as a Measure.
  • The bar and pie chart options are shown for any column marked as a Dimension.
Column visualizations are simplified versions of standalone charts of the same types. Click any chart to open it within the plot editor which allows you to make many additional customizations and save it as a new standalone chart.

Bar Chart

A histogram of the Weight column.

Box and Whisker Plot

A basic box plot report. You can include several column visualizations above a grid simultaneously.

Pie Chart

A pie chart showing prevalence of ARV Regimen types.

Filters are also applied to the visualizations displayed. If you filter to exclude 'blank' ARV treatment types, the pie chart will update.

Related Topics




Quick Charts


Quick Charts provide a quick way to assess your data without deciding first what type of visualization you will use. A best guess visualization for the data in a single column is generated and can be refined from there.

Create a Quick Chart

  • Navigate to a data grid you wish to visualize.
  • Click a column header and select Quick Chart.

Depending on the content of the column, LabKey Server makes a best guess at the type and arrangement of chart to use as a starting place. A numeric column in a cohort study, for example, might be quickly charted as a box and whisker plot using cohorts as categories.

Refine the Chart

You can then alter and refine the chart in the following ways:

  • View Data: Toggle to the data grid, potentially to apply filters to the underlying data. Filters are reflected in the plot upon re-rendering.
  • Export: Export the chart as a PDF, PNG, or Script.
  • Help: Documentation links.
  • Chart Type: Click to open the plot editor. You can change the plot type to any other supported type, and the chart layout options will update accordingly. In the plot editor, you can also incorporate data from other columns.
  • Chart Layout: Click to customize the look and feel of your chart; the options available vary based on the chart type. See the individual chart type topics for descriptions of the options.
  • Save: Click to open the save dialog.

Related Topics




Integrate with Tableau


Premium Feature — Available in the Professional and Enterprise Editions of LabKey Server. Learn more or contact LabKey.
Users can integrate their LabKey Server with Tableau Desktop via an Open Database Connectivity (ODBC) integration. Data stored in LabKey can be dynamically queried directly from the Tableau application to create reports and visualizations.

Configure LabKey

LabKey must first be configured to accept external analytics integrations using an ODBC connection. At all times, LabKey security settings govern who can see which data; any user must have at least the Reader role to access data from Tableau.

Learn more about setting up the ODBC connection in LabKey in these topics:

Configure Tableau

Once configured, load LabKey data into Tableau following the instructions in this section:

Use LabKey Data in Tableau

Within Tableau, once you have loaded the data table you wish to use, you can see the available measures and dimensions listed. Incorporate them into the visualizations by dragging and dropping. Instructions for using Tableau Desktop can be found on their website.

In this example, the Physical Exam and Demographics joined view is being used to create a plot showing the CD4 levels over time for two Treatment Groups, those treated and not treated with ARV regimens.

Tableau also offers the ability to easily create trend lines. Here an otherwise cluttered scatter plot is made clearer using trend lines:

You can build a wide variety of visualizations and dashboards in Tableau.

Video Demonstration

This video covers the ways in which LabKey Server and Tableau Desktop are great partners for creating powerful visualizations.

Related Topics




Lists


A List is a user-defined table that can be used for a variety of purposes:
  • As a data analysis tool for spreadsheet data and other tabular-format files, such as TSVs and CSVs.
  • As a place to store and edit data entered by users via forms or editable grids
  • To define vocabularies, which can be used to constrain choices during completion of fields in data entry forms
  • As read-only resources that users can search, filter, sort, and export
The design, or schema, of a list is the set of fields (columns and types), including the identification of the primary key. Lists can be linked via lookups and joins to draw data from many sources. Lists can be indexed for search, including optional indexing of any attachments added to fields. Populated lists can be exported and imported as archives for easy transfer between folders or servers.
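
Because every list is exposed through the standard "lists" schema, list data is also accessible programmatically. Here is a minimal sketch using the Rlabkey package (the server URL, folder path, and list name are hypothetical):

library(Rlabkey)
# Read the contents of a list into an R data frame;
# "lists" is the schema containing all lists in the folder.
technicians <- labkey.selectRows(
    baseUrl="https://labkey.example.com",
    folderPath="/Tutorials/List Tutorial",
    schemaName="lists",
    queryName="Technicians"
)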

Topics

List Web Parts

You need to be an administrator to create and manage lists. You can directly access the list manager by selecting (Admin) > Manage Lists. To make the set of lists visible to other users, and to create a one-click shortcut for admins to manage lists, add a Lists web part to your project or folder.

Lists Web Part

  • Enter (Admin) > Page Admin Mode.
  • Choose Lists from the <Select Web Part> pulldown at the bottom of the page.
  • Click Add.
  • Click Exit Admin Mode.

List-Single Web Part

To display the contents of a single list, add a List - Single web part, name it and choose the list and view to display.




Tutorial: Lists


This tutorial introduces you to Lists, the simplest way to represent tabular data in LabKey. While straightforward, lists support many advanced tools for data analysis. Here you will learn about the power of lookups, joins, and URL properties for generating insights into your results.

This tutorial can be completed using a free 30-day trial version of LabKey Server.

Lists can be simple one-column "lists" or many-columned tables with a set of values for each "item" on the list. A list must always have a unique primary key - if your data doesn't have one naturally, you can use an auto-incrementing integer to guarantee uniqueness.

Lists offer many data connection and analysis opportunities we'll begin to explore in this tutorial:

  • Lookups can help you display names instead of code numbers, present options to users adding data, and interconnect tables.
  • Joins help you view and present data from related lists together in shared grids without duplicating information.
  • URL properties can be used to filter a grid view of a list OR help you link directly to information stored elsewhere.
Completing this tutorial as written requires administrative permissions, which you will have if you create your own server trial instance in the first step. The features covered are not limited to admin users.

Tutorial Steps

First Step




Step 1: Set Up List Tutorial


In this first step, we set up a folder to work in, learn to create lists, then import a set of lists to save time.

Set Up

  • Log in to your server and navigate to your "Tutorials" project. Create it if necessary.
    • If you don't already have a server to work on where you can create projects, start here.
    • If you don't know how to create projects and folders, review this topic.
  • Create a new subfolder named "List Tutorial". Accept all defaults.

Create a New List

When you create a new list, you define the properties and structure of your table.

  • In the Lists web part, click Manage Lists.
  • Click Create New List.
  • In the List Properties panel, name the list "Technicians".
  • Under Allow these Actions, notice that Delete, Upload, Export & Print are all allowed by default.
  • You could adjust things like how your list would be indexed on the Advanced Settings popup. Leave these settings at their defaults for now.

  • You'll see the set of fields that define the columns in your list.
  • In the blue banner, you must select a field to use as the primary key. Names could be non-unique, and even badge numbers might be reassigned, so select Auto integer key from the dropdown and notice that a "Key" field is added to your list.
  • Click Save.
  • You'll see the empty "frame" for your new list with column names but "No data to show."
  • Now the "Technicians" list will appear in the Lists web part when you return to the main folder page.

Populate a List

You can populate a list one row at a time or in bulk by importing a file (or copying and pasting the data).

  • If you returned to the List Tutorial main page, click the Technicians list name to reopen the empty frame.
  • Click (Insert data) and select Insert new row.
  • Enter your own name, make up a "Department" and use any number as your badge number.
  • Click Submit.
  • Your list now has one row.
  • Download this file: Technicians.xls
  • Click (Insert data) and select Import bulk data.
  • Click the Upload file panel to open it, then click Browse (or Choose File) and choose the "Technicians.xls" file you downloaded.
    • Alternately, you could open the file and copy/paste the contents (including the headers) into the copy/paste text panel instead.
  • Click Submit.

You will now see a list of some "Technicians" with their department names. Before we continue, let's add a few more lists to the folder using a different list creation option below.

Import a List Archive

A list archive can be exported from existing LabKey lists in a folder, or manually constructed following the format. It provides a bulk method for creating and populating sets of lists.

  • Click here to download the Tutorial.lists.zip archive. Do not unzip it.
  • Select (Admin) > Manage Lists.
  • Click Import List Archive.
  • Click Choose File and select Tutorial.lists.zip from where you downloaded it.
  • Click Import List Archive.

Now you will see several additional lists have been added to your folder. Click the names to review, and continue the tutorial using this set of lists.

Related Topics

Start Over | Next Step (2 of 3)




Step 2: Create a Joined Grid


You can interconnect lists with each other by constructing "lookups", then use those connections to present joined grids of data within the user interface.

Connect Lists with a Lookup

  • Click the Technicians list from the Lists web part.

The grid shows the list of technicians and departments, including yourself. You can see the Department column displays a text name, but is currently unlinked. We can connect it to the "Departments" list uploaded in the archive. Don't worry that the department you typed is probably not on that list.

  • Click Design to open the list design for editing.
  • Click the Fields section to open it.
  • For the Department field, click the data type selector (currently showing Text) and choose Lookup.
  • The details panel for the field will open. Under Lookup Definition Options:
    • Leave the Target Folder set to the current folder.
    • Set the Target Schema to "lists".
    • On the Target Table dropdown, select "Departments (String)".
  • Click Save.

Now the list shows the values in the Department column as links. If you entered something not on the list, it will not be linked, but instead be plain text surrounded by brackets.

Click one of the links to see the "looked up" value from the Departments list. Here you will see the fields from the "Departments" list: The contact name and phone number.

You may have noticed that while there are several lists in this folder now, you did not see them all on the dropdown for setting the lookup target. The field was previously a "Text" type field, containing some text values, so only lists with a primary key of that "Text" type are eligible to be the target when we change it to be a lookup.

Create a Joined Grid

What if you want to present the details without your user having to click through? You can easily "join" these two lists as follows.

  • Select (Admin) > Manage Lists.
  • Click Technicians.
  • Select (Grid Views) > Customize Grid.
  • In the Available Fields panel, notice the field Department now shows an (expansion) icon. Click it.
  • Place checkmarks next to the Contact Name and Contact Phone fields (to add them to the Selected Fields panel).
  • Click View Grid.
  • Now you see two additional columns in the grid view.
  • Above the grid, click Save.
  • In the Save Custom Grid View popup menu, select Named and name this view DepartmentJoinedView.
  • Click Save in the popup.

You can switch between the default view of the Technicians list and this new joined grid on the (Grid Views) menu.
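
To retrieve the same joined data programmatically, lookup columns can be traversed with a slash in a colSelect list, mirroring the "Language/LanguageName" syntax described in R Reports: Access LabKey Data. A minimal sketch (the server URL, folder path, and field names are assumptions based on this tutorial):

library(Rlabkey)
# Pull the Technicians list plus fields "looked up" from Departments.
joined <- labkey.selectRows(
    baseUrl="https://labkey.example.com",
    folderPath="/Tutorials/List Tutorial",
    schemaName="lists",
    queryName="Technicians",
    colSelect="Name,Department,Department/ContactName,Department/ContactPhone",
    colNameOpt="rname"
)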

Use the Lookup to Assist Input

Another reason you might use a lookup field is to help your users enter data.

  • Hover over the row you created first (with your own name).
  • Click the (pencil) icon that will appear.
  • Notice that instead of the original text box, the entry for "Department" is now done with a dropdown.
    • You may also notice that you are only editing fields for the local list - while the grid showed contact fields from the department list, you cannot edit those here.
  • Select a value and click Submit.

Now you will see the lookup also "brought along" the contact information to be shown in your row.

In the next step, we'll explore using the URL attribute of list fields to make more connections.

Previous Step | Next Step (3 of 3)




Step 3: Add a URL Property


In the previous step, we used lookups to link our lists to each other. In this step, we explore two ways to use the URL property of list fields to create other links that might be useful to researchers using the data.

Create Links to Filtered Results

It can be handy to generate an active filtering link in a list (or any grid of data). For example, here we use a URL property to turn the values in the Department column into links to a filtered subset of the data. When you click one value, you get a grid showing only rows where that column has the same value.

  • If you navigated away from the Technicians list, reopen it.
  • When you click a value in the "Department" column, notice that you currently go to the contact details. Go back to the Technicians list.
  • Click the column header Department to open the menu, then select Filter....
  • Click the label Executive to select only that single value.
  • Click OK to see the subset of rows.
  • Notice the URL in your browser, which might look something like this - the full path and your list ID number may vary, but the filter you applied (Department = Executive) is encoded at the end.
    http://localhost:8080/labkey/Tutorials/List%20Tutorial/list-grid.view?listId=277&query.Department~eq=Executive
  • Clear the filter using the 'X' in the filter bar, or by clicking the column header Department and then clicking Clear Filter.

Now we'll modify the design of the list to turn the values in the Department column into custom links that filter to just the rows for the value that we click.

  • Click Design.
  • Click the Fields section to expand it.
  • Scroll down and click the expansion icon for the Department field.
  • Copy and paste this value into the URL field:
/list-grid.view?name=Technicians&query.Department~eq=${Department}
    • This URL starts with "/" indicating it is local to this container.
    • The filter portion of this URL replaces "Executive" with the substitution string "${Department}", meaning the value of the Department column. (If we were to specify "Executive", clicking any Department link in the list would filter the list to only show the executives!)
    • The "listId" portion of the URL has been replaced with "name=Technicians." This allows the URL to work even if exported to another container where the listId might be different.
  • Scroll down and click Save.

Now notice that when you click a value in the "Department" column, you get the filtering behavior we just defined.

  • Click Documentation in any row and you will see the list filtered to display only rows for that value.

Learn more about ways to customize URLs in this topic: URL Field Property

Create Links to Outside Resources

A column value can also become a link to a resource outside the list, and even outside the server. All the values in a column could link to a fixed destination (such as to a protocol document or company web page) or you can make row-specific links to files where a portion of the link URL matches a value in the row such as the Badge Number in this example.

For this example, we've stored some images on our support site, so that you can try out syntax for using both a full URL to reference a non-local destination AND the use of a field value in the URL. In this case, images are stored as <badge number>.png; in actual use you might have locally stored slide images or other files of interest named by subjectId or another column in your data.

Open this link in a new browser window:

You can directly edit the URL, substituting the other badge IDs used in the Technicians list you loaded (701, 1802, etc) where you see 104 in the above.

Here is a generalized version, using substitution syntax for the URL property, for use in the list design.

http://www.labkey.org/files/home/Demos/ListDemo/sendFile.view?fileName=%40files%2F${Badge}.png&renderAs=IMAGE

  • Click the List Tutorial link near the top of the page, then click the Technicians list in the Lists web part.
  • Click Design.
  • Click the Fields section to expand it.
  • Click the expansion icon to expand the Badge field.
  • Into the URL property for this field, paste the generalized version of the link shown above.
  • Click Save.
  • Observe that clicking one of the Badge Number values will open the image with the same name.
    • If you edit your own row to set your badge number to "1234" you will have an image as well. Otherwise clicking a value for which there is no pre-loaded image will raise an error.

Congratulations

You've now completed the list tutorial. Learn more about lists and customizing the URL property in the related topics.

Related Topics

Previous Step




Create Lists


A list is a basic way of storing tabular information. LabKey lists are flexible, user-defined tables that can have as many columns as needed. Lists must have a primary key field that ensures rows are uniquely identified.

The list design is the set of columns and types, which forms the structure of the list, plus the identification of the primary key field, and properties about the list itself.

Create New List and Set Basic Properties

  • In the project or folder where you want the list, select (Admin) > Manage Lists.
  • Click Create New List.
  • Name the list, e.g. "Technicians" in this example.
  • Adding a Description is optional, but can give other users more information about the purpose of this list.
  • Use the checkboxes to decide whether to Allow these Actions for the list:
    • Delete
    • Upload
    • Export & Print
  • Continue to define fields and other settings before clicking Save.

Set Advanced Properties

  • Click Advanced Settings to set more properties (listed below the image).
  • Default Display Field: Once some fields have been defined, use this dropdown to select the field that should be displayed by default when this list is used as a lookup target.
  • Discussion Threads let people create discussions associated with this list. Options:
    • Disable discussions (Default)
    • Allow one discussion per item
    • Allow multiple discussions per item
  • Search Indexing Options (by default, no options are selected):
    • Index entire list as a single document
    • Index each item as a separate document
    • Index file attachments
    • Learn more about indexing options here: Edit a List Design
  • Click Apply when finished.

Continue to define the list fields before clicking Save.

Define List Fields and Set Primary Key

The fields in the list define which columns will be included. There must be a primary key column to uniquely identify the rows. The key can either be an integer or text field included with your data, OR you can have the system generate an auto-incrementing integer key that will always be unique. Note that if you select the auto-incrementing integer key, you will not have the ability to merge list data.

You have three choices for defining fields, described in the sections below:

Manually Define Fields

  • Click the Fields section to open the panel.
  • Click Manually Define Fields (under the drag and drop region).
  • Key Field Name:
    • If you want to use an automatically incrementing integer key, select Auto integer key. You can rename the default Key field that will be added.
    • If you want to use a different field (of Integer or Text type), first define the fields, then select from this dropdown.
  • Use Add Field to add the fields for your list.
    • Specify the Name and Data Type for each column.
    • Check the Required box to make providing a value for that field mandatory.
    • Open a field to define additional properties using the expansion icon.
    • Remove a field if necessary by clicking the remove icon.
  • Details about adding fields and editing their properties can be found in this topic: Field Editor.
  • Scroll down and click Save when you are finished.
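
Lists can also be created programmatically. Here is a rough sketch using the Rlabkey domain API (this assumes a recent Rlabkey version; the server URL, folder path, and field names are hypothetical, not part of the steps above):

library(Rlabkey)
# Define the fields as a data frame of name/type pairs.
fields <- data.frame(
    name = c("Name", "Department", "Badge"),
    rangeURI = c("string", "string", "int")
)
design <- labkey.domain.createDesign(name = "Technicians", fields = fields)
# "IntList" creates a list with an integer key; here the Badge field is the key.
labkey.domain.create(
    baseUrl = "https://labkey.example.com",
    folderPath = "/Tutorials/List Tutorial",
    domainKind = "IntList",
    domainDesign = design,
    options = list(keyName = "Badge")
)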

Infer Fields from a File

Instead of creating the list fields one-by-one you can infer the list design from the column headers of a sample spreadsheet. When you first click the Fields section, the default option is to import or infer fields. Note that inferring fields is only offered during initial list creation and cannot be done when editing a list design later. If you start manually defining fields and decide to infer instead, delete the manually defined fields and you will see the inferral option return.

  • Click here to download this file: Technicians.xls
  • Select (Admin) > Manage Lists and click Create New List.
  • Name the list, e.g. "Technicians2" so you can compare it to the list created above.
  • Click the Fields section to open it.
  • Drag and drop the downloaded "Technicians.xls" file into the target area.
  • The fields will be inferred and added automatically.
    • Note that if your file includes columns for reserved fields, they will not be shown as inferred. Reserved fields will always be created for you.
  • Select the Key Field Name - in this case, select Auto integer key to add a new field to provide our unique key.
  • If you need to make changes or edit properties of these fields, follow the instructions above or in the topic: Field Editor.
  • Below the fields section, you will see Import data from this file upon list creation?
  • By default the contents of the spreadsheet you used for inferral will be imported to this list when you click Save.
  • If you do not want to do that, click the Import Data slider to disable the import portion.
  • Scroll down and click Save.
  • Click "Technicians2" to see that the field names and types are inferred forming the header row, but no data was imported from the spreadsheet.

Export/Import Field Definitions

In the top bar of the list of fields, you see an Export button. You can click to export field definitions in a JSON format file. This file can be used to create the same field definitions in another list, either as is or with changes made offline.

To import a JSON file of field definitions, use the infer from file method, selecting the .fields.json file instead of a data-bearing file. Note that importing or inferring fields will overwrite any existing fields; it is intended only for new list creation.

You'll find an example of using this option in the first step of the List Tutorial

Learn more about exporting and importing sets of fields in this topic: Field Editor

Shortcut: Infer Fields and Populate a List from a Spreadsheet

If you want to both infer the fields to design the list and populate the new list with the data from the spreadsheet, follow the inferral of fields process above, but leave the Import Data section enabled as shown below. The first three rows are shown in the preview.

  • Click Save and the entire spreadsheet of data will be imported as the list is created.
Note that data previews do not apply field formatting defined in the list itself. For example, Date and DateTime fields are always shown in ISO format (yyyy-MM-dd hh:mm) regardless of source data or destination list formatting that will be applied after import. Learn more in this topic.

Related Topics




Edit a List Design


Editing the list design allows you to change the structure (columns) and functions (properties) of a list, whether or not it has been populated with data. To see and edit the list design, click Design above the grid view of the list. You can also select (Admin) > Manage Lists and click Design next to the list name. If you do not see these options, you do not have permission to edit the given list.

List Properties

The list properties contain metadata about the list and enable various actions including export and search.

The properties of the Technicians list in the List Tutorial Demo look like this:

  • Name: The displayed name of the list.
  • Description: An optional description of the list.
  • Allow these Actions: These checkboxes determine what actions are allowed for the list. All are checked by default.
    • Delete
    • Upload
    • Export and Print
  • Click Advanced Settings to see additional properties in a popup:
    • Default Display Field: Identifies the field (i.e., the column of data) that is used when other lists or datasets do lookups into this list. You can think of this as the "lookup display column." Select a specific column from the dropdown or leave the default "<AUTO>" selection, which uses this process:
      • Use the first non-lookup string column (this could be the key).
      • If there are no string fields, use the primary key.
    • Discussion Threads: Optionally allow discussions to be associated with each list item. Such links will be exposed as a "discussion" link on the details view of each list item. Select one of:
      • Disable discussions (Default)
      • Allow one discussion per item
      • Allow multiple discussions per item
    • Search Indexing Options. Determines how the list data, metadata, and attachments are indexed for full-text searching. Details are below.

List Fields

Click the Fields section to add, delete or edit the fields of your list. Click the expansion icon to edit details and properties for each field. Learn more in the topic: Field Editor.

Example. The field editor for the Technicians list in the List Tutorial Demo looks like this:

Customize the Order of List Fields

By default, the order of fields in the default grid is used to order the fields in insert, edit and details for a list. All fields that are not in the default grid are appended to the end. To see the current order, click Insert New for an existing list.

To change the order of fields, drag and drop them in the list field editor using the six-block handle on the left. You can also modify the default grid for viewing columns in different orders. Learn more in the topic: Customize Grid Views.

List Metadata and Hidden Fields

In addition to the fields you define, there is list metadata associated with every list. To see and edit it, use the schema browser. Select (Admin) > Developer Links > Schema Browser. Click lists and then select the specific list. Click Edit Metadata.

List metadata includes the following fields in addition to any user defined fields.

Name         Datatype     Description
Created      date/time    When the list was created
CreatedBy    int (user)   The user who created the list
Modified     date/time    When the list was last modified
ModifiedBy   int (user)   The user who modified the list
container    folder       The folder or project where the list is defined
lastIndexed  date/time    When the list was last indexed
EntityId     text         A unique identifier for this list

There are several built in hidden fields in every list. To see them, open (Grid Views) > Customize Grid and check the box for Show Hidden Fields.

Name          Datatype     Description
Last Indexed  date/time    When this list was last indexed.
Key           int          The key field (if not already shown).
Entity Id     text         The unique identifier for the list itself.
Folder        object       The container where this list resides.

Full-Text Search Indexing

You can control how your list is indexed for search depending on your needs. Choose one or more of the options on the Advanced List Settings popup under the Search Indexing Options section. Clicking the expansion icon for the first two options reveals additional settings in the popup:

When indexing either the entire list or each item separately, you also specify how to display the title in search results and which fields to index. Note that you may choose to index both the entire list and each item, potentially specifying different values for each of these options. When you specify which fields in the list should be indexed, do not include fields that contain PHI or PII; full text search results could expose this information.

Index entire list as a single document

  • Document Title: Any text you want displayed and indexed as the search result title (ex. ListName - ${Key} ${value}). Leave the field blank to use the default title.
  • Select one option for the metadata/data:
    • Include both metadata and data: Not recommended for large lists with frequent updates, since updating any item will cause re-indexing of the entire list.
    • Include data only: Not recommended for large lists with frequent updates, since updating any item will cause re-indexing of the entire list.
    • Include metadata only (name and description of list and fields). (Default)
  • Select one option for indexing of PHI (protected health information):
    • Index all non-PHI text fields
    • Index all non-PHI fields (text, number, date, and boolean)
    • Index using custom template: Choose the exact set of fields to index and enter them as a template in the box when you select this option. Use substitution syntax like, for example: ${Department} ${Badge} ${Name}.

Index each item as a separate document

  • Document Title: Any text you want displayed and indexed as the search result title (ex. ListName - ${Key} ${value}). Leave the field blank to use the default title.
  • Select one option for indexing of PHI (protected health information):
    • Index all non-PHI text fields
    • Index all non-PHI fields (text, number, date, and boolean)
    • Index using custom template: Choose the exact set of fields to index and enter them as a template in the box when you select this option. Use substitution syntax like, for example: ${Department} ${Badge} ${Name}.

Related Topics




Populate a List


Once you have created a list, there are a variety of options for populating it, designed to suit different kinds of list and varying complexity of data entry. Note that you can also simultaneously create a new list and populate it from a spreadsheet. This topic covers populating an existing list. In each case, you can open the list by selecting (Admin) > Manage Lists and clicking the list name. If your folder includes a Lists web part, you can click the list name directly there.

Insert Single Rows

One option for simple lists is to add a single row at a time:

  • Select (Insert data) > Insert new row.
  • Enter the values for each column in the list.
  • Click Submit.

Import Bulk Data

You can also import multiple rows at once by uploading a file or copy/pasting text. To ensure the format is compatible, particularly for complex lists, you can first download a template, then populate it with your data prior to upload.

Copy/Paste Text

  • Select (Insert data) > Import bulk data.
  • Click Download Template to obtain a template for your list that you can fill out.
  • Copy and paste the spreadsheet contents into the text box, including the header row. Using our "Technicians" list example, you can copy and paste this spreadsheet:
Name             Department      Badge
Abraham Lincoln  Executive       104
Homer            Documentation   701
Marie Curie      Radiology       88

  • Click Submit.
  • The pasted rows will be added to the list.

Upload File

Another way to upload data is to directly upload an .xlsx, .xls, .csv, or .txt file containing data.

  • Again select (Insert data) > Import bulk data.
  • Using Download Template to create the file you will populate can ensure the format will match.
  • Click the Upload file (.xlsx, .xls, .csv, .txt) section heading to open it.
  • Use Browse or Choose File to select the File to Import.
  • Click Submit.
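
Bulk inserts can also be performed programmatically with the Rlabkey package. A minimal sketch (the server URL and folder path are hypothetical; columns follow the "Technicians" example):

library(Rlabkey)
# Each row of the data frame becomes one new list row.
newRows <- data.frame(
    Name = c("Ada Lovelace", "Alan Turing"),
    Department = c("Documentation", "Radiology"),
    Badge = c(2001, 2002)
)
labkey.insertRows(
    baseUrl = "https://labkey.example.com",
    folderPath = "/Tutorials/List Tutorial",
    schemaName = "lists",
    queryName = "Technicians",
    toInsert = newRows
)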

Merge List Data

If your list has an integer or text key, you have the option to merge list data during bulk import. Both the copy/paste and file import options offer a checkbox to Update data for existing rows during import.

If this box is checked, you can successfully import data for rows that already exist in your list. When unchecked, attempting to import for existing rows will raise an error.

The merge option is not available for lists with an auto-incrementing integer key. These lists always create a new row (and increment the key) during import, so you cannot include matching keys in your imported data.

Import Lookups By Alternate Key

When importing data into a list, either by copy paste or from a file, you can use the checkbox to Import Lookups by Alternate Key. This allows lookup target rows to be resolved by values other than the target's primary key. It will only be available for lookups that are configured with unique column information. For example, tables in the "samples" schema (representing Samples) use the RowId column as their primary key, but their Name column is guaranteed to be unique as well. Imported data can use either the primary key value (RowId) or the unique column value (Name). This is only supported for single-column unique indices. See Import Samples.

View the List

Your list is now populated. You can see the contents of the list by clicking on the name of the list in the Lists web part. An example:

Hovering over a row will reveal icon buttons in the first column:

Edit List Rows

Click the (pencil) icon to edit a row in a list. You'll see the same entry fields as when you insert a row, populated with the current values. Make changes as needed and click Submit.

Changes to lists are audited under List Events in the main audit log. You can also see the history for a specific list by selecting (Admin) > Manage Lists and clicking View History for the list in question. Learn more here: Manage Lists.

Related Topics




Manage Lists


A list is a flexible, user-defined table. To manage all the lists in a given container, an administrator can select (Admin) > Manage Lists, or click Manage Lists in the Lists web part.

Manage Lists

An example list management page from an HIV study:

  • (Grid Views): Customize how this grid of lists is displayed and create custom grid views.
  • (Charts/Reports): Add a chart or report about the set of lists.
  • (Delete): Select one or more lists using the checkboxes to activate deletion. Both the data and the list design are removed permanently from your server.
  • (Export): Export to Excel, text, or script.
  • Create New List
  • Import List Archive
  • Export List Archive: Select one or more lists using the checkboxes and export as an archive.
  • (Print): Print the grid of lists.

Manage a Specific List

For each list shown in the Lists web part, you can:

  • Design: Click to view or edit the design, i.e. the set of fields and properties that define the list, including allowable actions and indexing. Learn more in this topic: Edit a List Design.
  • View History: See a record of all list events and design changes.
  • Click the Name of the list to see all contents of the list shown as a grid. Options offered for each list include:
    • (Grid Views): Create custom grid views of this list.
    • (Charts/Reports): Create charts or reports of the data in this list.
    • (Insert data): Single row or bulk insert into the list.
    • (Delete): Select one or more rows to delete.
    • (Export): Export the list to Excel, text, or script.
    • Click Design to see and edit the set of fields and properties that define the list.
    • Click Delete All Rows to empty the data from the list without actually deleting the list structure itself.
    • (Print): Print the list data.

View History

From the (Admin) > Manage Lists page, click View History for any list to see a summary of audit events for that particular list. You'll see both:

  • List Events: Changes to the content of the list.
  • List Design Changes: Changes to the structure of the list.
For changes to data, you will see a Comment "An existing list record was modified". If you hover, then click the (details) link for that row, you will see the details of what was modified.

Related Topics


Premium Resource Available

Subscribers to premium editions of LabKey Server can learn how to add an additional index to a list in this topic:


Learn more about premium editions




Export/Import a List Archive


You can copy some or all of the lists in a folder to another folder or another server using export and import. Exporting a list archive packages up selected lists into a list archive: a .lists.zip file that conforms to the LabKey list export format. The process is similar to study export/import/reload. Information on the list serialization format is covered as part of Study Object Files and Formats.

Export

To export lists in a folder to a list archive:

  • In the folder that contains lists of interest, select (Admin) > Manage Lists.
  • Use the checkboxes to select the lists of interest. Check the box at the top of the column to select all.
  • Click Export List Archive.
  • All selected lists are exported into a zip archive.

Import

To import a list archive:

  • In the folder where you would like to import the list archive, select (Admin) > Manage Lists.
  • Select Import List Archive.
  • Click Choose File or Browse and select the .zip file that contains your list archive.
  • Click Import List Archive.
  • You will see the imported lists included on the Available Lists page.

Note: Existing lists will be replaced by lists in the archive with the same name; this could result in data loss and cannot be undone.

Auto-Incrementing Key Considerations

Exporting a list with an auto-incrementing key may result in different key values on import. If you have lookup lists, make sure they use an integer or string key instead of an auto-incrementing key.

Related Topics




R Reports


You can leverage the full power of the R statistical programming environment to analyze and visualize datasets on LabKey Server. The results of R scripts can be displayed in LabKey reports that reflect live data updated every time the script is run. Reports may contain text, tables, or charts created using common image formats such as jpeg, png and gif. In addition, the Rlabkey package can be used to insert, update and/or delete data stored on a LabKey Server using R, provided you have sufficient permissions to do so.
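
As a minimal sketch of such a data modification from R (the server URL, folder path, and list are hypothetical; the data frame must include the primary key of the row to change):

library(Rlabkey)
# Update the Department value for the list row whose Key is 1.
labkey.updateRows(
    baseUrl = "https://labkey.example.com",
    folderPath = "/Tutorials/List Tutorial",
    schemaName = "lists",
    queryName = "Technicians",
    toUpdate = data.frame(Key = 1, Department = "Executive")
)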

An administrator must install and configure R on LabKey Server and grant access to users to create and run R scripts on live datasets. Loading of additional packages may also be necessary, as described in the installation topic. Configuration of multiple R engines on a server is possible, but within any folder only a single R engine configuration can be used.

Topics

Related Topics




R Report Builder


This topic describes how to build reports in the R statistical programming environment to analyze and visualize datasets on LabKey Server. The results of R scripts can be displayed in LabKey reports that reflect live data updated every time the script is run.
Permissions: Creating R Reports requires that the user have both the "Editor" role (or higher) and developer access (one of the roles "Platform Developer" or "Trusted Analyst") in the container. Learn more here: Developer Roles.

Create an R Report from a Data Grid

R reports are ordinarily associated with individual data grids. Choose the dataset of interest and further filter the grid as needed. Only the portion of the dataset visible within this data grid becomes part of the analyzed dataset.

To use the sample dataset we describe in this tutorial, please complete Tutorial: Set Up a New Study if you have not already done so. Alternately, you may simply add the PhysicalExam.xls demo dataset to an existing study. You may also work with your own dataset, in which case steps and screencaps will differ.

  • View the "Physical Exam" dataset in a LabKey study.
  • If you want to filter the dataset and thus select a subset or rearrangement of fields, select or create a custom grid view.
  • Select (Charts/Reports) > Create R Report.

If you do not see the "Create R Report" menu, check to see that R is installed and configured on your LabKey Server. You also need to have the correct permissions to create R Reports. See Configure Scripting Engines for more information.

Create an R Report Independent of any Data Grid

R reports do not necessarily need to be associated with individual data grids. You can also create an R report that is independent of any grid:

  • Select (Admin) > Manage Views.
  • Select Add Report > R Report.

R reports associated with a grid automatically load the grid data into the object "labkey.data". R reports created independently of grids do not have access to labkey.data objects. R reports that pull data from additional tables (other than the associated grid) must use the Rlabkey API to access the other table(s). For details on using Rlabkey, see Rlabkey Package. By default, R reports not associated with a grid are listed under the Uncategorized heading in the list on the Manage Views page.

Review the R Report Builder

The R report builder opens on the Source tab, which looks like this. Enter the R script for execution or editing into the Script Source box. Notice the options available below the source entry panel, described below.

Report Tab

When you select the Report tab, you'll see the resulting graphics and console output for your R report. If the pipeline option is not selected, the script will be run in batch mode on the server.

Data Tab

Select the Data tab to see the data on which your R report is based. This can be a helpful resource as you write or refine your script.

Source Tab

When your script is complete and report is satisfactory, return to the Source tab, scroll down, and click Save to save both the script and the report you generated.

A saved report will look similar to the results in the design view tab, minus the help text. Reports are saved on the LabKey Server, not on your local file system. They can be accessed through the Reports drop-down menu on the grid view of your dataset, or directly from the Data Views web part.

The script used to create a saved report becomes available to source() in future scripts. Saved scripts are listed under the “Shared Scripts” section of the LabKey R report builder.

Help Tab

The Help tab provides a Syntax Reference list: a quick summary of the substitution parameters for LabKey R. See Input/Output Substitutions Reference for further details.

Additional Options

On the Source tab you can expand additional option sections. Not all options are available to all users; availability depends on the permission roles granted.

Options

  • Make this report available to all users: Enables other users to see your R report and source() its associated script if they have sufficient permissions. Only those with read privileges to the dataset can see your new report based on it.
  • Show source tab to all users: This option is available if the report itself is shared.
  • Make this report available in child folders: Make your report available in data grids in child folders where the schema and table are the same as this data grid.
  • Run this report in the background as a pipeline job: Execute your script asynchronously using LabKey’s Pipeline module. If you have a big job, running it on a background thread will allow you to continue interacting with your server during execution.
If you choose the asynchronous option, you can see the status of your R report in the pipeline. Once you save your R report, you will be returned to the original data grid. From the Reports drop-down menu, select the report you just saved. This will bring up a page that shows the status of all pending pipeline jobs. Once your report finishes processing, you can click on “COMPLETE” next to your job. On the next page you’ll see "Job Status." Click on Data to see your report.

Note that reports are always generated from live data by re-running their associated scripts. This makes it particularly important to run computationally intensive scripts as pipeline jobs when their associated reports are regenerated often.

Knitr Options

  • Select None, HTML, or Markdown processing of HTML source
  • For Markdown, you can also opt to Use advanced rmarkdown output_options.
    • Check the box to provide customized output_options to be used.
    • If unchecked, rmarkdown will use the default output format:
      html_document(keep_md=TRUE, self_contained=FALSE, fig_caption=TRUE, theme=NULL, css=NULL, smart=TRUE, highlight='default')
  • Add a semi-colon delimited list of JavaScript, CSS, or library dependencies if needed.

Report Thumbnail

  • Choose to auto-generate a default thumbnail if desired. You can later edit the thumbnail or attach a custom image. See Manage Views.

Shared Scripts

  • Once you save an R report, its associated script becomes available to execute using source("<Script Name>.R") in future scripts.
  • Check the box next to the appropriate script to make it available for execution in this script.

Study Options

  • Participant Chart: A participant chart shows measures for only one participant at a time. Select the participant chart checkbox if you would like this chart to be available for review participant-by-participant.
  • Automatically cache this report for faster reloading: Check to enable.

Click Save to save settings, or Save As to save without disturbing the original saved report.

Example

Regardless of where you have accessed the R report builder, you can create a first R report that is data independent. This sample was adapted from the R help files.

  • Paste the following into the Source tab of the R report builder.
options(echo=TRUE);
# Execute 100 Bernoulli trials;
coin_flip_results = sample(c(0,1), 100, replace = TRUE);
coin_flip_results;
mean(coin_flip_results);
  • Click the Report tab to run the source and see your results, in this case the coin flip outcomes.

Add or Suppress Console Output

The options covered below can be included directly in your R report. There are also options related to console output in the scripting configuration for your R engine.

Echo to Console

By default, most R commands do not generate output to the console as part of your script. To enable output to console, use the following line at the start of your scripts:

options(echo=TRUE);

Note that when the results of functions are assigned, they are also not printed to the console. To see the output of a function, assign the output to a variable, then just call the variable. For further details, please see the FAQs for LabKey R Reports.
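
For example, using the pulse column from the tutorial data:

options(echo=TRUE);
# The assignment itself prints nothing, even with echo enabled:
m <- mean(labkey.data$pulse, na.rm=TRUE);
# Evaluating the variable prints its value to the console:
m;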

Suppress Console Output

To suppress output to the console, hiding it from users viewing the script, first remove the echo statement shown above. You can also include sink to redirect any outputs to 'nowhere' for all or part of your script.

To suppress output, on Linux/Mac/Unix, use:

sink("/dev/null")

On Windows use:

sink("NUL")

When you want to restart output to the console within the script, use sink again with no argument:

sink()
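
Putting these pieces together, here is a minimal sketch that hides one noisy section of a script while still reporting a final value (Linux/Mac device path shown; substitute "NUL" on Windows):

sink("/dev/null");       # begin discarding console output
summary(labkey.data);    # runs, but its output is discarded
sink();                  # restore console output
print(mean(labkey.data$pulse, na.rm=TRUE));   # explicitly printed result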

Related Topics




Saved R Reports


Saved R reports may be accessed from the source data grid or from the Data Views web part. This topic describes how to manage saved R reports and how they can be shared with other users (who already have access to the underlying data).

Performance Note

Once saved, reports are generated by re-running their associated scripts on live data. This ensures users always have the most current views, but it also requires computational resources each time the view is opened. If your script is computationally intensive, you can set it to run in the background so that it does not overwhelm your server when selected for viewing. Learn more in this topic: R Report Builder.

Edit an R Report Script

Open your saved R report by clicking the name in the data views web part or by selecting it from the (Charts and Reports) menu above the data grid on which it is based. This opens the R report builder interface on the Data tab. Select the Source tab to edit the script and manage sharing and other options. Click Save when finished.

Share an R Report

Saved R Reports can be kept private to the author, or shared with other users, either with all users of the folder, or individually with specific users. Under Options in the R report builder, use the Make this report available to all users checkbox to control how the report is shared.

  • If the box is checked, the report will be available to any users with "Read" access (or higher) in the folder. This access level is called "public" though that does not mean shared with the general public (unless they otherwise have "Read" access).
  • If the box is unchecked, the report is "private" to the creator, but can still be explicitly shared with other individual users who have access to the folder.
  • An otherwise "private" report that has been shared with individual users or groups has the access level "custom".
When sharing a report, you are indicating that you trust the recipient(s), and your recipients confirm that they trust you when they accept it. Sharing of R reports is audited and can be tracked in the "Study events" audit log.

Note that if a report is "public", i.e. was made available to all users, you can still use this mechanism to email a copy of it to a trusted individual, but that will not change the access level of the report overall.

  • Open an R report from the Data Views web part and click (Share Report).
  • Enter the Recipients email addresses, one per line.
  • The default Message Subject and Message Body are shown. Both can be customized as needed.
  • The Message Link is shown; you can click Preview Link to see what the recipient will see.
  • Click Submit to share the report. You will be taken to the permissions page.
  • On the Report and View Permissions page, you can see which groups and users already had access to the report.
    • Note that you will not see the individuals you are sharing the report with unless the access level of it was "custom" or "private" prior to sharing it now.
  • Click Save.

Recipients will receive a notification with a link to the report, so that they may view it. If the recipient has the proper permissions, they will also be able to edit and save their own copy of the report. If the author makes the source tab visible, recipients of a shared report will be able to see the source as well as the report contents. Note that if the recipient has a different set of permissions, they may see a different set of data. Modifications that the original report owner makes to the report will be reflected in the link as viewed by the recipient.

When an R report was private but has been shared, the data browser will show access as "custom". Click custom to open the Report Permissions page, where you can see the list of groups and users with whom the report was shared.

Learn more about report permissions in this topic: Configure Permissions for Reports & Views

Delete an R Report

You can delete a saved report by first clicking the pencil icon at the top of the Data Views web part, then clicking the pencil to the left of the report name. In the popup window, click Delete. You can also multi-select R reports for deletion on the Manage Views page.

Note that deleting a report eliminates its associated script from the "Shared Scripts" list in the R report interface. Make sure that you don’t delete a script that is called (sourced) by other scripts you need.

Related Topics




R Reports: Access LabKey Data


Access Your Data as "labkey.data"

LabKey Server automatically reads your chosen dataset into a data frame called labkey.data using Input Substitution.

A data frame can be visualized as a list with unique row names and columns of consistent lengths. Column names are converted to all lower case, spaces or slashes are replaced with underscores, and some special characters are replaced with words (e.g. "CD4+" becomes "cd4_plus_"). You can see the column names for the built-in labkey.data frame by calling:

options(echo=TRUE);
names(labkey.data);

Just like any other data.frame, data in a column of labkey.data can be referenced by the column's name, converted to all lowercase and preceded by a $:

labkey.data$<column name>

For example, labkey.data$pulse; provides all the data in the Pulse column. Learn more about column references below.

Note that the examples in this section frequently include column names. If you are using your own data or a different version of LabKey example data, you may need to retrieve column names and edit the code examples given.

Use Pre-existing R Scripts

To use a pre-existing R script with LabKey data, try the following procedure:

  • Open the R Report Builder:
    • Open the dataset of interest ("Physical Exam" for example).
    • Select (Charts/Reports) > Create R Report.
  • Paste the script into the Source tab.
  • Identify the LabKey data columns that you want to be represented by the script, and load those columns into vectors. The following loads the Systolic Blood Pressure and Diastolic Blood Pressure columns into the vectors x and y:
x <- labkey.data$diastolicbp;
y <- labkey.data$systolicbp;

png(filename="${imgout:myscatterplot}", width = 650, height = 480);
plot(x,
     y,
     main="Scatterplot Example",
     xlab="X Axis",
     ylab="Y Axis",
     pch=19);
abline(lm(y~x), col="red"); # regression line (y~x)
dev.off(); # close the device so the image file is written
  • Click the Report tab to see the result:

Find Simple Means

Once you have loaded your data, you can perform statistical analyses using the functions/algorithms in R and its associated packages. For example, calculate the mean Pulse for all participants.

options(echo=TRUE);
names(labkey.data);
labkey.data$pulse;
a <- mean(labkey.data$pulse, na.rm= TRUE);
a;

Find Means for Each Participant

The following simple script finds the average values of a variety of physiological measurements for each study participant.

# Get means for each participant over multiple visits;

options(echo=TRUE);
participant_means <- aggregate(labkey.data, list(ParticipantID = labkey.data$participantid), mean, na.rm = TRUE);
participant_means;

We use na.rm as an argument to aggregate in order to calculate means even when some values in a column are NA.

Create Functions in R

This script shows an example of how functions can be created and called in LabKey R scripts. Before you can run this script, the Cairo package must be installed on your server. See Install and Set Up R for instructions.

Note that the second line of this script creates a "data" copy of the input data frame, but removes all participant records that contain an NA entry. NA entries are common in study datasets and can complicate display results.

library(Cairo);
data <- na.omit(labkey.data);   # copy of the input data frame without NA rows

chart <- function(data)
{
    plot(data$pulse, data$pulse);
};

filter <- function(value)
{
    sub <- subset(labkey.data, labkey.data$participantid == value);
    #print("the number of rows for participant id: ")
    #print(value)
    #print("is : ")
    #print(sub)
    chart(sub)
}

names(labkey.data);
Cairo(file="${imgout:a}", type="png");
layout(matrix(c(1:4), 2, 2, byrow=TRUE));
strand1 <- labkey.data[,1];   # first column holds the participant IDs
for (i in strand1)
{
    #print(i)
    value <- i
    filter(value)
};
dev.off();

Access Data in Another Dataset (Select Rows)

You can use the Rlabkey library's selectRows to specify the data to load into an R data frame, including labkey.data, or a frame named something else you choose.

For example, if you use the following, you will load some example fictional data from our public demonstration site that will work with the above examples.

library(Rlabkey)
labkey.data <- labkey.selectRows(
baseUrl="https://www.labkey.org",
folderPath="/Explore/Research Study",
schemaName="study",
queryName="PhysicalExam",
viewName="",
colNameOpt="rname"
)

Convert Column Names to Valid R Names

Include colNameOpt="rname" to have the selectRows call provide "R-friendly" column names. This converts column names to lower case and replaces spaces or slashes with underscores. Note that this may differ from the column name transformations in the built-in labkey.data frame. The built-in frame also substitutes words for some special characters, e.g. "CD4+" becomes "cd4_plus_", so during report development you'll want to check using names(labkey.data); to be sure your report references the expected names.

Learn more in the Rlabkey Documentation.

Select Specific Columns

Use the colSelect option to specify the set of columns you want to add to your dataframe. Make sure there are no spaces between the commas and column names.

In this example, we load some fictional example data, selecting only a few columns of interest.

library(Rlabkey)
labkey.data <- labkey.selectRows(
baseUrl="https://www.labkey.org",
folderPath="/Explore/Research Study",
schemaName="study",
queryName="Demographics",
viewName="",
colSelect="ParticipantId,date,cohort,height,Language",
colFilter=NULL,
containerFilter=NULL,
colNameOpt="rname"
)

Display Lookup Target Columns

If you load the above example, and then execute: labkey.data$language; you will see all the data in the "Language" column.

Remember that in an R data frame, columns are referenced in all lowercase, regardless of casing in LabKey Server. For consistency in your selectRows call, you can also define the colSelect list in all lowercase, but it is not required.

In this case, it will return a series of integers, because "Language" is a lookup column that references a list in the same container with an incrementing integer primary key.

If you want to access a column that is not the primary key in the lookup target, such as human-readable display values in this example, use syntax like this in your selectRows:

library(Rlabkey)
labkey.data <- labkey.selectRows(
baseUrl="https://www.labkey.org",
folderPath="/Explore/Research Study",
schemaName="study",
queryName="Demographics",
viewName="",
colSelect="ParticipantId,date,cohort,height,Language,Language/LanguageName,Language/TranslatorName,Language/TranslatorPhone",
colFilter=NULL,
containerFilter=NULL,
colNameOpt="rname"
)

You can now retrieve human-readable values from within the "Language" list by converting everything to lowercase and substituting an underscore for the slash. Executing labkey.data$language_languagename; will return the list of language names.

Access URL Parameters and Data Filters

While you are developing your report, you can acquire any URL parameters as well as any filters applied on the Data tab by using labkey.url.params.

For example, if you filter the "systolicBP" column to values over 100, then use:

print(labkey.url.params)

...your report will include:

$`Dataset.systolicBP~gt`
[1] "100"

Write Result File to File Repository

The following report, when run, creates a result file in the server's file repository. Note that fileSystemPath is an absolute file path. To get the absolute path, see Using the Files Repository.

fileSystemPath = "/labkey/labkey/MyProject/Subfolder/@files/"
filePath = paste0(fileSystemPath, "test.tsv");
write.table(labkey.data, file = filePath, append = FALSE, sep = "\t", qmethod = "double", col.names=NA);
print(paste0("Success: ", filePath));

Related Topics




Multi-Panel R Plots


The scripts on this page take the analysis techniques introduced in R Reports: Access LabKey Data one step further, still using the Physical Exam sample dataset. This page covers a few more strategies for finding means, then shows how to graph these results and display least-squares regression lines.

Find Mean Values for Each Participant

Finding the mean value for physiological measurements for each participant across all visits can be done in various ways. Here, we cover three alternative methods.

For all methods, we use "na.rm=TRUE" as an argument to aggregate in order to ignore null values when we calculate means.
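
As a quick illustration of the effect, using a hypothetical vector containing a missing value:

mean(c(90, 110, NA));              # returns NA
mean(c(90, 110, NA), na.rm=TRUE);  # returns 100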

Method 1: Aggregate each physiological measurement for each participant across all visits; this produces an aggregated list with two columns for participantid.

data_means <- aggregate(labkey.data, list(ParticipantID = 
labkey.data$participantid), mean, na.rm = TRUE);
data_means;

Method 2: Aggregate only the pulse column and display two columns: one listing participant IDs and the other listing the mean value of the pulse column for each participant.

aggregate(list(Pulse = labkey.data$pulse), 
list(ParticipantID = labkey.data$participantid), mean, na.rm = TRUE);

Method 3: Again, aggregate only the pulse column, but here the results are displayed as rows instead of two columns.

participantid_factor <- factor(labkey.data$participantid);
pulse_means <- tapply(labkey.data$pulse, participantid_factor,
mean, na.rm = TRUE);
pulse_means;

Create Single Plots

Next we use R to create plots of some other physiological measurements included in our sample data.

All scripts in this section use the Cairo package. To convert these scripts to use the png() function instead, eliminate the call "library(Cairo)", change the function name "Cairo" to "png," change the "file" argument to "filename," and eliminate the "type="png"" argument entirely.
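
For example, applying those steps to the Cairo-based scatter plot in the next section yields the following sketch (the behavior is otherwise identical):

png(filename="${imgout:diastol_v_systol_figure.png}");
plot(labkey.data$diastolicbloodpressure, labkey.data$systolicbloodpressure,
main="R Report: Diastolic vs. Systolic Pressures: All Visits",
ylab="Systolic (mm Hg)", xlab="Diastolic (mm Hg)", ylim =c(60, 200));
abline(lsfit(labkey.data$diastolicbloodpressure, labkey.data$systolicbloodpressure));
dev.off();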

Scatter Plot of All Diastolic vs All Systolic Blood Pressures

This script plots diastolic vs. systolic blood pressures without regard for participantIDs. It specifies the "ylim" parameter for plot() to ensure that the axes used for this graph match the next graph's axes, easing interpretation.

library(Cairo);
Cairo(file="${imgout:diastol_v_systol_figure.png}", type="png");
plot(labkey.data$diastolicbloodpressure, labkey.data$systolicbloodpressure,
main="R Report: Diastolic vs. Systolic Pressures: All Visits",
ylab="Systolic (mm Hg)", xlab="Diastolic (mm Hg)", ylim =c(60, 200));
abline(lsfit(labkey.data$diastolicbloodpressure, labkey.data$systolicbloodpressure));
dev.off();

The generated plot, where the identity of participants is ignored, might look like this:

Scatter Plot of Mean Diastolic vs Mean Systolic Blood Pressure for Each Participant

This script plots the mean diastolic and systolic blood pressure readings for each participant across all visits. To do this, we use "data_means," the mean value for each physiological measurement we calculated earlier on a participant-by-participant basis.

data_means <- aggregate(labkey.data, list(ParticipantID = 
labkey.data$participantid), mean, na.rm = TRUE);
library(Cairo);
Cairo(file="${imgout:diastol_v_systol_means_figure.png}", type="png");
plot(data_means$diastolicbloodpressure, data_means$systolicbloodpressure,
main="R Report: Diastolic vs. Systolic Pressures: Means",
ylab="Systolic (mm Hg)", xlab="Diastolic (mm Hg)", ylim =c(60, 200));
abline(lsfit(data_means$diastolicbloodpressure, data_means$systolicbloodpressure));
dev.off();

This time, the plotted regression line for diastolic vs. systolic pressures shows a non-zero slope. Looking at our data on a participant-by-participant basis provides insights that might be obscured when looking at all measurements in aggregate.

Create Multiple Plots

There are two ways to get multiple images to appear in the report produced by a single script.

Single Plot Per Report Section

The first and simplest method of putting multiple plots in the same report places separate graphs in separate sections of your report. Use separate pairs of device on/off calls (e.g., png() and dev.off()) for each plot you want to create. You have to make sure that the {imgout:} parameters are unique. Here's a simple example:

png(filename="${imgout:labkeyl_png}");
plot(c(rep(25,100), 26:75), c(1:100, rep(1, 50)), ylab= "L", xlab="LabKey",
xlim= c(0, 100), ylim=c(0, 100), main="LabKey in R: Report Section 1");
dev.off();

png(filename="${imgout:labkey2_png}");
plot(c(rep(25,100), 26:75), c(1:100, rep(1, 50)), ylab= "L", xlab="LabKey",
xlim= c(0, 100), ylim=c(0, 100), main="LabKey in R: Report Section 2");
dev.off();

Multiple Plots Per Report Section

There are various ways to place multiple plots in a single section of a report. Two examples are given here, the first using par() and the second using layout().

Example: Four Plots in a Single Section: Using par()

This script demonstrates how to put multiple plots on one figure to create a regression panel layout. It uses standard R libraries for the arrangement of plots, and Cairo for creation of the plot image itself. It creates a single graphics file but partitions the ‘surface’ of the image into multiple sections using the mfrow and mfcol arguments to par().

library(Cairo);
data_means <- aggregate(labkey.data, list(ParticipantID =
labkey.data$participantid), mean, na.rm = TRUE);
Cairo(file="${imgout:multiplot.png}", type="png")
op <- par(mfcol = c(2, 2)) # 2 x 2 pictures on one plot
c11 <- plot(data_means$diastolicbloodpressure, data_means$weight,
xlab="Diastolic Blood Pressure (mm Hg)", ylab="Weight (kg)",
mfg=c(1, 1))
abline(lsfit(data_means$diastolicbloodpressure, data_means$weight))
c21 <- plot(data_means$diastolicbloodpressure, data_means$systolicbloodpressure,
xlab="Diastolic Blood Pressure (mm Hg)",
ylab="Systolic Blood Pressure (mm Hg)", mfg= c(2, 1))
abline(lsfit(data_means$diastolicbloodpressure, data_means$systolicbloodpressure))
c12 <- plot(data_means$diastolicbloodpressure, data_means$pulse,
xlab="Diastolic Blood Pressure (mm Hg)",
ylab="Pulse Rate (Beats/Minute)", mfg= c(1, 2))
abline(lsfit(data_means$diastolicbloodpressure, data_means$pulse))
c22 <- plot(data_means$diastolicbloodpressure, data_means$temp,
xlab="Diastolic Blood Pressure (mm Hg)",
ylab="Temperature (Degrees C)", mfg= c(2, 2))
abline(lsfit(data_means$diastolicbloodpressure, data_means$temp))
par(op); #Restore graphics parameters
dev.off();

Example: Three Plots in a Single Section: Using layout()

This script uses the standard R libraries to display multiple plots in the same section of a report. It uses the layout() command to arrange multiple plots on a single graphics surface that is displayed in one section of the script's report.

The first plot shows blood pressure and weight progressing over time for all participants. The lower scatter plots graph blood pressure (diastolic and systolic) against weight.

library(Cairo);
Cairo(file="${imgout:a}", width=900, type="png");
layout(matrix(c(3,1,3,2), nrow=2));
plot(weight ~ systolicbloodpressure, data=labkey.data);
plot(weight ~ diastolicbloodpressure, data=labkey.data);
plot(labkey.data$date, labkey.data$systolicbloodpressure, xaxt="n",
col="red", type="n", pch=1);
points(systolicbloodpressure ~ date, data=labkey.data, pch=1, bg="light blue");
points(weight ~ date, data=labkey.data, pch=2, bg="light blue");
abline(v=labkey.data$date[3]);
legend("topright", legend=c("bpsys", "weight"), pch=c(1,2));
dev.off();

Related Topics




Lattice Plots


The "lattice" R package provides presentation-quality, multi-plot graphics. This page supplies a simple script to demonstrate the use of Lattice graphics in the LabKey R environment.

Before you can use the Lattice package, it must be installed on your server. You will load the lattice package at the start of every script that uses it:

library("lattice");

Display a Volcano

The Lattice Documentation on CRAN provides a Volcano script to demonstrate the power of Lattice. The script below has been modified to work on LabKey R:

library("lattice");  

p1 <- wireframe(volcano, shade = TRUE, aspect = c(61/87, 0.4),
light.source = c(10,0,10), zlab=list(rot=90, label="Up"),
ylab= "North", xlab="East", main="The Lattice Volcano");
g <- expand.grid(x = 1:10, y = 5:15, gr = 1:2);
g$z <- log((g$x^g$gr + g$y^2) * g$gr);

p2 <- wireframe(z ~ x * y, data = g, groups = gr,
scales = list(arrows = FALSE),
drape = TRUE, colorkey = TRUE,
screen = list(z = 30, x = -60));

png(filename="${imgout:a}", width=500);
print(p1);
dev.off();

png(filename="${imgout:b}", width=500);
print(p2);
dev.off();

The report produced by this script will display two graphs that look like the following:

Related Topics




Participant Charts in R


You can use the Participant Chart checkbox in the R Report Builder to create charts that display your R report results on a participant-by-participant basis. If you wish to create a participant chart in a test environment, install the example study and use it as a development sandbox.

Create and View Simple Participant Charts

  • In the example study, open the PhysicalExam dataset.
  • Select (Charts/Reports) > Create R Report.
  • On the Source tab, begin with a script that shows data for all participants. Paste the following in place of the default content.
png(filename="${imgout:a}", width=900);
plot(labkey.data$systolicbp, labkey.data$date);
dev.off();
  • Click the Report tab to view the scatter plot data for all participants.
  • Return to the Source tab.
  • Scroll down and click the triangle to open the Study Options section.
  • Check Participant Chart.
  • Click Save.
  • Name your report "Participant Systolic" or another name you choose.

The participant chart option subsets the data handed to the R script by filtering on a single participant ID at a time, letting you step through per-participant charts. The labkey.data data frame may contain one or more rows of data, depending on the content of the dataset you are working with. Next, reopen the R report:

  • Return to the data grid of the "PhysicalExam" dataset.
  • Select (Charts/Reports) > Participant Systolic (or the name you gave your report).
  • Click Previous Participant.
  • You will see Next Participant and Previous Participant links that let you step through charts for each participant:

Advanced Example: Create Participant Charts Using Lattice

You can create a panel of charts for participants using the lattice package. If you select the participant chart option on the source tab, you will be able to see each participant's panel individually when you select the report from your data grid.

The following script produces lattice graphs for each participant showing systolic blood pressure over time:

library(lattice);
png(filename="${imgout:a}", width=900);
plot.new();
xyplot(systolicbp ~ date| participantid, data=labkey.data,
type="a", scales=list(draw=FALSE));
update(trellis.last.object(),
strip = strip.custom(strip.names = FALSE, strip.levels = TRUE),
main = "Systolic over time grouped by participant",
ylab="Systolic BP", xlab="");
dev.off();

The following script produces lattice graphics for each participant showing systolic and diastolic blood pressure over time (points instead of lines):

library(lattice);
png(filename="${imgout:b}", width=900);
plot.new();

xyplot(systolicbp + diastolicbp ~ date | participantid,
data=labkey.data, type="p", scales=list(draw=FALSE));
update(trellis.last.object(),
strip = strip.custom(strip.names = FALSE, strip.levels = TRUE),
main = "Systolic & Diastolic over time grouped by participant",
ylab="Systolic/Diastolic BP", xlab="");
dev.off();

After you save these two R reports with descriptive names, you can go back and review individual graphs participant-by-participant. Use the (Reports) menu available on your data grid.

Related Topics




R Reports with knitr


The knitr visualization package can be used with R in either HTML or Markdown pages to create dynamic reports. This topic will help you get started with some examples of how to interweave R and knitr.

Topics

Install R and knitr

  • If you haven't already installed R, follow these instructions: Install R.
  • Open the R graphical user interface. On Windows, a typical location would be: C:\Program Files\R\R-3.0.2\bin\i386\Rgui.exe
  • Select Packages > Install package(s).... Select a mirror site, and select the knitr package.
  • OR enter the following: install.packages('knitr', dependencies=TRUE)
    • Select a mirror site and wait for the knitr installation to complete.

Develop knitr Reports

  • Go to the dataset you wish to visualize.
  • Select (Charts/Reports) > Create R Report.
  • On the Source tab, enter your HTML or Markdown page with knitr code. (Scroll down for example pages.)
  • Specify which source to process with knitr. Under knitr Options, select HTML or Markdown.
  • Select the Report tab to see the results.

Advanced Markdown Options

If you are using rmarkdown v2 and check the box to "Use advanced rmarkdown output_options (pandoc only)", you can enter a list of param=value pairs in the box provided. Enter only the param=value pairs (the inner lines of the following example); they will be enclosed in an "output_options=list()" call for you.

output_options=list(
  param1=value1,
  param2=value2
)

Supported options include those in the "html_document" output format. Learn more in the rmarkdown documentation.

Sample param=value options you can include:

  • css: Specify a custom stylesheet to use.
  • fig_width and fig_height: Control the size of figures included.
  • fig_caption: Control whether figures are captioned.
  • highlight: Specify a syntax highlighting style, such as pygments, monochrome, haddock, or default. NULL will prevent syntax highlighting.
  • theme: Specify the Bootstrap theme to apply to the page.
  • toc: Set to TRUE to include a table of contents.
  • toc_float: Set to TRUE to float the table of contents to the left.
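For example, a hypothetical set of pairs you might enter in the box (only these inner lines; the output_options=list() wrapper is added for you):

toc=TRUE,
toc_float=TRUE,
fig_width=8,
fig_height=5
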
If the box is unchecked, or if you check the box but provide no param=value pairs, rmarkdown will use the default output format:

html_document(
keep_md=TRUE,
self_contained=FALSE,
fig_caption=TRUE,
theme=NULL,
css=NULL,
smart=TRUE,
highlight="default")

Note that pandoc is only supported for rmarkdown v2, and some formats supported by pandoc are not supported here.

If you notice report issues, such as graphs showing as small thumbnails, you may need to upgrade your server's version of pandoc.

R/knitr Scripts in Modules

R script knitr reports are also available as custom module reports. The script file must have either a .rhtml or .rmd extension, for HTML or markdown documents, respectively. For a file-based module, place the .rhtml/.rmd file in the same location as .r files, as shown below. For module details, see Map of Module Files.

MODULE_NAME
  reports/
    schemas/
      SCHEMA_NAME/
        QUERY_NAME/
          MyRScript.r -- R report
          MyRScript.rhtml -- R/knitr report
          MyRScript.rmd -- R/knitr report

Declaring Script Dependencies

To fully utilize the report designer (called the "R Report Builder" in the LabKey user interface), you can declare JavaScript or CSS dependencies for knitr reports. This ensures that the dependencies are downloaded before R scripts are run on the "reports" tab in the designer. If these dependencies are not specified then any JavaScript in the knitr report may not run correctly in the context of the script designer. Note that reports that are run in the context of the Reports web part will still render correctly without needing to explicitly define dependencies.

Reports can either be created via the LabKey Server UI in the report designer directly or included as files in a module. Reports created in the UI are editable via the Source tab of the designer. Open Knitr Options to see a text box where a semi-colon delimited list of dependencies can be entered. Dependencies can be external (via HTTP) or local references relative to the labkeyWebapp path on the server. In addition, the name of a client library may be used. If the reference does not have a .js or .css extension then it will be assumed to be a client library (somelibrary.lib.xml). The .lib.xml extension is not required. Like local references, the path to the client library is relative to the labkeyWebapp path.
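
For example, the semi-colon delimited equivalent of the dependencies used in the module metadata sample below would be:

http://external.com/jquery/jquery-1.9.0.min.js;knitr/local.js;knitr/local.css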

File-based reports in a module cannot be edited in the designer, although the "Source" tab will display them. However, you can still declare dependencies for these reports by including a <dependencies> section underneath the <R> element of the report's metadata file. A sample metadata file:

<?xml version="1.0" encoding="UTF-8"?>
<ReportDescriptor xmlns="http://labkey.org/query/xml">
  <label>My Knitr Report</label>
  <description>Relies on dependencies to display in the designer correctly.</description>
  <reportType>
    <R>
      <dependencies>
        <dependency path="http://external.com/jquery/jquery-1.9.0.min.js"/>
        <dependency path="knitr/local.js"/>
        <dependency path="knitr/local.css"/>
      </dependencies>
    </R>
  </reportType>
</ReportDescriptor>

The metadata file must be named <reportname>.report.xml and be placed alongside the report of the same name under (modulename/resources/reports/schemas/...).
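
For example, a hypothetical module layout pairing a knitr report with its metadata file:

mymodule/
  resources/
    reports/
      schemas/
        study/
          PhysicalExam/
            MyKnitrReport.rhtml
            MyKnitrReport.report.xml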

HTML Example

To use this example:

  • Install the R package ggplot2
  • Install the Demo Study.
  • Create an R report on the dataset "Physical Exam"
  • Copy and paste the knitr code below into the Source tab of the R Report Builder.
  • Scroll down to the Knitr Options node, open the node, and select HTML.
  • Click the Report tab to see the knitr report.
<table>
<tr>
<td align='center'>
<h2>Scatter Plot: Blood Pressure</h2>
<!--begin.rcode echo=FALSE, warning=FALSE
library(ggplot2);
opts_chunk$set(fig.width=10, fig.height=6)
end.rcode-->
<!--begin.rcode blood-pressure-scatter, warning=FALSE, message=FALSE, echo=FALSE, fig.align='center'
qplot(labkey.data$diastolicbp, labkey.data$systolicbp,
main="Diastolic vs. Systolic Pressures: All Visits",
ylab="Systolic (mm Hg)", xlab="Diastolic (mm Hg)", ylim =c(60, 200), xlim=c(60,120), color=labkey.data$temp);
end.rcode-->
</td>
<td align='center'>
<h2>Scatter Plot: Body Temp vs. Body Weight</h2>
<!--begin.rcode temp-weight-scatter, warning=FALSE, message=FALSE, echo=FALSE, fig.align='center'
qplot(labkey.data$temp, labkey.data$weight,
main="Body Temp vs. Body Weight: All Visits",
xlab="Body Temp (C)", ylab="Body Weight (kg)", xlim=c(35,40), color=labkey.data$height);
end.rcode-->
</td>
</tr>
</table>

The rendered knitr report:

Markdown v2

Administrators can enable Markdown v2 when enlisting an R engine through the Views and Scripting Configuration page. When enabled, Markdown v2 will be used when rendering knitr R reports. If not enabled, Markdown v1 is used to execute the reports.

Independent installation of the rmarkdown package and pandoc is required. This then enables using the Rmarkdown v2 syntax for R reports. The system does not currently perform any verification of the user's setup. If the configuration is enabled when enlisting the R engine, but the packages are not properly set up, the intended report rendering will fail.

Syntax differences are noted here: http://rmarkdown.rstudio.com/authoring_migrating_from_v1.html

Markdown v1 Example

# Scatter Plot: Blood Pressure
# The chart below shows data from all participants

```{r setup, echo=FALSE}
# set global chunk options: images will be 7x5 inches
opts_chunk$set(fig.width=7, fig.height=5)
```

```{r graphic1, echo=FALSE}
plot(labkey.data$diastolicbp, labkey.data$systolicbp,
main="Diastolic vs. Systolic Pressures: All Visits",
ylab="Systolic (mm Hg)", xlab="Diastolic (mm Hg)", ylim =c(60, 200));
abline(lsfit(labkey.data$diastolicbp, labkey.data$systolicbp));
```

Another example

# Scatter Plot: Body Temp vs. Body Weight
# The chart below shows data from all participants.

```{r graphic2, echo=FALSE}
plot(labkey.data$temp, labkey.data$weight,
main="Temp vs. Weight",
xlab="Body Temp (C)", ylab="Body Weight (kg)", xlim=c(35,40));
```

Related Topics




Input/Output Substitutions Reference


An R script uses input substitution parameters to generate the names of input files and to import data from a chosen data grid. It then uses output substitution parameters to either directly place image/data files in your report or to include download links to these files. Substitutions take the form of: ${param} where 'param' is the substitution. You can find the substitution syntax directly in the R Report Builder on the Help tab.

Input and Output Substitution Parameters

Valid Substitutions:

input_data: <name>
The input dataset, a tab-delimited table. LabKey Server automatically reads your input dataset into the data frame called labkey.data. If you desire tighter control over the method of data upload, you can perform the data table upload yourself. The 'input_data:' prefix indicates that the file contains the grid data; the <name> substitution can be set to any non-empty value:

# ${input_data:inputTsv}
labkey.data <- read.table("inputTsv", header=TRUE, sep="\t");
labkey.data

imgout: <name>
An image output file (such as jpg, png, etc.) that will be displayed as a section of a report on LabKey Server. The 'imgout:' prefix indicates that the output file is an image and the <name> substitution identifies the unique image produced after you call dev.off(). The following script displays a .png image in a report:

# ${imgout:labkey1.png}
png(filename="labkey1.png")
plot(c(rep(25,100), 26:75), c(1:100, rep(1, 50)), ylab= "L", xlab="LabKey",
xlim= c(0, 100), ylim=c(0, 100), main="LabKey in R")
dev.off()

tsvout: <name>
A TSV text file that is displayed on LabKey Server as a section within a report. No downloadable file is created. For example:

# ${tsvout:tsvfile}
write.table(labkey.data, file = "tsvfile", sep = "\t",
qmethod = "double", col.names=NA)

txtout: <name>
A text file that is displayed on LabKey Server as a section within a report. No downloadable file is created. For example:

# ${txtout:csvfile}
write.csv(labkey.data, file = "csvfile")

pdfout: <name>
A PDF output file that can be downloaded from LabKey Server. The 'pdfout:' prefix indicates that the expected output is a pdf file. The <name> substitution identifies the unique file produced after you call dev.off().

# ${pdfout:labkey1.pdf}
pdf(file="labkey1.pdf")
plot(c(rep(25,100), 26:75), c(1:100, rep(1, 50)), ylab= "L", xlab="LabKey",
xlim= c(0, 100), ylim=c(0, 100), main="LabKey in R")
dev.off()

psout: <name>
A postscript output file that can be downloaded from LabKey Server. The 'psout:' prefix indicates that the expected output is a postscript file. The <name> substitution identifies the unique file produced after you call dev.off().

# ${psout:labkeyl.eps}
postscript(file="labkeyl.eps", horizontal=FALSE, onefile=FALSE)
plot(c(rep(25,100), 26:75), c(1:100, rep(1, 50)), ylab= "L", xlab="LabKey",
xlim= c(0, 100), ylim=c(0, 100), main="LabKey in R")
dev.off()

fileout: <name>
A file output that can be downloaded from LabKey Server, and may be of any file type. For example, use fileout in the place of tsvout to allow users to download a TSV instead of seeing it within the page:

# ${fileout:tsvfile}
write.table(labkey.data, file = "tsvfile", sep = "\t",
qmethod = "double", col.names=NA)

htmlout: <name>
A text file that is displayed on LabKey Server as a section within a report. The output is different from the txtout: replacement in that no html escaping is done. This is useful when you have a report that produces html output. No downloadable file is created:

txt <- paste("<i>Click on the link to visit LabKey:</i>
<a target='blank' href='http://www.labkey.org'>LabKey</a>"
)
# ${htmlout:output}
write(txt, file="output")

svgout: <name>
An svg file that is displayed on LabKey Server as a section within a report. htmlout can be used to render svg outputs as well; however, using svgout will generate a more appropriate thumbnail image for the report. No downloadable file is created:

# ${svgout:output.svg}
svg("output.svg", width= 4, height=3)
plot(x=1:10,y=(1:10)^2, type='b')
dev.off()

Implicit Variables

Each R script contains implicit variables that are inserted before your source script. Implicit variables are R data types and may contain information that can be used by the source script.

Implicit variables:

labkey.data
The data frame into which the input dataset is automatically read. The code to generate the data frame is:

# ${input_data:inputFileTsv} 
labkey.data <- read.table("inputFileTsv", header=TRUE, sep="\t",
quote="", comment.char="")

Learn more in R Reports: Access LabKey Data.

labkey.url.path
The path portion of the current URL, which omits the base context path, action, and URL parameters. The path portion of the URL http://localhost:8080/labkey/home/test/study-begin.view would be: /home/test/

labkey.url.base
The base portion of the current URL. The base portion of the URL http://localhost:8080/labkey/home/test/study-begin.view would be: http://localhost:8080/labkey/

labkey.url.params
The list of parameters on the current URL and in any data filters that have been applied. The parameters are represented as a list of key / value pairs.

labkey.user.email
The email address of the current user.
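
As a minimal sketch combining these variables (output values depend on your server and user):

# Echo the reporting context using the implicit variables described above
print(paste0("Run by: ", labkey.user.email));
print(paste0("Container path: ", labkey.url.path));
print(paste0("Server base URL: ", labkey.url.base));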

Using Regular Expressions with Replacement Token Names

Sometimes it can be useful to have flexibility when binding token names to replacement parameters. This can be the case when a script generates file artifacts but does not know the file names in advance. Using the syntax regex() in place of a token name (where LabKey Server controls the token name to file mapping) will result in the following actions:

  • Any script generated files not mapped to a replacement will be evaluated against the file's name using the regex.
  • If a file matches the regex, it will be assigned to the replacement and rendered accordingly.
<replacement>:regex(<expression>)
The following example will find all files generated by the script with the extension '.gct'. If any are found, they will be assigned and rendered to the replacement parameter (in this case as a download link).

#${fileout:regex(.*?(\.gct))}

Cairo or GDD Packages

You may need to use the Cairo or GDD graphics packages in the place of jpeg() and png() if your LabKey Server runs on a "headless" Unix server. You will need to make sure that the appropriate package is installed in R and loaded by your script before calling either of these functions.

GDD() and Cairo() Examples. If you are using GDD or Cairo, you might use the following scripts instead:

library(Cairo);
Cairo(file="${imgout:labkeyl_cairo.png}", type="png");
plot(c(rep(25,100), 26:75), c(1:100, rep(1, 50)), ylab= "L", xlab="LabKey",
xlim= c(0, 100), ylim=c(0, 100), main="LabKey in R");
dev.off();

library(GDD);
GDD(file="${imgout:labkeyl_gdd.jpg}", type="jpeg");
plot(c(rep(25,100), 26:75), c(1:100, rep(1, 50)), ylab= "L", xlab="LabKey",
xlim= c(0, 100), ylim=c(0, 100), main="LabKey in R");
dev.off();

Deprecated LabKey-specific Syntax

Prior to release 18.1, file substitutions used a LabKey-specific syntax. The use of standard R syntax expected by RStudio as described above means the following syntax is deprecated and should be updated. You can also find this syntax within the R Report UI by selecting the Help tab, then Inline Syntax (Deprecated).

(Deprecated) Valid Substitutions:

input_data
LabKey Server automatically reads your input dataset (a tab-delimited table) into the data frame called labkey.data. For tighter control over the method of data upload, or to modify the parameters of the read.table function, you can perform the data table upload yourself:

labkey.data <- read.table("${input_data}", header=TRUE);
labkey.data;

imgout: <name>
An image output file (such as jpg, png, etc.) that will be displayed as a Section of a report on LabKey Server. The 'imgout:' prefix indicates that the output file is an image and the <name> substitution identifies the unique image produced after you call dev.off(). The following script displays a .png image in a report:

png(filename="${imgout:labkeyl_png}");
plot(c(rep(25,100), 26:75), c(1:100, rep(1, 50)), ylab= "L", xlab="LabKey",
xlim= c(0, 100), ylim=c(0, 100), main="LabKey in R");
dev.off();

tsvout: <name>
A TSV text file that is displayed on LabKey Server as a section within a report. No downloadable file is created. For example:

write.table(labkey.data, file = "${tsvout:tsvfile}", sep = "\t", 
qmethod = "double");

txtout: <name>
A text file that is displayed on LabKey Server as a section within a report. No downloadable file is created. A CSV example:

write.csv(labkey.data, file = "${txtout:csvfile}");

pdfout: <name>
A PDF output file that can be downloaded from LabKey Server.

pdf(file="${pdfout:labkeyl_pdf}");
plot(c(rep(25,100), 26:75), c(1:100, rep(1, 50)), ylab= "L", xlab="LabKey",
xlim= c(0, 100), ylim=c(0, 100), main="LabKey in R");
dev.off();

psout: <name>
A postscript output file that can be downloaded from LabKey Server.

postscript(file="${psout:labkeyl_eps}", horizontal=FALSE, onefile=FALSE);
plot(c(rep(25,100), 26:75), c(1:100, rep(1, 50)), ylab= "L", xlab="LabKey",
xlim= c(0, 100), ylim=c(0, 100), main="LabKey in R");
dev.off();

fileout: <name>
A file output that can be downloaded from LabKey Server, and may be of any file type. For example, use fileout in the place of tsvout to allow users to download a TSV instead of seeing it within the page:

write.table(labkey.data, file = "${fileout:tsvfile}", sep = "\t", qmethod = "double", col.names=NA);

Another example shows how to send the output of the console to a file:

options(echo=TRUE);
sink(file = "${fileout:consoleoutput.txt}");
labkey.data;

htmlout: <name>
A text file that is displayed on LabKey Server as a section within a report. The output is different from the txtout: replacement in that no html escaping is done. This is useful when you have a report that produces html output. No downloadable file is created:

txt <- paste("<i>Click on the link to visit LabKey:</i>
<a target='blank' href='http://www.labkey.org'>LabKey</a>"
)
write(txt, file="${htmlout:output}");

svgout: <name>
An svg file that is displayed on LabKey Server as a section within a report. htmlout can be used to render svg outputs as well; however, using svgout will generate a more appropriate thumbnail image for the report. No downloadable file is created:

svg("${svgout:svg}", width= 4, height=3)
plot(x=1:10,y=(1:10)^2, type='b')
dev.off()

(Deprecated) Implicit Variables:

labkey.data
The data frame into which the input dataset is automatically read. The code to generate the data frame is:

labkey.data <- read.table("${input_data}", header=TRUE, sep="\t",
quote="", comment.char="")

Additional Reference




Tutorial: Query LabKey Server from RStudio


This tutorial shows you how to pull data directly from LabKey Server into RStudio for analysis and visualization.

Tutorial Steps:

Install RStudio

  • If necessary, install R version 3.0.1 or later. If you already have R installed, you can skip this step.
  • If necessary, install RStudio Desktop on your local machine. If you already have RStudio installed, you can skip this step.

Install Rlabkey Package

  • Open RStudio.
  • On the Console enter the following:
install.packages("Rlabkey")
  • Follow any prompts to complete the installation.

Query Public Data

  • Go to the Physical Exam data grid and select Export > Script > R > Create Script. This will auto-generate the R code that queries the Physical Exam data:
library(Rlabkey)

# Select rows into a data frame called 'labkey.data'

labkey.data <- labkey.selectRows(
baseUrl="https://www.labkey.org",
folderPath="/Explore/Research Study",
schemaName="study",
queryName="PhysicalExam",
viewName="",
colSelect="ParticipantId,ParticipantVisit/Visit,date,weight_kg,temperature_C,systolicBP,diastolicBP,pulse",
colFilter=NULL,
containerFilter=NULL,
colNameOpt="rname"
)
  • Copy this code to your clipboard.
  • Go to RStudio, paste the code into the Console tab, and press the Enter key.
  • The query results will be displayed on the 'labkey.data' tab.
  • Enter the following line into the Console to visualize the data.
plot(labkey.data$systolicbp, labkey.data$diastolicbp)
  • Note that the auto-generated code will include any filters applied to the grid. For example, if you filter Physical Exam for records where temperature is greater than 37, then the auto-generated code will include a colFilter property as below:
library(Rlabkey)

# Select rows into a data frame called 'labkey.data'

labkey.data <- labkey.selectRows(
baseUrl="https://www.labkey.org",
folderPath="/Explore/Research Study",
schemaName="study",
queryName="PhysicalExam",
viewName="",
colSelect="ParticipantId,ParticipantVisit/Visit,date,weight_kg,temperature_C,systolicBP,diastolicBP,pulse",
colFilter=makeFilter(c("temperature_C", "GREATER_THAN", "37")),
containerFilter=NULL,
colNameOpt="rname"
)

Handling Login/Authentication

  • To query non-public data, that is, data that requires a login/authentication step to access, you have three options:
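
One such option, sketched here only as an illustration with a placeholder key, is to authenticate with an API key using Rlabkey's labkey.setDefaults() function:

library(Rlabkey)
# Placeholder API key (hypothetical value); generate a real one from your
# LabKey account. Subsequent Rlabkey calls will use these defaults.
labkey.setDefaults(apiKey="abcdef0123456789abcdef0123456789", baseUrl="https://www.labkey.org")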

Query Non-public Data

  • Once you have set up authentication for RStudio, the process of querying is the same as above:
    • Go to the desired data grid. Apply filters if necessary.
    • Auto-generate the R code using Export > Script > R > Create Script.
    • Use that code in RStudio.

Related Topics




FAQs for LabKey R Reports


Overview

This page aims to answer common questions about configuring and using the LabKey Server interface for creating R Reports. Remember, an administrator must install and configure R on LabKey Server before users can create and run R scripts on live datasets.

Topics:

  1. library(), help() and data() don’t work
  2. plot() doesn’t work
  3. jpeg() and png() don’t work
  4. Does my report reflect live, updated data?
  5. Output is not printed when I source() a file or use a function
  6. Scripts pasted from documentation don't work in the LabKey R Script Builder
  7. LabKey Server becomes very, very slow when scripts execute
  8. Does R create security risks?
  9. Any good sources for advice on R scripting?
  10. Graphics file formats

1. library(), help() and data() don’t work

LabKey Server runs R scripts in batch mode. Thus, on Windows machines it does not display the pop-up windows you would ordinarily see in R’s interpreted/interactive mode. Some functions that produce pop-ups (e.g., library()) have alternatives that output to the console. Some functions (e.g., help() and some forms of data()) do not.

Windows Workaround #1: Use alternatives that output to the console

library(): The library() command has a console-output alternative. To see which packages your administrator has made available, use the following:

installed.packages()[,0]

Windows Workaround #2: Call the function from a native R window

help(): It’s usually easy to keep a separate, native R session open and call help() from there. This works better for some functions than others. Note that you must install and load packages before asking for help() with them. You can also use the web-based documentation available on CRAN.

data(): You can also call data() from a separate, native R session for some purposes. Calling data() from such a session can tell you which datasets are available on any packages you’ve installed and loaded in that instance of R, but not your LabKey installation.

2. plot() doesn’t work

Did you open a graphics device before calling plot()?

LabKey Server executes R scripts in batch mode. Thus, LabKey R never automatically opens an appropriate graphics device for output, as would R when running in interpreted/interactive mode. You’ll need to open the appropriate device yourself. For onscreen output that becomes part of a report, use jpeg() or png() (or their alternatives, Cairo(), GDD() and bitmap()). In order to output a graphic as a separate file, use pdf() or postscript().

Did you call dev.off() after plotting?

You need to call dev.off() when you’re done plotting to make sure the plot object gets printed to the open device.

3. jpeg() and png() don’t work

R is likely running in a headless Unix server. On a headless Unix server, R does not have access to the appropriate X11 drivers for the jpeg() and png() functions. Your admin can install a display buffer on your server to avoid this problem. Otherwise, in each script you will need to load the appropriate package to create these file formats via other functions (e.g., GDD or Cairo). See also: Determine Available Graphing Functions for help getting unstuck.

4. Does my report reflect live, updated data?

Yes. In general, LabKey always re-runs your saved script before displaying its associated report. Your script operates on live, updated data, so its plots and tables reflect fresh data.

In study folders, you can set a flag for any script that prevents the script from being re-run unless changes have occurred. This flag can save time when scripts are time-intensive or datasets are large, making processing slow. When this flag is set, LabKey will only re-run the R script if:

  • The flag is cleared OR
  • The dataset associated with the script has changed OR
  • Any of the attributes associated with the script are changed (script source, options etc.)
To set the flag, check the "Automatically cache this report for faster reloading" checkbox under "Study Options" on the Source tab of the R report builder.

5. Output is not printed when I source() a file or use a function

The R FAQ explains:

When you use… functions interactively at the command line, the result is automatically printed...In source() or inside your own functions you will need an explicit print() statement.

When a command is executed as part of a file that is sourced, the command is evaluated but its results are not ordinarily printed. For example, if you call source("scriptname.R") and scriptname.R calls installed.packages()[,0], the installed.packages()[,0] command is evaluated, but its results are not ordinarily printed. The same thing would happen if you called installed.packages()[,0] from inside a function you define in your R script.

You can force sourced scripts to print the results of the functions they call. The R FAQ explains:

If you type `1+1' or `summary(glm(y~x+z, family=binomial))' at the command line the returned value is automatically printed (unless it is invisible()). In other circumstances, such as in a source()'ed file or inside a function, it isn't printed unless you specifically print it.
To print the value 1+1, use
print(1+1);
or, instead, use
source("1plus1.R", echo=TRUE);
where "1plus1.R" is a shared, saved script that includes the line "1+1".

6. Scripts pasted from documentation don't work in the LabKey R report builder

If you receive an error like this:

Error: syntax error, unexpected SYMBOL, expecting 'n' or ';'
in "library(Cairo) labkey.data"
Execution halted
please check your script for missing line breaks. Line breaks are known to be unpredictably eliminated during cut/paste into the script builder. You can avoid this issue by ensuring that all scripts have a ";" at the end of each line.

7. LabKey Server becomes very, very slow when scripts execute

You are probably running long, computationally intensive scripts. To avoid a slowdown, run your script in the background via the LabKey pipeline. See R Report Builder for details on how to execute scripts via the pipeline.

8. Does R Create Security Risks?

Allowing the use of R scripts/reports on a server can be a security risk. A developer could write a script that could read or write any file stored in any SiteRoot, fileroot or pipeline root despite the LabKey security settings for that file.

A user must have at least the Author role, as well as either the Platform Developer or Trusted Analyst role, to write an R script or report to be used on the server.

R should not be used on a "shared server", that is, a server where users with admin/developer privileges in one project do not have permissions on other projects. Running R on the server could pose a security threat if the user attempts to access the server command line directly. The main way to execute a system command in R is via the 'system(<system call>)' method that is part of the R core package. The threat is due to the permission level of a script being run by the server possibly giving unwanted elevated permissions to the user.

Administrators should investigate sandboxing their R configuration as a software management strategy to reduce these risks. Note that the LabKey Server will not enforce or confirm such sandboxing, but offers the admin a way of announcing that an R configuration is safe.

9. Any good sources for advice on R scripting?

  • R Graphics Basics: Plot area, mar (margins), oma (outer margin area), mfrow, mfcol (multiple figures): provides good advice on how to make plots look spiffy.
  • R graphics overview: this PowerPoint provides nice visuals for explaining various graphics parameters in R.
  • Bioconductor course materials: lectures and labs cover the range, from introductory R to advanced genomic analysis.

10. Graphics File Formats

If you don’t know which graphics file format to use for your plots, the following overview can help you narrow down your options.

.png and .gif

Graphics shared over the web do best in png when they contain regions of monotones with hard edges (e.g., typical line graphs). The .gif format also works well in such scenarios, but it is not supported in the default R installation because of patent issues. The GDD package allows you to create gifs in R.
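
As a sketch, assuming the GDD package is installed on your server (mirroring the GDD() example in the substitutions reference above), a downloadable .gif could be produced like this:

library(GDD);
GDD(file="${fileout:labkeyl.gif}", type="gif");
plot(c(rep(25,100), 26:75), c(1:100, rep(1, 50)), ylab= "L", xlab="LabKey",
xlim= c(0, 100), ylim=c(0, 100), main="LabKey in R");
dev.off();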

.jpeg

Pictures with gradually varying tones (e.g., photographs) are successfully packaged in the jpeg format for use on the web.

.pdf and .ps or .eps

Use pdf or postscript when you aim to output a graph that can be accessed in isolation from your R report.



Premium RStudio Integration


Premium Feature — Available in the Professional and Enterprise Editions of LabKey Server. Learn more or contact LabKey.

The RStudio module lets you design and save reports using RStudio, instead of LabKey Server's built-in R report designer.

There are two options for using RStudio with LabKey.

  • RStudio:
    • A single-user RStudio server running locally.
    • Follow the instructions in this topic to configure and launch RStudio: Connect to RStudio
  • RStudio Workbench (previously RStudio Server Pro):
    • If you have an RStudio Workbench license, you can run multiple sessions per user, and multiple versions of R.
    • First, the RStudio Workbench server must be set up and configured correctly, as described in this topic: Set Up RStudio Workbench
    • Then, integrate RStudio Workbench with LabKey following this topic: Connect to RStudio Workbench
Once either RStudio or RStudio Workbench is configured and enabled using one of the above methods, you can use it to edit R reports (see Edit R Reports in RStudio) and directly export data into a named variable (see Export Data to RStudio). With further configuration, you can also access the LabKey schema through RStudio or RStudio Workbench.

Related Topics




Connect to RStudio


Premium Feature — This feature supports RStudio, which is part of the Professional and Enterprise Editions of LabKey Server. Learn more or contact LabKey for details.

This topic explains how administrators can set up communication between LabKey Server and an RStudio instance packaged inside a Docker image. As part of the configuration, we recommend that you allow Docker, not LabKey Server, to manage the RStudio data using Docker Volumes. This avoids a number of issues with file sharing, permissions, path mapping, and security, making setup much easier.

Install Docker

Docker is available for Windows 10, MacOS, and a variety of Linux platforms. The Docker installation documentation is available here:

Follow the instructions for installation on your platform.
  • On Windows, we recommend "Docker Desktop for Windows".
  • On OSX, "Docker Toolbox" is known to work.
Test your Docker installation using an example container, for example: 'docker run hello-world' (https://docs.docker.com/engine/tutorials/dockerizing/)

Connect Using UNIX SOCKET or TCP

Depending on your platform and server mode, you have several connection options:

Connect using UNIX SOCKET

On Mac and Linux, you can connect directly via the docker socket, usually at

unix:///var/run/docker.sock

Connect using TCP - Developer mode

If you are developing or testing locally, you can skip setting up TLS if both of the following are true:

  • Use port 2375 instead of 2376
  • Run with devmode=true
Otherwise a secure connection is required, as described in the next section, whether the docker server is local or remote.

Connect using TCP and TLS - Production mode

In production you must set up TLS if you are connecting over the network, and you may also want to set it up for certain kinds of development. Set up TLS following these steps:

  • See https://docs.docker.com/engine/security/https/
  • An example TLS and certificate configuration for LabKey Server is available in this companion topic: Set Up Docker with TLS.
  • After certificates are set up, find the values of the following environment variables. You will use these values in the Configure Docker step below. (On Linux run 'env | grep DOCKER'.)
    • DOCKER_CERT_PATH
    • DOCKER_HOST
  • On an Ubuntu system modify the "/etc/default/docker" file to include the following:
DOCKER_TLS_VERIFY=1

DOCKER_HOST=tcp://$HOST:2376

DOCKER_CERT_PATH=<YOUR_CERT_PATH>

DOCKER_OPTS="--dns 8.8.8.8 --tlsverify --tlscacert=$DOCKER_CERT_PATH/ca.pem --tlscert=$DOCKER_CERT_PATH/server-cert.pem --tlskey=$DOCKER_CERT_PATH/server-key.pem -H=0.0.0.0:2376"

Build the LabKey/RStudio Image

  • Clone the docker-rstudio repository from github: https://github.com/LabKey/docker-rstudio (This is not a module and should not be cloned into your LabKey server enlistment.)
  • cd to the rstudio-base directory:
cd docker-rstudio/images/labkey/rstudio-base
  • Build the image from the source files.
    • On Linux:
      ./make
    • On Windows
      make.bat
  • Test by running the following:
docker run labkey/rstudio-base

You may need to provide a password (generate a Session Key to use as a password) to perform this test, but note that this test is simulating how RStudio will be run from LabKey and you do not need to leave this container running or use a persistent password or API key.

docker run -e PASSWORD=<Your_Session_Key> -p 8787:8787 labkey/rstudio-base
  • Check console output for errors.
  • Press Ctrl-C to return to the command prompt once you see the 'done' message.
  • Stop the test container. Note that docker stop takes a container ID (shown by docker ps), not an image name:
docker ps
docker stop <container_id>

Enable Session Keys

Your server must be configured to accept session keys in order for token authentication to RStudio to work. One-time access codes will be generated for you.

You do not need to generate any session keys yourself to use RStudio.

Check Base Server URL

  • Go to (Admin) > Site > Admin Console.
  • Under Configuration, click Site Settings.
  • The Base Server URL cannot be localhost. For a local development machine, you can substitute your machine's actual IP address (obtain via ipconfig) in this entry.
    • Note that you may have to log out of "localhost" and log in to the actual IP address to use RStudio from a local server.

Configure Docker

For details, see Configure Docker Host.

  • If you have successfully created the Docker image, it will be listed under Docker Volumes.
  • When configuring Docker for RStudio, you can skip the prerequisite check for user home directories.
  • The configuration page will show a list of existing volumes and corresponding user emails. The email field is blank if the volume doesn't correspond to an existing user.

Configure RStudio

  • Go to (Admin) > Site > Admin Console.
  • Under Premium Features click RStudio Settings.
  • Select Use Docker Based RStudio.

Configure the fields as appropriate for your environment, then click Save.

  • Image Configuration
    • Docker Image Name: Enter the Docker image name to use for RStudio. Default is 'labkey/rstudio', which includes Rlabkey.
    • Port: Enter port the RStudio Server will listen on inside Docker. Default is 8787.
    • Mount Volume: Enter the volume inside the Docker container to mount as the RStudio user's home directory. Default is '/home/rstudio'.
    • Host library directory (read-only): Optional read-only mount point at /usr/lib/R/library
    • Additional mount: host/container directories (read-only): Optional read-only mount points
    • User for Containers: Optional user to use for running containers inside docker
    • AppArmor Profile: If needed
    • Local IP address for LabKey Server: Set to the IP address of the LabKey Server from within your docker container. This may be left blank if the normal DNS lookup works.
  • Resource Configuration: Customize defaults if needed.
    • Max User Count: Default is 10
    • Stop Container Timeouts (in hours): Default is 1.0. Click to Stop Now.
    • Delete Container Timeouts (in hours): Default is 120.0. Click to Delete Now.
  • RStudio initialization configuration:
Click Save when finished. A button to Restore Defaults is also available.

Enable RStudio in Folders

The modules docker and rstudio must be enabled in each folder where you want RStudio to be available. Select (Admin) > Folder > Management, select the Folder Type tab and make sure both modules are checked in the right column. Click Update Folder to save any changes.

Once RStudio is enabled, you can use the instructions in these topics to edit reports and export data:

Related Topics




Set Up Docker with TLS


Premium Feature — Docker supports RStudio, which is part of the Professional and Enterprise Editions of LabKey Server. Learn more or contact LabKey for details.

This topic outlines an example TLS and certificate configuration used to set up Docker for use with RStudio in LabKey Server. The main instructions for configuring Docker and RStudio are in this topic: Connect to RStudio

On a development machine, you do not have to set up TLS if both of the following are true:
  • Use port 2375 instead of 2376
  • Run with devmode=true
Otherwise a secure connection is required, whether the docker server is local or remote.

Installation Instructions for Docker Daemon

First, identify where the server will run the Docker Daemon on the Test Environment.

The Docker documentation includes the following recommendation: "...if you run Docker on a server, it is recommended to run exclusively Docker in the server, and move all other services within containers controlled by Docker. Of course, it is fine to keep your favorite admin tools (probably at least an SSH server), as well as existing monitoring/supervision processes (e.g., NRPE, collectd, etc)."

The options available include:

  1. Create a New Instance in the Test Environment for the Docker Daemon. This option is preferred in an ideal world, as it is the most secure and follows all Docker best practices. If a runaway RStudio process were to overwhelm the instance's resources it would not affect Rserve and the Web Server processes and thus the other applications would continue to function properly.
  2. Install and Run the Docker Daemon on the Rserve instance. This allows quick installation and configuration of Docker in the Test Environment.
  3. Install and Run the Docker Daemon on the Web Server.
For the Test Environment, option #2 is recommended as this allows us to test the RStudio features using a configuration similar to what will be used in Production.
  • If during the testing, we begin to see resource contention between RStudio and Rserve processes, we can change the Docker Daemon configuration to limit resources used by RStudio or request a new instance to run Docker.
  • If a runaway RStudio process overwhelms the instance's resources it will not affect the Web Server processes and thus the other applications will continue to function properly.

Install the Docker Daemon

Install the Docker Daemon on the Rserve instance by running the following commands:

sudo apt-get update 
sudo apt-get install apt-transport-https ca-certificates
sudo apt-key adv --keyserver hkp://p80.pool.sks-keyservers.net:80 --recv-keys 58118E89F3A912897C070ADBF76221572C52609D

sudo vi /etc/apt/sources.list.d/docker.list
[Added]
deb https://apt.dockerproject.org/repo ubuntu-trusty main

sudo apt-get update
sudo apt-get purge lxc-docker
sudo apt-get install linux-image-extra-$(uname -r) linux-image-extra-virtual
sudo apt-get install docker-engine

IMPORTANT: Do not add any users to the docker group (members of the docker group can access dockerd Linux socket).

Create TLS Certificates

The TLS certificates are used by the LabKey Server to authenticate to the Docker Daemon process.

Create the directory that will hold the CA certificate/key and the Client certificate/key. You can use a different directory if you want than the one shown below. This is the value of "DOCKER_CERT_PATH":

sudo su - 
mkdir -p /labkey/apps/ssl
chmod 700 /labkey/apps/ssl

Create the Certificate Authority private key and certificate

Create the CA key and CA certificate. Configure the certificate to expire in 10 years.

openssl genrsa -out /labkey/apps/ssl/ca-key.pem 4096
openssl req -x509 -new -nodes -key /labkey/apps/ssl/ca-key.pem -days 3650 -out /labkey/apps/ssl/ca.pem -subj '/CN=docker-CA'

Create an openssl.cnf configuration file to be used by the CA.

vi /labkey/apps/ssl/openssl.cnf 
[added]

[req]
req_extensions = v3_req
distinguished_name = req_distinguished_name
[req_distinguished_name]
[ v3_req ]
basicConstraints = CA:FALSE
keyUsage = nonRepudiation, digitalSignature, keyEncipherment
extendedKeyUsage = serverAuth, clientAuth

Create the Client TLS certificate

Create the client certificates and key. The client certificate will be good for 10 years.

openssl genrsa -out /labkey/apps/ssl/client-key.pem 4096
openssl req -new -key /labkey/apps/ssl/client-key.pem -out /labkey/apps/ssl/client-cert.csr -subj '/CN=docker-client' -config /labkey/apps/ssl/openssl.cnf
openssl x509 -req -in /labkey/apps/ssl/client-cert.csr -CA /labkey/apps/ssl/ca.pem -CAkey /labkey/apps/ssl/ca-key.pem -CAcreateserial -out /labkey/apps/ssl/client-cert.pem -days 3650 -extensions v3_req -extfile /labkey/apps/ssl/openssl.cnf

Create the TLS certificates to be used by the Docker Daemon

Create the directory from which the docker process will read the TLS certificate and key files.

mkdir /etc/docker/ssl 
chmod 700 /etc/docker/ssl/

Copy over the CA certificate.

cp /labkey/apps/ssl/ca.pem /etc/docker/ssl

Create the openssl.cnf configuration file to be used by the Docker Daemon. Note: Values for Test and Production are shown separated by "|" pipes. Keep only the option needed.

vi /etc/docker/ssl/openssl.cnf 
[added]

[req]
req_extensions = v3_req
distinguished_name = req_distinguished_name
[req_distinguished_name]
[ v3_req ]
basicConstraints = CA:FALSE
keyUsage = nonRepudiation, digitalSignature, keyEncipherment
extendedKeyUsage = serverAuth, clientAuth
subjectAltName = @alt_names

[alt_names]
DNS.1 = yourtestweb | yourprodweb
DNS.2 = yourtestrserve | yourprodrserve
IP.1 = 127.0.0.1
IP.2 = 10.0.0.87 | 10.10.0.37

Create the key and certificate to be used by the Docker Daemon.

openssl genrsa -out /etc/docker/ssl/daemon-key.pem 4096
openssl req -new -key /etc/docker/ssl/daemon-key.pem -out /etc/docker/ssl/daemon-cert.csr -subj '/CN=docker-daemon' -config /etc/docker/ssl/openssl.cnf
openssl x509 -req -in /etc/docker/ssl/daemon-cert.csr -CA /etc/docker/ssl/ca.pem -CAkey /labkey/apps/ssl/ca-key.pem -CAcreateserial -out /etc/docker/ssl/daemon-cert.pem -days 3650 -extensions v3_req -extfile /etc/docker/ssl/openssl.cnf

Set the correct permission on the certificates.

chmod 600 /etc/docker/ssl/*

Change Docker Daemon Configuration

Note: Values for Test and Production are shown separated by "|" pipes. Keep only the option needed.

Change the Docker Daemon to use your preferred configuration. The changes are:

  • The daemon will listen on the Linux socket at "/var/run/docker.sock" and on "tcp://10.0.1.204 | 10.10.1.74:2376"
  • Use TLS certificates on the TCP socket for encryption and authentication
  • Turn off inter-container communication
  • Set default limits on container usage and enable userland-proxy
These options are set by creating the daemon.json configuration file at:
/etc/docker/daemon.json

vi /etc/docker/daemon.json 
Add the following content:
{
"icc": false,
"tls": true,
"tlsverify": true,
"tlscacert": "/etc/docker/ssl/ca.pem",
"tlscert": "/etc/docker/ssl/daemon-cert.pem",
"tlskey": "/etc/docker/ssl/daemon-key.pem",
"userland-proxy": false,
"default-ulimit": "nofile=50:100",
"hosts": ["unix:///var/run/docker.sock", "tcp://10.0.1.204 | 10.10.1.74:2376"]
}

Start the Docker daemon by running:

service docker start

You can test whether Docker is running:

docker info

If it is running, you will see something similar to:

Containers: 0
Running: 0
Paused: 0
Stopped: 0
Images: 0
Server Version: 1.12.1
Storage Driver: aufs
Root Dir: /var/lib/docker/aufs
Backing Filesystem: extfs
Dirs: 0
Dirperm1 Supported: false
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: host overlay bridge null
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Security Options: apparmor
Kernel Version: 3.13.0-92-generic
Operating System: Ubuntu 14.04.4 LTS
OSType: linux
Architecture: x86_64
CPUs: 1
Total Memory: 489.9 MiB
Name: myubuntu
ID: WK6X:HLMO:K5IQ:MENK:ALKP:JN4Q:ALYL:32UC:Q2OD:ZNFG:XLZJ:4KPA
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
WARNING: No swap limit support
Insecure Registries:
127.0.0.0/8

You can test whether the TLS configuration is working by running:

docker -H 10.0.1.204 | 10.10.1.74:2376 --tls --tlscert=/labkey/apps/ssl/client-cert.pem --tlskey=/labkey/apps/ssl/client-key.pem  ps -a

This should output:

CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS              PORTS               NAMES

Install AppArmor Configuration to be used by Docker Containers

Create a new custom profile named "docker-labkey-myserver".

vi /etc/apparmor.d/docker-labkey-myserver
Add the following content:

#include <tunables/global>


profile docker-labkey-myserver flags=(attach_disconnected,mediate_deleted) {

#include <abstractions/base>

network inet tcp,
network inet udp,
deny network inet icmp,
deny network raw,
deny network packet,
capability,
file,
umount,

deny @{PROC}/* w, # deny write for all files directly in /proc (not in a subdir)
# deny write to files not in /proc/<number>/** or /proc/sys/**
deny @{PROC}/{[^1-9],[^1-9][^0-9],[^1-9s][^0-9y][^0-9s],[^1-9][^0-9][^0-9][^0-9]*}/** w,
deny @{PROC}/sys/[^k]** w, # deny /proc/sys except /proc/sys/k* (effectively /proc/sys/kernel)
deny @{PROC}/sys/kernel/{?,??,[^s][^h][^m]**} w, # deny everything except shm* in /proc/sys/kernel/
deny @{PROC}/sysrq-trigger rwklx,
deny @{PROC}/mem rwklx,
deny @{PROC}/kmem rwklx,
deny @{PROC}/kcore rwklx,

deny mount,

deny /sys/[^f]*/** wklx,
deny /sys/f[^s]*/** wklx,
deny /sys/fs/[^c]*/** wklx,
deny /sys/fs/c[^g]*/** wklx,
deny /sys/fs/cg[^r]*/** wklx,
deny /sys/firmware/efi/efivars/** rwklx,
deny /sys/kernel/security/** rwklx,


# suppress ptrace denials when using 'docker ps' or using 'ps' inside a container
ptrace (trace,read) peer=docker-labkey-myserver,


# Rules added by LabKey to deny running executables and accessing files
deny /bin/dash mrwklx,
deny /bin/bash mrwklx,
deny /bin/sh mrwklx,
deny /usr/bin/top mrwklx,
deny /usr/bin/apt* mrwklx,
deny /usr/bin/dpkg mrwklx,

deny /bin/** wl,
deny /boot/** wl,
deny /dev/[^null]** wl,
deny /lib/** wl,
deny /lib64/** wl,
deny /media/** wl,
deny /mnt/** wl,
deny /opt/** wl,
deny /proc/** wl,
deny /root/** wl,
deny /sbin/** wl,
deny /srv/** wl,
deny /sys/** wl,
deny /usr/[^local]** wl,
deny /teamcity/** rwklx,
deny /labkey/** rwklx,
deny /share/files/** rwklx,

}

Load the new profile.

apparmor_parser -r -W /etc/apparmor.d/docker-labkey-myserver

Now that the profile is loaded, we can force the container to use the profile by specifying it at run time, for example:

docker run --security-opt "apparmor=docker-labkey-myserver" -i -t  ubuntu /bin/bash

When starting a new Docker container manually, you should always use the "docker-labkey-myserver" profile. This is done by specifying the following option whenever a container is started:

--security-opt "apparmor=docker-labkey-myserver"

Install New Firewall Rules

Install a file containing new firewall rules to block certain connections from running containers. Note: Values for Test and Production are shown separated by "|" pipes. Keep only the option needed.

mkdir /etc/iptables.d
vi /etc/iptables.d/docker-containers
Add the following content:

*filter
:INPUT ACCEPT
:FORWARD ACCEPT
:OUTPUT ACCEPT

-A INPUT -i eth0 -p tcp --destination-port 2376 -s 10.0.0.87 | 10.10.0.37/32 -j ACCEPT
-A INPUT -i docker0 -p tcp --destination-port 2376 -s 172.17.0.0/16 -j REJECT
-A INPUT -p tcp --destination-port 2376 -j REJECT
-A INPUT -i docker0 -p tcp --destination-port 6312 -s 172.17.0.0/16 -j REJECT
-A INPUT -i docker0 -p tcp --destination-port 22 -s 172.17.0.0/16 -j REJECT
-A INPUT -i docker0 -p tcp --destination-port 111 -s 172.17.0.0/16 -j REJECT
-A INPUT -i docker0 -p udp --destination-port 111 -s 172.17.0.0/16 -j REJECT
-A INPUT -i docker0 -p tcp --destination-port 1110 -s 172.17.0.0/16 -j REJECT
-A INPUT -i docker0 -p udp --destination-port 1110 -s 172.17.0.0/16 -j REJECT
-A INPUT -i docker0 -p tcp --destination-port 2049 -s 172.17.0.0/16 -j REJECT
-A INPUT -i docker0 -p udp --destination-port 2049 -s 172.17.0.0/16 -j REJECT
-A INPUT -i docker0 -p tcp --destination-port 4045 -s 172.17.0.0/16 -j REJECT
-A INPUT -i docker0 -p udp --destination-port 4045 -s 172.17.0.0/16 -j REJECT
-I DOCKER 1 -i eth0 ! -s 10.0.0.87/32 -j DROP
-I FORWARD 1 -i docker0 -p tcp --destination-port 80 -j ACCEPT
-I FORWARD 2 -i docker0 -p tcp --destination-port 443 -j ACCEPT
-I FORWARD 3 -i docker0 -p tcp --destination-port 53 -j ACCEPT
-I FORWARD 4 -i docker0 -p udp --destination-port 53 -j ACCEPT
-I FORWARD 5 -i docker0 -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT
-I FORWARD 6 -i docker0 -j REJECT
COMMIT

Create the Systemd Configuration

Install 'systemd', then create the systemd configuration file to ensure iptables rules are applied at startup. For more details see: https://docs.docker.com/config/daemon/systemd/

# Create a systemd drop-in directory for the docker service 
mkdir /etc/systemd/system/docker.service.d

# Create a service file:
# vi /etc/systemd/system/docker.service.d/iptables.conf
# Add the content below:
[Unit]
Description = Load iptables firewall changes

[Service]
ExecStartPost=/sbin/iptables-restore --noflush /etc/iptables.d/docker-containers

# flush changes
systemctl daemon-reload

# restart docker
systemctl restart docker

# check that iptables.conf was run
systemctl status docker.service

# check iptables
iptables -L

On systems using upstart instead of systemd, start the new service with:

initctl reload-configuration
initctl start iptables.service

Changes to RServe Instance to Support RStudio Usage

In order to support file sharing between the RStudio process and the LabKey Server we will need to:

  1. Create new user accounts on the Rserve instance
  2. Create new groups on the Rserve instance
  3. Modify permissions on the /share/users directory

Add new groups and user accounts to the Rserve instance

Create a new group named "labkey-docker"; this group facilitates the file sharing.

sudo groupadd -g 6000 labkey-docker

Create the new RStudio user. This will be the OS user which runs the RStudio session.

sudo useradd -u 6005 -m -G labkey-docker rstudio

Create the directory which will hold the Home Directory Fileroots used by each server

We will use ACLs to ensure that newly created files/directories have the correct permissions. The ACLs only need to be set on the Rserve operating system. They do not need to be specified or changed on the container.

These ACLs should only be specified on the LabKey Server Home Directory FileRoot. Never set these ACLs on any other directory on the Rserve. Sharing files between the docker host and container should only be done in the LabKey Server Home Directory FileRoot.

Install required package:

sudo apt-get install acl

Create the home directory fileroot and set the correct permissions using the newly created accounts. These instructions assume:

  1. Tomcat server is being run by the user "myserver"
  2. LabKey Server Home Directory FileRoot is located at "/share/users"
Run the following commands:

setfacl -R -m u:myserver:rwx /share/users
setfacl -Rd -m u:myserver:rwx /share/users
setfacl -Rd -m u:rstudio:rwx /share/users
setfacl -Rd -m g:labkey-docker:rwx /share/users

Changes to Web Server Instance to Support RStudio Usage

In order to support file sharing between the RStudio process and the LabKey Server we will need to:

  1. Create new user accounts on the web server instance
  2. Create new groups on the web server instance
  3. Ensure the NFS share "/share" is mounted with the "acl" option.

Add new groups and user accounts to the web server instance

Create the new group named "labkey-docker" to facilitate file sharing.

sudo groupadd -g 6000 labkey-docker

Create the new RStudio user. This will be the OS user which runs the RStudio session.

sudo useradd -u 6005 -m -G labkey-docker rstudio

Related Topics




Connect to RStudio Workbench


Premium Feature — This feature supports RStudio, which is part of the Professional and Enterprise Editions of LabKey Server. Learn more or contact LabKey for details.

RStudio Workbench (previously RStudio Server Pro) supports running multiple RStudio sessions per user, and multiple versions of R simultaneously.

Prerequisite: Set Up RStudio Workbench

To set up RStudio Workbench to connect to LabKey, follow the instructions and guidance in this topic:

Once RStudio Workbench is correctly configured, the RStudio Workbench Admin should provide you with:
  • The absolute URL of RStudio Workbench
  • The Security Token customized to authenticate LabKey. Default is 'X-RStudio-Username'
You may also wish to confirm that RStudio Workbench was restarted after setting the above.

Version Compatibility

LabKey Server versions 22.3 and 22.7 have both been confirmed to work with RStudio Workbench version 2022.02.2.

Configure RStudio Workbench in the Admin Console

  • Select (Admin) > Site > Admin Console.
  • Under Configuration, click Site Settings.
  • Set the Base Server URL (i.e. not "localhost").
  • Click Save.
  • Click Settings.
  • Under Premium Features, click RStudio Settings.
  • Select Use RStudio Workbench.

Enter the Configuration values:

  • RStudio Workbench URL: The absolute URL of RStudio Workbench.
  • Authentication: Select either Email or Other User Attribute:
    • Email: Provide the User Email Suffix: Used for LabKey to RStudio Workbench user mapping. The RStudio Workbench user name is derived from removing this suffix (Ex. "@labkey.com") from the user's LabKey Server email address.
    • Other User Attribute: For example, LDAP authentication settings can be configured to retrieve user attributes at authentication time. Provide the Attribute Name for this option.
  • Security Token: The HTTP header name that RStudio is configured to use to indicate the authenticated user. Default is 'X-RStudio-Username'.
  • Initialize Schema Viewer: Check the box to enable the viewing of the LabKey schema from within RStudio. See Advanced Initialization of RStudio for more information.

Testing RStudio Workbench Connection

Before saving, you can use Test RStudio Workbench to confirm the connection and security token. When successful the message will read:

RStudio Workbench server connection test succeeded. Launch RStudio to verify $rStudioUserName$ is a valid RStudio user.

Proceed to launch RStudio Workbench via (Admin) > Developer Links > RStudio Server. If the rstudioUserName you provided is valid, you will be logged in successfully. If not, you will see a warning message provided by RStudio Workbench.

Note that previous versions of RStudio Server Workbench/Pro allowed LabKey to determine the validity of the username. Find documentation for earlier versions in our archives: Testing RStudio Workbench/Pro Connections

Troubleshooting unsuccessful results of Test RStudio Workbench:

  • An error response may indicate the incorrect URL for RStudio Workbench was entered, or that URL is unreachable. Try reaching the URL in a browser window outside of LabKey.
  • A warning response may be due to an invalid security token.

Launch RStudio Workbench

  • Select (Admin) > Developer Links > RStudio Server.

When you launch RStudio Workbench, you will see your home page. If you have multiple sessions, click on one to select it.

Now you can use RStudio Workbench to edit R reports and export data. Learn more in these topics:

Related Topics




Set Up RStudio Workbench


Premium Feature — This feature supports RStudio, which is part of the Professional and Enterprise Editions of LabKey Server. Learn more or contact LabKey for details.

This topic covers setting up and configuring RStudio Workbench (previously RStudio Server Pro) for use with LabKey Server. RStudio Workbench runs as a separate server and, unlike RStudio, can be shared by many simultaneous users. On the machine where RStudio Workbench is running, the R language must also be installed and the Rlabkey package loaded.

Typically the steps on this page would be completed by a system administrator, though not necessarily the LabKey site administrator. Once the steps described here have been completed on the RStudio Workbench machine, the LabKey Server can be configured to connect to RStudio Workbench as described in Connect to RStudio Workbench. Key pieces of information from this setup must be communicated to the LabKey admin.

Environment Setup

AWS Notes

  1. If you're setting this server up in AWS, you will want to set it up in a public subnet in the same VPC as the destination server. It will need a public-facing IP/EIP.
  2. For AWS, the following ports need to be open, inbound and outbound, for this install to work. Set your security group/NACLs accordingly:
    • 80
    • 443
    • 1080 (license server)
    • 8787 (initial connection)
    • 1024-65535 (license server)

Linux and Ubuntu Setup

If your RStudio Workbench host machine is running an OS other than Linux/Ubuntu, you can configure a VM to provide a local Linux environment in which you can run Ubuntu. If your host is Linux running Ubuntu, you can skip this section.

Steps to set up Ubuntu on a non-Linux host OS:

  1. VM: Use Virtual Box to set up a local Linux environment.
  2. Linux: Install the Linux OS in that VM.
  3. Ubuntu: Download and install the latest Ubuntu (version 12.04 or higher is required) in the VM.
  4. Configure the VM network so that the VM has an IP and can be accessed by the host. For example: VM > Settings > Network > Attached To > Bridged Adapter

R Language Setup (for Ubuntu)

Use the following steps to set up R for Ubuntu. For other versions of Linux, follow the instructions here instead: http://docs.rstudio.com/ide/server-pro/index.html#installation

1. Configure CRAN Repository. Add the following entry to your /etc/apt/sources.list file, substituting the correct name for your Ubuntu version where this example uses 'xenial' (the name for 16.x):
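A typical CRAN entry for xenial, assuming the standard CRAN mirror layout (substitute your own mirror and your version's codename as needed):

deb https://cloud.r-project.org/bin/linux/ubuntu xenial/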

If you are not sure what name your Ubuntu version is using, you can find it in other lines of the sources.list file.

Linux Tip: to open a system file in Linux for editing, use:
$ gksudo gedit  /etc/apt/sources.list
To add edit permission to a file:
$ chmod 777 /etc/apt/sources.list

2. Install the latest version of R

apt-get update
apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv-keys E084DAB9
sudo apt-get install r-base

3. Install Rlabkey and other R packages

  • If a library is to be used by RStudio as a system library (instead of a user library), run with 'lib' set as shown here:
    $ sudo R
    > install.packages("Rlabkey", lib="/usr/local/lib/R/site-library")
    > q()
  • If a library has dependencies:
    > install.packages("package", lib="/usr/local/lib/R/site-library", dependencies=TRUE)
    > q()
Trouble installing Rlabkey on Linux? Try installing dependencies explicitly:
$ sudo apt-get install libssl-dev
$ sudo apt-get install libcurl4-openssl-dev

Install RStudio Workbench

1. Download and install RStudio Workbench as instructed when you purchased it. For demo purposes, you can use the free 45-day trial license and download from the RStudio download page.

$ sudo apt-get install gdebi-core
$ wget https://download2.rstudio.org/rstudio-server-pro-1.1.447-amd64.deb
(nb: this URL may not be current. Check the RStudio download page for the latest version.)

$ gdebi rstudio-server-pro-1.1.447-amd64.deb (substitute current version)

2. Activation: (Optional)

$ sudo rstudio-server license-manager activate ACTIVATION_KEY

This function requires high ports to be world-accessible. If this command fails or times out, check your security groups or NACLs first.

3. Verify RStudio Workbench Installation

  1. Create an Ubuntu user:
    useradd -m -s /bin/bash username
    passwd username
  2. Get Ubuntu's IP address:
    $ ifconfig
  3. In the VM browser, go to localhost:8787, or ip:8787
  4. You should be on the RStudio Workbench login page; log in using your Ubuntu credentials
4. Verify accessing RStudio Workbench from the host machine

In the host machine's browser, go to your VM's IP address. Depending on how you configured your license, it might look something like "http://ubuntuVmIPAddress:8787". You should see the RStudio login page and be able to log in.

Troubleshoot: did you configure the VM to use a Bridged Adapter?

5. Create a Route 53 DNS name for the server. You will need this for the next steps.

Configure RStudio Workbench Proxy

The RStudio Workbench administrator must enable proxy authentication on RStudio Workbench to use LabKey Server as the authenticator. The following steps will configure the proxy. See below for the details that must then be used when configuring LabKey Server to use this proxy.

1. You will need to create an SSL cert:

  1. openssl req -newkey rsa:2048 -nodes -keyout key.pem -x509 -days 3650 -out certificate.pem
  2. Copy the key.pem file to /etc/ssl/private/
  3. Copy the certificate.pem file to /etc/ssl/certs
  4. Edit /etc/rstudio/rstudio-server.conf and add the following lines:
    ssl-enabled=1
    ssl-certificate=/etc/ssl/certs/certificate.pem
    ssl-certificate-key=/etc/ssl/private/key.pem
  5. rstudio-server stop
  6. rstudio-server start
2. Use LabKey Server as proxy authenticator for RStudio Workbench by adding the following lines to /etc/rstudio/rserver.conf, supplying your own server name:
auth-proxy=1
auth-proxy-sign-in-url=https://YOUR_SERVER_NAME/rstudio-startPro.view?method=LAUNCH

3. Restart RStudio Workbench so that these configuration changes are effective by running

$ sudo rstudio-server restart

4. Follow the instructions here for downloading the cert: https://stackoverflow.com/questions/21076179/pkix-path-building-failed-and-unable-to-find-valid-certification-path-to-requ

5. Upload the saved cert to the destination server

6. On the destination LabKey Server (nb: this will require restarting the LabKey instance):

  1. Log in.
  2. sudo su -
  3. keytool -import -keystore /usr/lib/jvm/jdk1.8.0_151/jre/lib/security/cacerts -alias <RStudio hostname, but not the FQDN> -file <whatever you named the saved cert>.crt (nb: you will be prompted for a password; ask Ops for it)
  4. service tomcat_lk restart
7. Verify proxy redirect: Go to the RStudio Workbench URL to verify the page redirects to LabKey. Note: Once SSL is enabled, 8787 no longer works.

8. Customize Security Token

Define a custom security token for the proxy. This will be used to indicate the authenticated user when LabKey Server connects to RStudio Workbench. For example, to define the default 'X-RStudio-Username' token:

$ sudo sh -c "echo 'X-RStudio-Username' > /etc/rstudio/secure-proxy-user-header"
$ sudo chmod 0600 /etc/rstudio/secure-proxy-user-header

Information to Pass to LabKey Administrator

To configure LabKey Server to connect to RStudio Workbench, the admin will need:

(Optional) Add RStudio Workbench Users

LabKey users are mapped to RStudio users by removing the configured "email suffix". For example: LabKey user XYZ@email.com is mapped to RStudio user name XYZ

Adding RStudio Workbench Users

RStudio Workbench uses Linux's PAM authentication, so any user accounts configured for Linux will be able to access RStudio Workbench. Users can be added for RStudio using standard Linux commands such as adduser or useradd.

The following commands add a user “rstudio1” to Linux (and hence RStudio Workbench).

$ sudo useradd -m -d /home/rstudio1 rstudio1
$ sudo passwd rstudio1

IMPORTANT: Each user must have a home directory associated with them, otherwise they won't be able to access RStudio, since RStudio works out of this home directory.

Verify Access

Verify access to RStudio Workbench for the newly added user:

Go to ip:8787, enter the username and password for the newly created user, and verify that the user can log in to RStudio successfully.

Troubleshooting a "permission denied" error on login:

  1. Run pamtester to rule out PAM issue
    $ sudo pamtester --verbose rstudio <username> authenticate acct_mgmt
  2. Try adding the following line to /etc/sssd/sssd.conf (you may need to install sssd):
    ad_gpo_map_service = +rstudio
  3. Did you create the user with a specified home directory?
  4. Make sure you didn’t configure the RStudio Workbench server to limit user access by user group (the default is to allow all; check /etc/rstudio/rserver.conf), or if so make sure your user is added to an allowed group.

Troubleshooting

After making any change to the RStudio Workbench config, always run the following for the change to take effect:

$ sudo rstudio-server restart

Related Topics




Edit R Reports in RStudio


Premium Feature — This feature supports RStudio, which is part of the Professional and Enterprise Editions of LabKey Server. Learn more or contact LabKey for details.

After LabKey is configured to support either RStudio or RStudio Workbench, it can be enabled in any folder where users want to be able to edit R reports using RStudio.

Note: To edit R reports in RStudio, you must have Rlabkey v2.2.2 (or higher).

Within LabKey, open the R Report Builder. There are several paths:

  1. Edit an existing R report (shown below).
  2. Create a new report from a grid by selecting (Charts/Reports) > Create R Report.
  3. From the Data Views web part, select Add Report > R Report.
  • Open the R Report Builder.
  • Click the Source tab and notice at the bottom there is an Edit in RStudio button.
  • When you click Edit in RStudio, you will see a popup giving the information that your report is now open for editing in RStudio. Click Go To RStudio to see it.

This may be in a hidden browser window or tab. This popup will remain in the current browser until you complete work in RStudio and click Edit in LabKey to return to the LabKey report editor.

Edit in RStudio

When you open a report to edit in RStudio, the editor interface will open. After making your changes, click the Save button. Changes will be saved to LabKey.

Edit in RStudio Workbench

When you are using RStudio Workbench, if you have multiple sessions running, you will land on the home page. Select the session of interest and edit your report there.

Remember to save your report in RStudio Workbench before returning to the LabKey Server browser window and clicking Edit in LabKey.

Related Topics




Export Data to RStudio


Premium Feature — This feature supports RStudio, which is part of the Professional and Enterprise Editions of LabKey Server. Learn more or contact LabKey for details.

This topic describes how users can export data from grids for use in their RStudio or RStudio Workbench environment. For guidance in configuring your server, first complete one of the following:

Export to RStudio

On a data grid:
  • Select Export > RStudio. If you don't see this option, your server is not set up to use either RStudio or RStudio Workbench.
  • Enter a Variable Name and click Export to RStudio.
  • This will launch RStudio with your data loaded into the named variable.

Exporting to RStudio Workbench

When you are using RStudio Workbench and have multiple sessions running, clicking Export to RStudio will first show your dashboard of open sessions. Select the session to which to export the data.

Related Topics




Advanced Initialization of RStudio


Premium Feature — This feature supports RStudio, which is part of the Professional and Enterprise Editions of LabKey Server. Learn more or contact LabKey for details.

This topic describes the process for initializing RStudio to enable the viewing of the LabKey schema within RStudio or RStudio Workbench. By default this option is enabled.

Note: To use this feature, you must have the latest version of Rlabkey.

Configuration

Follow the steps for configuring LabKey to access your RStudio or RStudio Workbench instance following the instructions in one of these topics:

On the Configure RStudio page for either type of integration, scroll down and confirm that the box for Initialize Schema Viewer is checked.

Use the LabKey Schema within RStudio

The RStudio viewer will now allow you to browse the LabKey schema. Select a project from the project menu, then select a folder to see the schemas it contains, and a schema to see the queries it contains.

Create Shiny Apps

Once you have integrated either RStudio or RStudio Workbench with LabKey Server, you can create Shiny reports using LabKey data. For example, an integrated Shiny app can give the user control of a live visualization of viral load over time.

Extend Initialization

An RStudio system administrator can add or overwrite variables and settings during initialization by defining the function "labkey.rstudio.extend" in the Rprofile.site file; this hidden function is called when the RStudio session initializes. When performing this initialization, a one-time "requestId" is used, rather than a more persistent API key.

As an example, the following code defines a simple version of this function which will print a 'Welcome' message (and accepts the server's self-signed certificate):

## Define the labkey.rstudio.extend function here; it will be called on RStudio R session initialization.
## Example:
# labkey.rstudio.extend <- function() cat("\nWelcome, LabKey user!\n\n")
library("Rlabkey")
labkey.acceptSelfSignedCerts()

RStudio executes the hidden function upon initialization.

Related Topics




SQL Queries


LabKey Server provides rich tools for working with databases and SQL queries. By developing SQL queries you can:
  • Create filtered grid views of data.
  • Join data from different tables.
  • Group data and compute aggregates for each group.
  • Add a calculated column to a query.
  • Format how data is displayed using query metadata.
  • Create staging tables for building reports.
Special features include:
  • An intuitive table-joining syntax called lookups. Lookups use a convenient syntax of the form "Table.ForeignKey.FieldFromForeignTable" to achieve what would normally require a JOIN in SQL (see the sketch below the list). For details see LabKey SQL Reference.
  • Parameterized SQL statements. Pass parameters to a SQL query via a JavaScript API.
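For example, a minimal sketch using the placeholder names from the bullet above; the comments show the JOIN it replaces (the joined table and key names are hypothetical):

SELECT SomeTable.Name,
       SomeTable.ForeignKey.FieldFromForeignTable
FROM SomeTable

-- Roughly equivalent to:
-- SELECT SomeTable.Name, Other.FieldFromForeignTable
-- FROM SomeTable LEFT JOIN Other ON SomeTable.ForeignKey = Other.Key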
How are SQL queries processed? LabKey Server parses the query, performs security checks, and "cross-compiles" the query into the native dialect before passing it on to the underlying database.

Topics

Examples

Related Topics




LabKey SQL Tutorial


SQL queries are a powerful way to customize how you view and use data. In LabKey Server, queries are the main way to surface data from the database: using a query, you pick the columns you want to retrieve and optionally apply filters and sorts. SQL queries are added to the database schema alongside the original, core tables. You can query one table at a time, or create a query that combines data from multiple tables. Queries also provide staging for reports: start with a base query and build a report on top of that query. Queries can be created through the graphical user interface (as shown in this topic) or through a file-based module.

SQL Resources

LabKey Server provides a number of mechanisms to simplify SQL syntax, covered in the topics linked here:

  • LabKey SQL: LabKey SQL is a SQL dialect that translates your queries into the native syntax of the SQL database underlying your server, whether it is PostgreSQL or Microsoft SQL Server. This lets you write in one SQL dialect but communicate with many SQL database implementations.
  • Lookups: Lookups join tables together using an intuitive syntax.
  • Query Metadata: Add additional properties to a query using metadata xml files, such as: column captions, relationships to other tables or queries, data formatting, and links.
The following step-by-step tutorial shows you how to create a SQL query and begin working with it.

Permissions: To create SQL queries within the user interface, you must have both the "Editor" role (or higher) and one of the developer roles "Platform Developer" or "Trusted Analyst".

Create a SQL Query

In this example, we will create a query based on the Users table in the core schema.

  • Select (Admin) > Go To Module > Query.
  • Open the core schema in the left hand pane and select the Users table.
  • Click the button Create New Query (in the top menu bar). This tells LabKey Server to create a query; if you have selected a table first, the new query will be based on that table by default.
  • On the New Query page:
    • Provide a name for the query (in the field "What do you want to call the new query?"). Here we use "UsersQuery", though if that name is taken, you can use something else.
    • Confirm that the Users table is selected (in the field "Which query/table do you want this new query to be based on?")
    • Click Create and Edit Source.
  • LabKey Server will provide a "starter query" of the Users table: a basic SELECT statement for all of the fields in the table; essentially a duplicate of the original table, but giving you more editability (see the sketch below). Typically, you would modify this "starter query" to fit your needs, changing the columns selected, adding WHERE clauses, JOINs to other tables, or substituting an entirely new SQL statement. For now, save the "starter query" unchanged.
  • Click Save & Finish.
  • The results of the query are displayed in a data grid, similar to the below, showing the users in your system.
  • Click core Schema to return to the query browser view of this schema.
  • Notice that your new query appears on the list alphabetically with an icon indicating it is a user-defined query.
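The "starter query" is a plain SELECT over the table's columns. A minimal sketch (the generated column list will match the Users table on your server):

SELECT Users.UserId,
       Users.DisplayName,
       Users.Email
FROM Users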

Query Metadata

Like the original table, each query has accompanying XML that defines properties, or metadata, for the query. In this step we will add properties to the query by editing the accompanying XML. In particular we will:

  1. Change the data type of the UserId column, making it a lookup into the Users table. By showing a clickable name instead of an integer value, we can make this column more human-readable and simultaneously connect it to more useful information.
  2. Modify the way it is displayed in the grid. We will accomplish this by editing the XML directly.
  • Click Edit Metadata.
  • On the UserId row, click in the Data Type column where it shows the value Integer.
  • From the dropdown, select User. This will create a lookup between your query and the Users table.
  • Click Apply.
  • Scroll down, click View Data, and click Yes, Save in the popup to confirm your changes.
  • Notice that the values in the User Id column are no longer integers, but text links with the user's display name shown instead of the previous integer value.
    • This reflects the fact that User Id is now a lookup into the Users table.
    • The display column for a lookup is (by default) the first text value in the target table. The actual integer value is unchanged. If you wish, you can expose the "User Id" column to confirm this (use the grid customizer).
    • Click a value in the User Id column to see the corresponding record in the Users table.
This lookup is defined in the XML metadata document. Click back in your browser to return to the query, and let's see what the XML looks like.
  • Click core Schema to return to the Query Browser.
  • Click Edit Source and then select the XML Metadata tab.
  • The XML metadata will appear in a text editor. Notice the XML between the <fk>...</fk> tags. This tells LabKey Server to create a lookup (aka, a "foreign key") to the Users table in the core schema.
  • Next we will modify the XML directly to hide the "Display Name" column in our query. We don't need this column any longer because the User Id column already displays this information.
  • Add the following XML to the document, directly after the </column> tag (i.e., directly before the ending </columns> tag). Each <column> tag in this section can customize a different column.
<column columnName="DisplayName">
<isHidden>true</isHidden>
</column>
  • Click the Data tab to see the results without leaving the query editor (or saving the change you made).
  • Notice that the Display Name column is no longer shown.
  • Click the XML Metadata tab and now add the following XML immediately after the column section for hiding the display name. This section will display the Email column values in red.
<column columnName="Email">
<conditionalFormats>
<conditionalFormat>
<filters>
<filter operator="isnonblank"/>
</filters>
<textColor>FF0000</textColor>
</conditionalFormat>
</conditionalFormats>
</column>
  • Click the Data tab to see the change.
  • Return to the Source tab to click Save & Finish.

Now that you have a SQL query, you can display it directly by using a query web part, or use it as the basis for a report, such as an R report or a visualization. It's important to note that nothing has changed about the underlying query (core.Users) on which your user-defined one (core.UsersQuery) is based. You could further define a new query based on the user-defined one as well. Each layer of query provides a new way to customize how your data is presented for different needs.

Related Topics




SQL Query Browser


A schema is a named collection of tables and queries. The Query Browser, or Schema Browser, is the dashboard for browsing all the database data in a LabKey Server folder. It also provides access to key schema-related functionality. Using the schema browser, users with sufficient permissions can:
  • Browse the tables and queries
  • Add new SQL queries
  • Discover table/query relationships to help write new queries
  • Define external schemas to access new data
  • Generate scripts for bulk insertion of data into a new schema

Topics


Premium Features Available

With the Professional and Enterprise Editions of LabKey Server, administrators have the ability to trace query dependencies within a container or across folders. Learn more in this topic:

Browse and Navigate the Schema

Authorized users can open the Query Schema Browser via one of these pathways:

  • (Admin) > Go To Module > Query
  • (Admin) > Developer Links > Schema Browser
The browser displays a list of the available schemas, represented with a folder icon, including external schemas and data sources you have added. Your permissions within the folder determine which tables and queries you can see here. Use the Show Hidden Schemas and Queries checkbox to see those that are hidden by default.

Schemas live in a particular container (project or folder) on LabKey Server, but can be marked as inheritable, in which case they are accessible in child folders. Learn more about controlling schema heritability in child folders in this topic: Query Metadata

Built-In Tables vs. Module-Defined Queries vs. User-Defined Queries

Click the name of a schema to see the queries and tables it contains. User-defined, module-defined, and built-in queries are listed alphabetically, each category represented by a different icon. Checkboxes in the lower left can be used to control the set shown, making it easier to find a desired item on a long list.

  • Show Hidden Schemas and Queries
  • Show User-Defined Queries
  • Show Module-Defined Queries
  • Show Built-In Tables

All Columns vs. Columns in the Default Grid View

You can browse columns by clicking on a particular table or query. The image below shows the column names of a user-defined query. In the Professional and Enterprise Editions of LabKey Server, you will also see a Dependency Report here.

For a particular table or query, the browser shows two separate lists of columns:

  • All columns in this table: the top section shows all of the columns in the table/query.
  • Columns in your default view of this query: Your default grid view of the table/query may not contain all of the columns present in the table. It also may contain additional "columns" when there are lookups (or foreign keys) defined in the default view. See Step 1: Customize Your Grid View to learn about changing the contents of the "default" view.
Below these lists for Built-In Tables, the Indices of the table are listed.

Validate Queries

When you upgrade to a new version of LabKey, or change hardware or database software, you may want to validate your SQL queries. You can perform a validation check of your SQL queries by pressing the Validate Queries button on the top row of buttons in the Query Schema Browser.

Validation runs against all queries in the current folder and checks to see if the SQL queries parse and execute without errors.

  • To validate entire projects, initiate the scan from the project level and select Validate subfolders.
  • Check Include system queries for a much more comprehensive scan of internal system queries. Note that many warnings raised by this validation will be benign (such as queries against internal tables that are not exposed to end users).
  • Check the Validate metadata and views box for a more thorough validation of both query metadata and saved views, including warnings of potentially invalid conditions (like autoincrement columns set userEditable=true), errors like invalid column names, or case errors which would cause problems for case-sensitive JavaScript. Again, many warnings raised in this scan will be benign.
Note: When running broad query validation and also using the experimental feature to Block Malicious Clients, you may trigger internal safeguards against "too many requests in a short period of time". If this occurs, the server may "lock you out" reporting a "Try again later" message in any browser. This block applies for an hour and can also be released if a site administrator turns off the experimental feature and/or clears the server cache.

Export Query Validation Results

Once query validation has been run, you can click Export to download the results, making it easier to parse and search.

The exported Excel file lists the container, type (Error or Warning), Query, and Discrepancy for each item in the list.

View Raw Schema and Table Metadata

Links to View Raw Schema Metadata are provided for each schema, including linked and external schemas, giving you an easy way to confirm expected datasource, scope, and tables.

Links to View Raw Table Metadata are provided for each built in table, giving details including indices, key details, and extended column properties.

Generate Schema Export / Migrate Data to Another Schema

If you wish to move data from one LabKey Server schema to another LabKey Server schema, you can do so by generating a migration script. The system will read the source schema and generate:

  1. A set of tab-separated value (TSV) files, one for each table in the source. Each TSV file is packaged as a .tsv.gz file.
  2. A script for importing these tables into a target schema. This script must be used as an update script included in a module.
Note that the script only copies data, it does not create the target schema itself. The target schema must already exist for the import script to work.

To generate the TSV files and the associated script:

  • Go to the Schema Browser: (Admin) > Developer Links > Schema Browser.
  • Click Generate Schema Export.
  • Select the data source and schema name.
  • Enter a directory path where the script and TSV files will be written, for example: C:\temp\exports. This directory must already exist on your machine for the export to succeed.
  • Click Export.
  • The file artifacts will be written to the path you specified.

Field Descriptions

Source Data Source: The data source where the data to export resides.
Source Schema: The schema you want to export.
Target Schema: The schema where you want to import the data.
Path in Script: Optional. If you intend to place the import script and the data files in separate directories, specify a path so that the import script can find the data.
Output Directory: Directory on your local machine where the import script and the data files will be written. This directory must already exist on your machine; it will not be created for you.

The generated script consists of a series of bulkImport calls that open the .tsv.gz data files and insert them into the target schema; in this case the target schema is 'assaydata'.

SELECT core.bulkImport('assaydata', 'c17d97_pcr_data_fields', 'dbscripts/assaydata/c17d97_pcr_data_fields.tsv.gz');
SELECT core.bulkImport('assaydata', 'c15d80_rna_data_fields', 'dbscripts/assaydata/c15d80_rna_data_fields.tsv.gz');
SELECT core.bulkImport('assaydata', 'c2d326_test_data_fields', 'dbscripts/assaydata/c2d326_test_data_fields.tsv.gz');
SELECT core.bulkImport('assaydata', 'c15d77_pcr_data_fields', 'dbscripts/assaydata/c15d77_pcr_data_fields.tsv.gz');

Now you can re-import the data by adding the generated .sql script and .tsv.gz files to a module as a SQL upgrade script. For details on adding SQL scripts to modules, see Modules: SQL Scripts.

Related Topics




Create a SQL Query


Creating a custom SQL query gives you the ability to flexibly present the data in a table in any way you wish using SQL features like calculated columns, aggregation, formatting, filtering, joins, and lookups.

Permissions: To create a custom SQL query, you must have both the "Editor" role (or higher) and one of the developer roles "Platform Developer" or "Trusted Analyst".

The following steps guide you through creating a custom SQL query and view on a data table.

Create a Custom SQL Query

  • Select (Admin) > Go To Module > Query.
  • From the schema list, select the schema that includes your data table of interest.
  • Optionally select the table on which you want to base the new query.
  • Click Create New Query.
  • In the field What do you want to call the new query?, enter a name for your new query.
  • Select the base query/table under Which query/table do you want this new query to be based on?.
  • Click Create and Edit Source.
  • LabKey Server will generate a default SQL query for the selected table.
  • Table and column names including spaces must be quoted. For readability you can specify a 'nickname' for the table ("PE" in the example below) and use it in the query.
  • Click the Data tab to see the results (so far just the original table).
  • Return to the Source tab and click Save & Finish to save your query.

Refine the source of this query as desired in the SQL source editor.

For example, you might calculate the average temperature per participant as shown here:

SELECT PE.ParticipantID, 
ROUND(AVG(PE.Temperature), 1) AS AverageTemp
FROM "Physical Exam" PE
GROUP BY PE.ParticipantID

Related Topics




Edit SQL Query Source


This topic describes how to edit the SQL source for an existing query.

Permissions: To edit a custom SQL query, you must have both the "Editor" role (or higher) and one of the developer roles "Platform Developer" or "Trusted Analyst".

  • Go to (Admin) > Go To Module > Query.
  • Using the navigation tree on the left, browse to the target query and then click Edit Source.
  • The SQL source appears in the source editor.
  • Select the Data tab or click Execute Query to see the results.
  • Return to the Source tab to make your desired changes.
  • Clicking Save will check for syntax errors. If any are found, correct them and click Save again.
  • Return to the Data tab or click Execute Query again to see the revised results.
  • Click Save & Finish to save your changes when complete.

Related Topics




LabKey SQL Reference


LabKey SQL 

LabKey SQL is a SQL dialect that (1) supports most standard SQL functionality and (2) provides extended functionality unique to LabKey, including:

  • Security. Before execution, all SQL queries are checked against the user's security roles/permissions.
  • Lookup columns. Lookup columns use an intuitive syntax to access data in other tables to achieve what would normally require a JOIN statement. For example: "SomeTable.ForeignKey.FieldFromForeignTable" For details see Lookups. The special lookup column "Datasets" is injected into each study dataset and provides a syntax shortcut when joining the current dataset to another dataset that has compatible join keys. See example usage.
  • Cross-folder querying. Queries can be scoped to folders broader than the current folder and can draw from tables in folders other than the current folder. See Cross-Folder Queries.
  • Parameterized SQL statements. The PARAMETERS keyword lets you define parameters for a query.  An associated API gives you control over the parameterized query from JavaScript code. See Parameterized SQL Queries.
  • Pivot tables. The PIVOT...BY expression provides a syntax for creating pivot tables. See Pivot Queries.
  • User-related functions. USERID() and ISMEMBEROF(groupid) let you control query visibility based on the user's group membership (see the sketch following this list).
  • Ontology-related functions. (Premium Feature) Access preferred terms and ontology concepts from SQL queries. See Ontology SQL.
  • Annotations. Override some column metadata using SQL annotations. See Use SQL Annotations.
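For instance, a minimal sketch of controlling row visibility with these functions (the schema, table, and group id are hypothetical):

SELECT *
FROM SomeSchema.SomeTable
WHERE ISMEMBEROF(1234)

A query like this returns rows only when the current user belongs to the given group.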

Reference Tables:


Keywords

AS: Aliases can be explicitly named using the AS keyword. Note that the AS keyword is optional: the following select clauses both create an alias called "Name":

    SELECT LCASE(FirstName) AS Name
    SELECT LCASE(FirstName) Name

Implicit aliases are automatically generated for expressions in the SELECT list.  In the query below, an output column named "Expression1" is automatically created for the expression "LCASE(FirstName)":

    SELECT LCASE(FirstName)
    FROM PEOPLE

ASCENDING, ASC: Return results in ascending value order.

    ORDER BY Weight ASC 
CAST(AS): For example, CAST(R.d AS VARCHAR).

The following are the valid datatype keywords which can be used as cast/convert targets; each maps to a java.sql.Types name. Keywords are case-insensitive.

    BIGINT
    BINARY
    BIT
    CHAR
    DECIMAL
    DATE
    DOUBLE
    FLOAT
    GUID
    INTEGER
    LONGVARBINARY
    LONGVARCHAR
    NUMERIC
    REAL
    SMALLINT
    TIME
    TIMESTAMP
    TINYINT
    VARBINARY
    VARCHAR

Examples:

CAST(TimeCreated AS DATE)

CAST(WEEK(i.date) as INTEGER) as WeekOfYear,

DESCENDING, DESC: Return results in descending value order.
DISTINCT: Return distinct (non-duplicate) values.

    SELECT DISTINCT Country
    FROM Demographics 
EXISTS: Returns a Boolean value based on a subquery. Returns TRUE if at least one row is returned from the subquery.

The following example returns any plasma samples which have been assayed with a score greater than 80%. Assume that ImmuneScores.Data.SpecimenId is a lookup field (aka foreign key) to Plasma.Name. 

    SELECT Plasma.Name
    FROM Plasma
    WHERE EXISTS
        (SELECT *
        FROM assay.General.ImmuneScores.Data
        WHERE SpecimenId = Plasma.Name
        AND ScorePercent > .8)
FALSE: Boolean constant (see the Constants table below).
FROM: The FROM clause in LabKey SQL must contain at least one table. It can also contain JOINs to other tables. Commas are supported in the FROM clause:

    FROM TableA, TableB
    WHERE TableA.x = TableB.x

Nested joins are supported in the FROM clause:

    FROM TableA LEFT JOIN (TableB INNER JOIN TableC ON ...) ON...

To refer to tables in LabKey folders other than the current folder, see Cross-Folder Queries.
GROUP BY: Used with aggregate functions to group the results. Defines the "for each" or "per". The example below returns the number of records "for each" participant:

    SELECT ParticipantId, COUNT(Created) "Number of Records"
    FROM "Physical Exam"
    GROUP BY ParticipantId

HAVING: Used with aggregate functions to limit the results. The following example returns participants with more than 10 records in the Physical Exam table:

    SELECT ParticipantId, COUNT(Created) "Number of Records"
    FROM "Physical Exam"
    GROUP BY ParticipantId
    HAVING COUNT(Created) > 10

HAVING can be used without a GROUP BY clause, in which case all selected rows are treated as a single group for aggregation purposes.
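A minimal sketch of that case, treating the whole table as a single group (the threshold is arbitrary):

    SELECT COUNT(Created) "Number of Records"
    FROM "Physical Exam"
    HAVING COUNT(Created) > 10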
JOIN, RIGHT JOIN, LEFT JOIN, FULL JOIN, CROSS JOIN: Example:

    SELECT *
    FROM "Physical Exam"
    FULL JOIN "Lab Results"
    ON "Physical Exam".ParticipantId = "Lab Results".ParticipantId 
LIMIT: Limits the number of records returned by the query. The following example returns the 10 most recent records:

    SELECT *
    FROM "Physical Exam"
    ORDER BY Created DESC LIMIT 10

NULLIF(A,B): Returns NULL if A=B, otherwise returns A.
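A common use is guarding against division by zero, since dividing by NULL yields NULL rather than an error (the column and table names below are hypothetical):

    SELECT Total / NULLIF(NumItems, 0) AS PerItem
    FROM SomeSchema.SomeTable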
ORDER BY: One option for sorting query results; it may produce unexpected results when data regions or views also have sorting applied. If you can instead apply the sort in a custom view or via the API, those methods are preferred (see below). Use ORDER BY with LIMIT to improve performance:

    SELECT ParticipantID,
    Height_cm AS Height
    FROM "Physical Exam"
    ORDER BY Height DESC LIMIT 5

For best ORDER BY results, be sure to a) SELECT the columns on which you are sorting, and b) if sorting on an expression, include the expression in the SELECT and sort by the alias of the expression. For example:

    SELECT A, B, A+B AS C @hidden ... ORDER BY C

is preferable to:

    SELECT A, B ... ORDER BY A+B

Troubleshooting: "Why is the ORDER BY clause not working as expected?"

When authoring queries in LabKey SQL, the query is typically processed as a subquery within a parent query. This parent query may apply its own sorting, overriding the ORDER BY clause in the subquery. This parent "view layer" provides default behavior like pagination, lookups, etc., but may also inadvertently apply an unexpected additional sort.

Two recommended solutions for predictable sorting:
(1) Define the sort in the parent query using the grid view customizer. This may involve adding a new named view of that query to use as your parent query.
(2) Use the "sort" property in the selectRows API call.

PARAMETERS: Queries can declare parameters using the PARAMETERS keyword. Default values and data types are supported as shown below:

    PARAMETERS (X INTEGER DEFAULT 37)
    SELECT *
    FROM "Physical Exam"
    WHERE Temp_C = X

Parameter names will override any unqualified table column with the same name.  Use a table qualification to disambiguate.  In the example below, R.X refers to the column while X refers to the parameter:

    PARAMETERS(X INTEGER DEFAULT 5)
    SELECT *
    FROM Table R
    WHERE R.X = X

Supported data types for parameters are: BIGINT, BIT, CHAR, DECIMAL, DOUBLE, FLOAT, INTEGER, LONGVARCHAR, NUMERIC, REAL, SMALLINT, TIMESTAMP, TINYINT, VARCHAR

Parameter values can be passed via JavaScript API calls to the query. For details see Parameterized SQL Queries.

PIVOT BY: Re-visualize a table by rotating or "pivoting" a portion of it, essentially promoting cell data to column headers. See Pivot Queries for examples. Note that pivot queries cannot be used with compliance logging. For details, see Compliance: Logging.
SELECT: SELECT queries are the only type of query that can currently be written in LabKey SQL. Sub-selects are allowed both as an expression and in the FROM clause.

Aliases are automatically generated for expressions after SELECT.  In the query below, an output column named "Expression1" is automatically generated for the expression "LCASE(FirstName)":

    SELECT LCASE(FirstName) FROM...

TRUE: Boolean constant (see the Constants table below).
UNION, UNION ALL: The UNION clause is the same as standard SQL. LabKey SQL supports UNION in subqueries.
VALUES ... AS: A subset of VALUES syntax is supported. Generate a "constant table" by providing a parenthesized list of expressions for each row in the table. The lists must all have the same number of elements and corresponding entries must have compatible data types. For example:

VALUES (1, 'one'), (2, 'two'), (3, 'three') AS t;

You must provide the alias for the result ("AS t" in the above); aliasing column names is not supported. The column names will be 'column1', 'column2', etc.
WHERE: Filter the results for certain values. Example:

    SELECT *
    FROM "Physical Exam"
    WHERE YEAR(Date) = 2010 
WITH: Define a "common table expression" which functions like a subquery or inline view table. Especially useful for recursive queries.

Usage Notes: If there are UNION clauses that do not reference the common table expression (CTE) itself, the server interprets them as normal UNIONs. The first subclause of a UNION may not reference the CTE. The CTE may only be referenced once in a FROM clause or JOIN clauses within the UNION. There may be multiple CTEs defined in the WITH. Each may reference the previous CTEs in the WITH. No column specifications are allowed in the WITH (as some SQL versions allow).

Exception Behavior: Testing indicates that PostgreSQL does not provide an exception to LabKey Server for a non-ending, recursive CTE query. This can cause LabKey Server to wait indefinitely for the query to complete. MS SQL Server does provide an exception to the server, which allows LabKey Server to end the query attempt.

A non-recursive example:

WITH AllDemo AS
  (
     SELECT *
     FROM "/Studies/Study A/".study.Demographics
     UNION
     SELECT *
     FROM "/Studies/Study B/".study.Demographics
  )
SELECT ParticipantId from AllDemo

A recursive example: In a table that holds parent/child information, this query returns all of the children and grandchildren (recursively down the generations), for a given "Source" parent.

PARAMETERS
(
    Source VARCHAR DEFAULT NULL
)

WITH Derivations AS 
( 
   -- Anchor Query. User enters a 'Source' parent 
   SELECT Item, Parent
   FROM Items 
   WHERE Parent = Source
   UNION ALL 

   -- Recursive Query. Get the children, grandchildren, ... of the source parent
   SELECT i.Item, i.Parent 
   FROM Items i INNER JOIN Derivations p  
   ON i.Parent = p.Item 
) 
SELECT * FROM Derivations

 

Constants

The following constant values can be used in LabKey SQL queries.

CAST('Infinity' AS DOUBLE): Represents positive infinity.
CAST('-Infinity' AS DOUBLE): Represents negative infinity.
CAST('NaN' AS DOUBLE): Represents "Not a number".
TRUE: Boolean value.
FALSE: Boolean value.


Operators

String Operators: Note that strings are delimited with single quotes. Double quotes are used for column and table names containing spaces.

||: String concatenation. For example:

    SELECT ParticipantId,
    City || ', ' || State AS CityOfOrigin
    FROM Demographics

If any argument is null, the || operator will return a null string. To handle this, use COALESCE with an empty string as its second argument, so that the other || arguments will be returned:
    City || ', ' || COALESCE (State, '')

LIKE: Pattern matching. The entire string must match the given pattern. Ex: LIKE 'W%'.
NOT LIKE: Negative pattern matching. Will return values that do not match a given pattern. Ex: NOT LIKE 'W%'

Arithmetic Operators
+: Add
-: Subtract
*: Multiply
/: Divide

Comparison Operators
=: Equals
!=: Does not equal
<>: Does not equal
>: Is greater than
<: Is less than
>=: Is greater than or equal to
<=: Is less than or equal to
IS NULL: Is NULL
IS NOT NULL: Is NOT NULL
BETWEEN: Between two values. Values can be numbers, strings or dates.
IN: Example: WHERE City IN ('Seattle', 'Portland')
NOT IN: Example: WHERE City NOT IN ('Seattle', 'Portland')

Bitwise Operators
&: Bitwise AND
|: Bitwise OR
^: Bitwise exclusive OR

Logical Operators
AND: Logical AND
OR: Logical OR
NOT: Example: WHERE NOT Country='USA'

Operator Order of Precedence

Order of Precedence: Operators
 1: - (unary), + (unary), CASE
 2: *, / (multiplication, division)
 3: +, - (binary plus, binary minus)
 4: & (bitwise and)
 5: ^ (bitwise xor)
 6: | (bitwise or)
 7: || (concatenation)
 8: <, >, <=, >=, IN, NOT IN, BETWEEN, NOT BETWEEN, LIKE, NOT LIKE
 9: =, IS, IS NOT, <>, !=
10: NOT
11: AND
12: OR
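A short sketch of how this precedence plays out in a WHERE clause (the table T and columns A and B are hypothetical):

    SELECT *
    FROM T
    WHERE NOT A = 1 AND B = 2
    -- '=' binds tighter than NOT, which binds tighter than AND, so this
    -- parses as: (NOT (A = 1)) AND (B = 2)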

Aggregate Functions - General

COUNT: The special syntax COUNT(*) is supported as of LabKey v9.2.
MIN: Minimum
MAX: Maximum
AVG: Average
SUM: Sum
GROUP_CONCAT: An aggregate function, much like MAX, MIN, AVG, COUNT, etc. It can be used wherever the standard aggregate functions can be used, and is subject to the same grouping rules. It will return a string value which is a comma-separated list of all of the values for that grouping. A custom separator, instead of the default comma, can be specified. Learn more here.
The example below specifies a semi-colon as the separator:

    SELECT Participant, GROUP_CONCAT(DISTINCT Category, ';') AS CATEGORIES FROM SomeSchema.SomeTable

To use a line-break as the separator, use the following:

    SELECT Participant, GROUP_CONCAT(DISTINCT Category, chr(10)) AS CATEGORIES FROM SomeSchema.SomeTable  
stddev(expression)Standard deviation
stddev_pop(expression)Population standard deviation of the input values.
variance(expression)Historical alias for var_samp.
var_pop(expression)Population variance of the input values (square of the population standard deviation).
median(expression)The 50th percentile of the values submitted.
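
For example, aggregates can be combined in a grouped query; the PhysicalExam column names here are illustrative:

SELECT ParticipantId,
AVG(Temp_C) AS AvgTemp,
STDDEV(Temp_C) AS TempStdDev
FROM PhysicalExam
GROUP BY ParticipantId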

Aggregate Functions - PostgreSQL Only

bool_and(expression): Aggregates boolean values. Returns true if all values are true and false if any are false.
bool_or(expression): Aggregates boolean values. Returns true if any values are true and false if all are false.
bit_and(expression): Returns the bitwise AND of all non-null input values, or null if none.
bit_or(expression): Returns the bitwise OR of all non-null input values, or null if none.
every(expression): Equivalent to bool_and(). Returns true if all values are true and false if any are false.
corr(Y,X): Correlation coefficient.
covar_pop(Y,X): Population covariance.
covar_samp(Y,X): Sample covariance.
regr_avgx(Y,X): Average of the independent variable: (SUM(X)/N).
regr_avgy(Y,X): Average of the dependent variable: (SUM(Y)/N).
regr_count(Y,X): Number of non-null input rows.
regr_intercept(Y,X): Y-intercept of the least-squares-fit linear equation determined by the (X,Y) pairs.
regr_r2(Y,X): Square of the correlation coefficient.
regr_slope(Y,X): Slope of the least-squares-fit linear equation determined by the (X,Y) pairs.
regr_sxx(Y,X): Sum of squares of the independent variable.
regr_sxy(Y,X): Sum of products of independent times dependent variable.
regr_syy(Y,X): Sum of squares of the dependent variable.
stddev_samp(expression): Sample standard deviation of the input values.
var_samp(expression): Sample variance of the input values (square of the sample standard deviation).
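
A brief sketch of bool_and in use, assuming a hypothetical VisitTracker list with a boolean Completed column:

SELECT ParticipantId,
bool_and(Completed) AS AllVisitsComplete
FROM VisitTracker
GROUP BY ParticipantId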

SQL Functions - General

Many of these functions are similar to standard SQL functions -- see the JDBC escape syntax documentation for additional information.

abs(value): Returns the absolute value.
acos(value): Returns the arc cosine.
age(date1, date2): Supplies the difference in age between the two dates, calculated in years.
age(date1, date2, interval): The interval indicates the unit of age measurement, either SQL_TSI_MONTH or SQL_TSI_YEAR.
age_in_months(date1, date2): Behavior is undefined if date2 is before date1.
age_in_years(date1, date2): Behavior is undefined if date2 is before date1.
asin(value): Returns the arc sine.
atan(value): Returns the arc tangent.
atan2(value1, value2): Returns the arctangent of the quotient of the two values.
case: The LabKey SQL parser sometimes requires the use of additional parentheses within the statement.

    CASE (value) WHEN (test1) THEN (result1) ELSE (result2) END
    CASE WHEN (test1) THEN (result1) ELSE (result2) END

Example:

    SELECT "StudentName",
    School,
    CASE WHEN (Division = 'Grades 3-5') THEN (Scores.Score*1.13) ELSE Score END AS AdjustedScore,
    Division
    FROM Scores

ceiling(value): Rounds the value up.
coalesce(value1,...,valueN): Returns the first non-null value in the argument list. Use to set default values for display.
concat(value1,value2): Concatenates two values.
contextPath(): Returns the context path starting with "/" (e.g. "/labkey"). Returns the empty string if there is no current context path. (Returns VARCHAR.)
cos(radians): Returns the cosine.
cot(radians): Returns the cotangent.
curdate(): Returns the current date.
curtime(): Returns the current time.
dayofmonth(date): Returns the day of the month (1-31) for a given date.
dayofweek(date): Returns the day of the week (1-7) for a given date. (Sun=1 and Sat=7)
dayofyear(date): Returns the day of the year (1-365) for a given date.
degrees(radians): Returns degrees based on the given radians.
exp(n): Returns Euler's number e raised to the nth power (e = 2.71828183).
floor(value): Rounds down to the nearest integer.
folderName(): LabKey SQL extension function. Returns the name of the current folder, without beginning or trailing "/". (Returns VARCHAR.)
folderPath(): LabKey SQL extension function. Returns the current folder path (starts with "/", but does not end with "/"). The root returns "/". (Returns VARCHAR.)
greatest(a, b, c, ...): Returns the greatest value from the list of expressions provided. Any number of expressions may be used. The expressions must have the same data type, which will also be the type of the result. The LEAST() function is similar, but returns the smallest value from the list of expressions. GREATEST() and LEAST() are not implemented for SAS databases.

When NULL values appear in the list of expressions, database implementations behave as follows:

  • PostgreSQL and MS SQL Server ignore NULL values in the arguments, only returning NULL if all arguments are NULL.
  • Oracle and MySQL return NULL if any one of the arguments is NULL. Best practice: wrap any potentially nullable arguments in coalesce() or ifnull() and determine at the time of usage whether NULL should be treated as high or low.

Example:

SELECT greatest(score_1, score_2, score_3) As HIGH_SCORE
FROM MyAssay 

hour(time): Returns the hour for a given date/time.
ifdefined(column_name): IFDEFINED(NAME) allows queries to reference columns that may not be present on a table. Without using IFDEFINED(), LabKey will raise a SQL parse error if the column cannot be resolved. Using IFDEFINED(), a column that cannot be resolved is treated as a NULL value. The IFDEFINED() syntax is useful for writing queries over PIVOT queries or assay tables where columns may be added or removed by an administrator.
ifnull(testValue, defaultValue): If testValue is null, returns the defaultValue. Example: IFNULL(Units,0)
isequal(a, b): LabKey SQL extension function. ISEQUAL(a,b) is equivalent to (a=b OR (a IS NULL AND b IS NULL)).
ismemberof(groupid): LabKey SQL extension function. Returns true if the current user is a member of the specified group.
javaConstant(fieldName): LabKey SQL extension function. Provides access to public static final variable values. For details see LabKey SQL Utility Functions.
lcase(string): Converts all characters of a string to lower case.
least(a, b, c, ...): Returns the smallest value from the list of expressions provided. For more details, see greatest() above.
left(string, integer): Returns the left side of the string, to the given number of characters. Example: SELECT LEFT('STRINGVALUE',3) returns 'STR'.
length(string): Returns the length of the given string.
locate(substring, string), locate(substring, string, startIndex): Returns the location of the first occurrence of substring within string. startIndex provides a starting position to begin the search.
log(n): Returns the natural logarithm of n.
log10(n): Returns the base-10 logarithm of n.
lower(string): Converts all characters of a string to lower case.
ltrim(string): Trims white space characters from the left side of the string. For example: LTRIM('     Trim String')
minute(time): Returns the minute value for the given time.
mod(dividend, divider): Returns the remainder of the division of dividend by divider.
moduleProperty(module name, property name): LabKey SQL extension function. Returns a module property, based on the module and property names. For details see LabKey SQL Utility Functions.
month(date): Returns the month value (1-12) of the given date.
monthname(date): Returns the month name of the given date.
now(): Returns the system date and time.
overlaps: LabKey SQL extension function. Supported only when PostgreSQL is installed as the primary database.

    SELECT OVERLAPS (START1, END1, START2, END2) AS COLUMN1 FROM MYTABLE

The LabKey SQL syntax above is translated into the following PostgreSQL syntax:

    SELECT (START1, END1) OVERLAPS (START2, END2) AS COLUMN1 FROM MYTABLE

pi(): Returns the value of π.
power(base, exponent): Returns the base raised to the power of the exponent. For example, power(10,2) returns 100.
quarter(date): Returns the yearly quarter for the given date, where 1st quarter = Jan 1-Mar 31, 2nd quarter = Apr 1-Jun 30, 3rd quarter = Jul 1-Sep 30, 4th quarter = Oct 1-Dec 31.
radians(degrees): Returns the radians for the given degrees.
rand(), rand(seed): Returns a random number between 0 and 1.
repeat(string, count): Returns the string repeated the given number of times. SELECT REPEAT('Hello',2) returns 'HelloHello'.
round(value, precision): Rounds the value to the specified number of decimal places. ROUND(43.3432,2) returns 43.34.
rtrim(string): Trims white space characters from the right side of the string. For example: RTRIM('Trim String     ')
second(time): Returns the second value for the given time.
sign(value): Returns the sign, positive or negative, for the given value.
sin(value): Returns the sine for the given value.
startswith(string, prefix): Tests to see if the string starts with the specified prefix. For example, STARTSWITH('12345','2') returns FALSE.
sqrt(value): Returns the square root of the value.
substring(string, start, length): Returns a portion of the string as specified by the start location (1-based) and length (number of characters). For example, substring('SomeString', 1, 2) returns the string 'So'.
tan(value): Returns the tangent of the value.
timestampadd(interval, number_to_add, timestamp): Adds an interval to the given timestamp value. The interval value must be surrounded by quotes. Possible values for interval:

SQL_TSI_FRAC_SECOND
SQL_TSI_SECOND
SQL_TSI_MINUTE
SQL_TSI_HOUR
SQL_TSI_DAY
SQL_TSI_WEEK
SQL_TSI_MONTH
SQL_TSI_QUARTER
SQL_TSI_YEAR

Example: TIMESTAMPADD('SQL_TSI_QUARTER', 1, "Physical Exam".date) AS NextExam

timestampdiff(interval, timestamp1, timestamp2): Finds the difference between two timestamp values at a specified interval. The interval must be surrounded by quotes.

Example: TIMESTAMPDIFF('SQL_TSI_DAY', SpecimenEvent.StorageDate, SpecimenEvent.ShipDate)

Note that PostgreSQL does not support the following intervals:

SQL_TSI_FRAC_SECOND
SQL_TSI_WEEK
SQL_TSI_MONTH
SQL_TSI_QUARTER
SQL_TSI_YEAR

As a workaround, use the 'age' functions defined above.

truncate(numeric value, precision): Truncates the numeric value to the precision specified. This is an arithmetic truncation, not a string truncation.

  TRUNCATE(123.4567,1) returns 123.4
  TRUNCATE(123.4567,2) returns 123.45
  TRUNCATE(123.4567,-1) returns 120.0

May require an explicit CAST to NUMERIC, as LabKey SQL does not check data types for function arguments.

SELECT
PhysicalExam.Temperature,
TRUNCATE(CAST(Temperature AS NUMERIC),1) as truncTemperature
FROM PhysicalExam

ucase(string), upper(string): Converts all characters to upper case.
userid(): LabKey SQL extension function. Returns the userid, an integer, of the logged-in user.
username(): LabKey SQL extension function. Returns the current user's display name. (Returns VARCHAR.)
version(): LabKey SQL extension function. Returns the current schema version of the core module as a numeric with four decimal places. For example: 20.0070
week(date): Returns the week value (1-52) of the given date.
year(date): Returns the year of the given date. Assuming the system date is March 4, 2023, YEAR(NOW()) returns 2023.
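
To illustrate, several of these functions can be combined in a single query; the dataset and column names below are hypothetical:

SELECT ParticipantId,
TIMESTAMPADD('SQL_TSI_YEAR', 1, EnrollmentDate) AS NextAnnualVisit,
IFNULL(Weight_kg, 0) AS WeightOrZero,
ROUND(Height_cm / 100.0, 2) AS HeightInMeters
FROM Demographics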

SQL Functions - PostgreSQL Specific

LabKey SQL supports the following PostgreSQL functions.
See the  PostgreSQL docs for usage details.

ascii(value): Returns the ASCII code of the first character of value.
btrim(value, trimchars): Removes characters in trimchars from the start and end of value. trimchars defaults to white space.

BTRIM(' trim    ') returns 'trim'
BTRIM('abbatrimabtrimabba', 'ab') returns 'trimabtrim'

character_length(value), char_length(value): Returns the number of characters in value.
chr(integer_code): Returns the character with the given integer_code. CHR(70) returns 'F'.
concat_ws(sep text, val1 "any" [, val2 "any" [, ...]]): Concatenates all but the first argument, with separators. The first argument is used as the separator string, and should not be NULL. Other NULL arguments are ignored. See the PostgreSQL docs.

concat_ws(',', 'abcde', 2, NULL, 22) → abcde,2,22

decode(text, format): See the PostgreSQL docs.
encode(binary, format): See the PostgreSQL docs.
is_distinct_from(a, b), is_not_distinct_from(a, b): Not equal (or equal), treating null like an ordinary value.
initcap(string): Converts the first character of each separate word in string to uppercase and the rest to lowercase.
lpad(string, int, fillchars): Pads string to length int by prepending characters fillchars.
md5(text): Returns the hex MD5 value of text.
octet_length(string): Returns the number of bytes in string.
overlaps: See above for syntax details.
quote_ident(string): Returns string quoted for use as an identifier in an SQL statement.
quote_literal(string): Returns string quoted for use as a string literal in an SQL statement.
regexp_replace: See the PostgreSQL docs for details.
repeat(string, int): Repeats string the specified number of times.
replace(string, matchString, replaceString): Searches string for matchString and replaces occurrences with replaceString.
rpad(string, int, fillchars): Pads string to length int by appending characters fillchars.
similar_to(A, B, C): String pattern matching using SQL regular expressions: 'A' similar to 'B' escape 'C'. See the PostgreSQL docs.
split_part(string, delimiter, int): Splits string on delimiter and returns fragment number int (starting the count from 1).

SPLIT_PART('mississippi', 'i', 4) returns 'pp'

string_to_array: See Array Functions in the PostgreSQL docs.
strpos(string, substring): Returns the position of substring in string. (Count starts from 1.)
substr(string, fromPosition, charCount): Extracts the number of characters specified by charCount from string, starting at position fromPosition.

SUBSTR('char_sequence', 5, 2) returns '_s'

to_ascii(string, encoding): Converts string to ASCII from another encoding.
to_hex(int): Converts int to its hex representation.
translate(text, fromText, toText): Characters in text matching a character in the fromText set are replaced by the corresponding character in toText.
to_char: See Data Type Formatting Functions in the PostgreSQL docs.
to_date(textdate, format): See Data Type Formatting Functions in the PostgreSQL docs.
to_timestamp: See Data Type Formatting Functions in the PostgreSQL docs.
to_number: See Data Type Formatting Functions in the PostgreSQL docs.
unnest: See Array Functions in the PostgreSQL docs.
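
For example, split_part can parse a delimited identifier; the Samples query and Barcode column here are hypothetical:

SELECT Barcode,
SPLIT_PART(Barcode, '-', 2) AS BarcodeYear
FROM Samples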

JSON and JSONB Operators and Functions

LabKey SQL supports the following PostgreSQL JSON and JSONB operators and functions. Note that LabKey SQL does not natively understand arrays and some other features, but it may still be possible to use the functions that expect them.
See the PostgreSQL docs for usage details.

->, ->>, #>, #>>, @>, <@, ?, ?|, ?&, ||, -, #-: LabKey SQL supports these operators via a pass-through function, json_op. The function's first argument is the operator's first operand. The second argument is the operator, passed as a string constant. The function's third argument is the second operand. For example, this PostgreSQL expression:

a_jsonb_column -> 2

can be represented in LabKey SQL as:

json_op(a_jsonb_column, '->', 2)

parse_json, parse_jsonb: Casts a text value to a parsed JSON or JSONB data type. For example,

'{"a":1, "b":null}'::jsonb

or

CAST('{"a":1, "b":null}' AS JSONB)

can be represented in LabKey SQL as:

parse_jsonb('{"a":1, "b":null}')

to_json, to_jsonb: Converts a value to the JSON or JSONB data type. Treats a text value as a single JSON string value.
array_to_json: Converts an array value to the JSON data type.
row_to_json: Converts a scalar (simple value) row to JSON. Note that LabKey SQL does not support the version of this function that will convert an entire table to JSON. Consider using to_jsonb() instead.
json_build_array, jsonb_build_array: Build a JSON array from the arguments.
json_build_object, jsonb_build_object: Build a JSON object from the arguments.
json_object, jsonb_object: Build a JSON object from a text array.
json_array_length, jsonb_array_length: Return the length of the outermost JSON array.
json_each, jsonb_each: Expand the outermost JSON object into key/value pairs. Note that LabKey SQL does not support the table version of this function. Usage as a scalar function like this is supported:

SELECT json_each('{"a":"foo", "b":"bar"}') AS Value

json_each_text, jsonb_each_text: Expand the outermost JSON object into key/value pairs as text. Note that LabKey SQL does not support the table version of this function. Usage as a scalar function (similar to json_each) is supported.
json_extract_path, jsonb_extract_path: Return the JSON value referenced by the path.
json_extract_path_text, jsonb_extract_path_text: Return the JSON value referenced by the path as text.
json_object_keys, jsonb_object_keys: Return the keys of the outermost JSON object.
json_array_elements, jsonb_array_elements: Expand a JSON array into a set of values.
json_array_elements_text, jsonb_array_elements_text: Expand a JSON array into a set of text values.
json_typeof, jsonb_typeof: Return the type of the outermost JSON value.
json_strip_nulls, jsonb_strip_nulls: Remove all null values from a JSON object.
jsonb_insert: Insert a value within a JSON object at a given path.
jsonb_pretty: Format a JSON object as indented text.
jsonb_set: Set the value within a JSON object for a given path. Strict, i.e. returns NULL on NULL input.
jsonb_set_lax: Set the value within a JSON object for a given path. Not strict; expects the third argument to specify how to treat NULL input (one of 'raise_exception', 'use_json_null', 'delete_key', or 'return_target').
jsonb_path_exists, jsonb_path_exists_tz: Checks whether the JSON path returns any item for the specified JSON value. The "_tz" variant is timezone aware.
jsonb_path_match, jsonb_path_match_tz: Returns the result of a JSON path predicate check for the specified JSON value. The "_tz" variant is timezone aware.
jsonb_path_query, jsonb_path_query_tz: Returns all JSON items returned by the JSON path for the specified JSON value. The "_tz" variant is timezone aware.
jsonb_path_query_array, jsonb_path_query_array_tz: Returns, as an array, all JSON items returned by the JSON path for the specified JSON value. The "_tz" variant is timezone aware.
jsonb_path_query_first, jsonb_path_query_first_tz: Returns the first JSON item returned by the JSON path for the specified JSON value. The "_tz" variant is timezone aware.
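
A minimal sketch, assuming a hypothetical Reagents query with a JSONB column named Metadata; the '->>' operator extracts a property as text:

SELECT Name,
json_op(Metadata, '->>', 'status') AS Status
FROM Reagents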

SQL Functions - MS SQL Server Specific

LabKey SQL supports the following SQL Server functions.
See the SQL Server docs for usage details.

ascii(value): Returns the ASCII code of the first character of value.
char(int), chr(int): Returns the character for the specified ASCII code int.
charindex(searchString, string, index): Returns the position of searchString in string, starting the search at index.
concat_ws(sep text, val1 "any" [, val2 "any" [, ...]]): Concatenates all but the first argument, with separators. The first argument is used as the separator string, and should not be NULL. Other NULL arguments are ignored.

concat_ws(',', 'abcde', 2, NULL, 22) → abcde,2,22

difference(string, string): Returns the difference between the soundex values of two expressions as an integer. See the MS SQL docs.
isnumeric(expression): Determines whether an expression is a valid numeric type. See the MS SQL docs.
len(string): Returns the number of characters in string. Trailing white space is excluded.
patindex(pattern, string): Returns the position of the first occurrence of pattern in string. See the MS SQL docs.
quotename: See the MS SQL docs.
replace(string, pattern, replacement): Replaces all occurrences of pattern with replacement in the string provided. See the MS SQL docs.
replicate(string, int): Replicates string the specified number of times.
reverse(string): Returns the string in reverse character sequence.
right(string, index): Returns the right part of string to the specified index.
soundex: See the MS SQL docs.
space(int): Returns a string of white space characters.
str(float, length, decimal): See the MS SQL docs.
stuff(string, start, length, replaceWith): Inserts replaceWith into string, deleting the specified length of characters at the start position. See the MS SQL docs.
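
For example, on a server whose database is SQL Server, a hypothetical Samples query could use these functions like so:

SELECT SampleId,
LEN(Comment) AS CommentLength,
RIGHT(Barcode, 4) AS BarcodeSuffix
FROM Samples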

General Syntax

Case Sensitivity: Schema names, table names, column names, SQL keywords, and function names are case-insensitive in LabKey SQL.

Comments: Comments that use the standard SQL syntax can be included in queries. '--' starts a line comment. Also, '/* */' can surround a comment block:

-- line comment 1
-- line comment 2
/* block comment 1
    block comment 2 */
SELECT ...

Identifiers: Identifiers in LabKey SQL may be quoted using double quotes. (Double quotes within an identifier are escaped with a second double quote.)

SELECT "Physical Exam".*
...

Lookups: Lookup columns reference data in other tables. In SQL terms, they are foreign key columns. See Lookups for details on creating lookup columns. Lookups use a convenient syntax of the form "Table.ForeignKey.FieldFromForeignTable" to achieve what would normally require a JOIN in SQL. Example:

Issues.AssignedTo.DisplayName

String Literals: String literals are quoted with single quotes ('). Within a single-quoted string, a single quote is escaped with another single quote.

   SELECT * FROM TableName WHERE FieldName = 'Jim''s Item'

Date/Time Literals: Date and Timestamp (Date&Time) literals can be specified using the JDBC escape syntax:

{ts '2001-02-03 04:05:06'}

{d '2001-02-03'}

Related Topics




Lookups: SQL Syntax


Lookups are an intuitive table linking syntax provided to simplify data integration and SQL queries. They represent foreign key relationships between tables, and once established, can be used to "expose" columns from the "target" of the lookup in the source table or query.

All lookups are secure: before execution, all references in a query are checked against the user's security role/permissions, including lookup target tables.

This topic describes using lookups from SQL. Learn about defining lookups via the user interface in this topic: Lookup Columns.

Lookup SQL Syntax

Lookups have the general form:

Table.ForeignKey.FieldFromForeignTable

Examples

A lookup effectively creates a join between the two tables. For instance, if "tableA" has a lookup to "tableB", and "tableB" has a column called "name", you can use the following syntax. The "." in "tableB.name" performs a left outer join for you.

SELECT *
FROM tableA
WHERE tableB.name = 'A01'

Datasets Lookup

The following query uses the "Datasets" special column that is present in every study dataset table to look up values in the Demographics table (another dataset in the same study), joining them to the PhysicalExam table.

SELECT PhysicalExam.ParticipantId,
PhysicalExam.date,
PhysicalExam.Height_cm,
Datasets.Demographics.Gender AS GenderLookup
FROM PhysicalExam

It replaces the following JOIN statement.

SELECT PhysicalExam.ParticipantId,
PhysicalExam.date,
PhysicalExam.Height_cm,
Demographics.Gender AS GenderLookup
FROM PhysicalExam
INNER JOIN Demographics ON PhysicalExam.ParticipantId = Demographics.ParticipantId

Lookup From a List

The following expressions show the Demographics table looking up values in the Languages table.

SELECT Demographics.ParticipantId, 
Demographics.StartDate,
Demographics.Language.LanguageName,
Demographics.Language.TranslatorName,
Demographics.Language.TranslatorPhone
FROM Demographics

It replaces the following JOIN statement.

SELECT Demographics.ParticipantId, 
Demographics.StartDate,
Languages.LanguageName,
Languages.TranslatorName,
Languages.TranslatorPhone
FROM Demographics LEFT OUTER JOIN lists.Languages
ON Demographics.Language = Languages.LanguageId;

Lookup User Last Name

The following lookup expression shows the Issues table looking up data in the Users table, retrieving the Last Name.

Issues.UserID.LastName

Discover Lookup Column Names

To discover lookup relationships between tables:

  • Go to (Admin) > Developer Links > Schema Browser.
  • Select a schema and table of interest.
  • Browse lookup fields by clicking the icon next to a column name which has a lookup table listed.
  • For example, the column study.Demographics.Language might look up the lists.Languages table, joining on the column LanguageId.
  • Available columns in the Languages table are listed. To reference these columns in a SQL query, use the lookup syntax Demographics.Language."col_in_lookup_table", e.g. Demographics.Language.TranslatorName, Demographics.Language.TranslatorPhone, etc.

Note that the schema browser shows these columns using the slash-delimited syntax, which is also used in the selectRows API.
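
For example, a minimal sketch using that slash-delimited syntax with the JavaScript API (the schema, query, and lookup column names follow the Demographics example above):

LABKEY.Query.selectRows({
    schemaName: 'study',
    queryName: 'Demographics',
    columns: 'ParticipantId,Language/TranslatorName,Language/TranslatorPhone',
    success: function (data) {
        // Each row includes the looked-up translator columns
        console.log(data.rows);
    }
});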

Adding Lookups to Table/List Definitions

Before lookup columns can be used, they need to be added to the definition of a dataset/list/query. The process of setting up lookup relationships in the field editor is described here: Lookup Columns.

Creating a Lookup to a Non-Primary-Key Field

When you select a schema, all tables with a primary key of the type matching the field you are defining are listed by default. If you want to create a lookup to a column that is not the primary key, you must first be certain that it is unique (i.e. could be a key), then take an additional step to mark the desired lookup field as a key.

You also use the steps below to expose a custom SQL query as the target of a lookup.

  • First create a simple query to wrap your list:
    SELECT * FROM lists.MyList
  • You could also add additional filtering, joins, or group by clauses as needed.
  • Next, annotate the query with XML metadata to mark the desired target column as "isKeyField". On the XML Metadata tab of the query editor, enter:
    <tables xmlns="http://labkey.org/data/xml">
    <table tableName="MyList" tableDbType="NOT_IN_DB">
    <columns>
    <column columnName="TargetColumn">
    <isKeyField>true</isKeyField>
    </column>
    </columns>
    </table>
    </tables>
  • Save your query, and when you create lookups from other tables, your wrapped list will appear as a valid target.

Related Topics




LabKey SQL Utility Functions


LabKey SQL provides the function extensions covered in this topic to help Java module developers and others writing custom LabKey queries access various properties.

moduleProperty(MODULE_NAME, PROPERTY_NAME)

Returns a module property, based on the module and property names. Arguments are strings, so use single quotes, not double quotes.

moduleProperty values can be used in place of a string literal in a query, or to supply a container path for Queries Across Folders.

Examples

moduleProperty('EHR','EHRStudyContainer')

You can use the virtual "Site" schema to specify a full container path, such as '/home/someSubfolder' or '/Shared':

SELECT *
FROM Site.{substitutePath moduleProperty('EHR','EHRStudyContainer')}.study.myQuery

The virtual "Project." schema similarly specifies the name of the current project.

javaConstant(FULLY_QUALIFIED_CLASS_AND_FIELD_NAME)

Provides access to public static final variable values. The argument value should be a string.

Fields must either be on classes in the java.lang package, or be tagged with the org.labkey.api.query.Queryable annotation to indicate they allow access through this mechanism. All values will be returned as if they are of type VARCHAR, and will be coerced as needed. Other field types are not supported.

Examples

javaConstant('java.lang.Integer.MAX_VALUE')
javaConstant('org.labkey.mymodule.MyConstants.MYFIELD')

To allow access to MYFIELD, tag the field with the annotation @Queryable:

public class MyConstants
{
    @Queryable
    public static final String MYFIELD = "some value";
}

Related Topics




Query Metadata


Tables and queries can have associated XML files that carry additional metadata information about the columns in the query. Example uses of query metadata include:
  • Setting column titles and data display formatting
  • Adding URLs to data in cells, or lookups to other tables
  • Adding custom buttons that navigate to other pages or call JavaScript methods, or disabling the standard buttons
  • Color coding for values that fall within a numeric range
  • Creating an alias field: a duplicate of an existing field to which you can apply independent metadata properties

Topics

You can edit or add to this metadata using the options described in the sections below.

Another way to customize query metadata is using an XML file in a module. Learn more in this topic: Modules: Query Metadata

Edit Metadata Using the User Interface

The metadata editor offers a subset of the features available in the field editor and works in the same way, offering fields with properties, configuration options, and advanced settings.

  • Open the schema browser via (Admin) > Go To Module > Query.
  • Select an individual schema and query/table in the browser and click Edit Metadata.
  • Use the field editor interface to adjust fields and their properties.
    • Note that there are some settings that cannot be edited via metadata override.
    • For instance, you cannot change the PHI level of a field in this manner, so that option is inactive in the field editor.
    • You also cannot delete fields here, so you will not see a delete icon.
  • Expand a field by clicking the expansion icon next to its name.
  • To change a column's displayed title, edit the Label property.
  • For example, you might change the displayed text for a column to the longer "Average Temperature".
  • You could directly Edit Source or View Data from this interface.
  • You can also click Reset to Default to discard any changes.
  • If you were viewing a built-in table or query, you would also see an Alias Field button -- this lets you "wrap" a field and display it with a different "alias" field name. This feature is only available for built-in queries.
  • Click Save when finished.

Edit Metadata XML Source

The other way to specify and edit query metadata is directly in the source editor. When you set field properties and other options in the UI, the necessary XML is generated for you, and you may further edit it in the source editor. However, if you want to apply a given setting or format to several fields, it might be most efficient to do so directly in the source editor. Changes made in either place are immediately reflected in the other.

  • Click Edit Source to open the source editor.
  • The Source tab shows the SQL query.
  • Select the XML Metadata tab (if it is not already open).
  • Edit as desired. A few examples are below.
  • Click the Data tab to test your syntax and see the results.
  • Return to the XML Metadata tab and click Save & Finish when finished.

Examples follow in the sections below.

Other metadata elements and attributes are listed in the tableInfo.xsd schema available in the XML Schema Reference.

Note that it is only possible to add/alter references to metadata entities that already exist in your query. For example, you can edit the "columnTitle" (aka the "Title" in the query designer) because this merely changes the string that provides the display name of the field. However, you cannot edit the "columnName" because this entity is the reference to a column in your query. Changing "columnName" breaks that reference.

For additional information about query metadata and the order of operations, see Modules: Query Metadata.

Apply a Conditional Format

Below, a conditional format is applied to the Temp_C column: if the value is over 37, the value displays in red.
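
A sketch of XML producing this effect, assuming the query has a Temp_C column; the <textColor> element holds the font color as a hex RGB value:

<tables xmlns="http://labkey.org/data/xml">
<table tableName="PhysicalExam" tableDbType="NOT_IN_DB">
<columns>
<column columnName="Temp_C">
<conditionalFormats>
<conditionalFormat>
<filters>
<filter operator="gt" value="37"/>
</filters>
<textColor>FF0000</textColor>
</conditionalFormat>
</conditionalFormats>
</column>
</columns>
</table>
</tables>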

  • Click the Data tab to see some values displayed in red, then return to the XML Metadata tab.
  • You could make further modifications by directly editing the metadata here. For example, change the 37 to 39.
  • Click the Data tab to see the result -- fewer red values, if any.
  • Restore the 37 value, then click Save and Finish.
If you were to copy and paste the entire "column" section with a different columnName, you could apply the same formatting to a different column with a different threshold. For example, paste the section changing the columnName to "Weight_kg" and threshold to 80 to show the same conditional red formatting in that data. If you return to the GUI view, and select the format tab for the Weight field, you will now see the same conditional format displayed there.

Hide a Field From Users

Another example: the following XML metadata will hide the "Date" column:

<tables xmlns="http://labkey.org/data/xml"> 
<table tableName="TestDataset" tableDbType="NOT_IN_DB">
<columns>
<column columnName="date">
<isHidden>true</isHidden>
</column>
</columns>
</table>
</tables>

Use a Format String for Display

The stored value in your data field may not be how you wish to display that value to users. For example, you might want to show DateTime values as only the "Date" portion, or numbers with a specified number of decimal places. These options can be set on a folder- or site-wide basis, or using the field editor in the user interface as well as within the query metadata as described here.

Use a <formatString> in your XML to display a column value in a specific format. They are supported for SQL metadata, dataset import/export and list import/export. The formatString is a template that specifies how to format a value from the column on display output (or on export if the corresponding excelFormatString and tsvFormatString values are not set).

Use formats specific to the field type:

  • Date: DateFormat
  • Number: DecimalFormat
  • Boolean: Values can be formatted using a template of the form positive;negative;null, where "positive" is the string to display when true, "negative" is the string to display when false, and "null" is the string to display when null.
For example, the following XML:
<tables xmlns="http://labkey.org/data/xml">
<table tableName="formattingDemo" tableDbType="NOT_IN_DB">
<columns>
<column columnName="DateFieldName">
<formatString>Date</formatString>
</column>
<column columnName="NumberFieldName">
<formatString>####.##</formatString>
</column>
<column columnName="BooleanFieldName">
<formatString>Oui;Non;Nulle</formatString>
</column>
</columns>
</table>
</tables>

...would display the date field as date-only, the number field with at most two decimal places, and the boolean field as "Oui", "Non", or "Nulle".

Use a Text Expression for Display

The stored value in your data field may not be how you wish to present that data to users. For example, you might store a "completeness" value but want to show the user a bit more context about that value. Expressions are constructed similar to sample naming expressions using substitution syntax. You can pull in additional values from the row of data as well and use date and number formatting patterns.

For example, if you wanted to store a weight value in decimal, but display it in sentence form, you might use something like the following XML for that column:

<column columnName="weight">
<textExpression>${ParticipantID} weighed ${weight} Kg in ${Date:date('MMMM')}.</textExpression>
</column>

This example table would look like the first three columns of the following, with the value stored in the weight column being only the numeric portion of what is displayed:

ParticipantID | Date | Weight | (Not shown: the stored value for "Weight")
PT-111 | 2021-08-15 | PT-111 weighed 69.1 Kg in August. | 69.1
PT-107 | 2021-04-25 | PT-107 weighed 63.2 Kg in April. | 63.2
PT-108 | 2020-06-21 | PT-108 weighed 81.9 Kg in June. | 81.9

Set a Display Width for a Column

If you want to specify a value for the display width of one of your columns (regardless of the display length of the data in the field) use syntax like the following. This will make the Description column display 400 pixels wide:

<column columnName="Description">
<displayWidth>400</displayWidth>
</column>

Display Columns in Specific Order in Entry Forms

The default display order of columns in a data entry form is generally based on the underlying order of columns in the database. If you want to present columns in a specific order to users entering data, you can rearrange the user-defined fields using the field editor. If there are built-in fields that 'precede' the user-defined ones, you can use the useColumnOrder table attribute. When set to true, columns listed in the XML definition will be presented first, in the order in which they appear in the XML. Any columns not specified in the XML will be listed last, in the default order.

For example, this XML would customize the entry form for a "Lab Results" dataset to list the ViralLoad and CD4 columns first, followed by the 'standard' study columns like Participant and Timepoint. Additional XML customizations can be applied to the ordered columns, but are not required.

<tables xmlns="http://labkey.org/data/xml">
<table tableName="Lab Results" tableDbType="NOT_IN_DB" useColumnOrder="true">
<columns>
<column columnName="ViralLoad"/>
<column columnName="CD4"/>
</columns>
</table>
</tables>

Use SQL Annotations to Apply Metadata

You can directly specify some column metadata using annotations in your SQL statements, instead of having to separately add metadata via XML. These annotations are case-insensitive and can be used with or without a preceding space, i.e. both "ratio@HIDDEN" and "ratio @hidden" would be valid versions of the first example.

@hidden is the same as specifying XML <isHidden>true</isHidden>

SELECT ratio @hidden, log(ratio) as log_ratio

@nolookup is the same as <displayColumnFactory><className>NOLOOKUP</className></displayColumnFactory>. This will show the actual value stored in this field, instead of displaying the lookup column from the target table.

SELECT modifiedby, modifiedby AS userid @nolookup

@title='New Title': Explicitly set the display title/label of the column

SELECT ModifiedBy @title='Previous Editor'

@preservetitle: Don't compute title based on alias, instead keep the title/label of the selected column. (Doesn't do anything for expressions.)

SELECT ModifiedBy as PreviousEditor @preservetitle

Alias Fields

Alias fields are wrapped duplicates of an existing field in a query, to which you can apply independent metadata properties.

You can use alias fields to display the same data in alternative ways, or alternative data, such as mapped values or id numbers. Used in conjunction with lookup fields, an alias field lets you display different columns from a target table.

To create an alias field:

  • Click Alias Field on the Edit Metadata page.
  • In the popup dialog box, select a field to wrap and click Ok.
  • A wrapped version of the field will be added to the field list.
    • You can make changes to relabel or add properties as needed.
    • Notice that the details column indicates which column is wrapped by this alias.

You can also apply independent XML metadata to this wrapped field, such as lookups and joins to other queries.

Related Topics




Query Metadata: Examples


This topic provides examples of query metadata used to change the display of, and add other behavior to, queries, including joins, lookups, and downloadable example .query.xml files.

Auditing Level

Set the level of detail recorded in the audit log. The example below sets auditing to "DETAILED" on the Physical Exam table.

<tables xmlns="http://labkey.org/data/xml">
<table tableName="Physical Exam" tableDbType="NOT_IN_DB">
<auditLogging>DETAILED</auditLogging>
<columns>
...
</columns>
</table>
</tables>

Conditional Formatting

The following adds a yellow background color to any cells showing a value greater than 72.

<tables xmlns="http://labkey.org/data/xml">
<table tableName="Physical Exam" tableDbType="NOT_IN_DB">
<columns>
<column columnName="Pulse">
<columnTitle>Pulse</columnTitle>
<conditionalFormats>
<conditionalFormat>
<filters>
<filter operator="gt" value="72" />
</filters>
<backgroundColor>FFFF00</backgroundColor>
</conditionalFormat>
</conditionalFormats>
</column>
</columns>
</table>
</tables>

Include Row Selection Checkboxes

To include row selection checkboxes in a query, include this:

<tables xmlns="http://labkey.org/data/xml">
<table tableName="Demographics" tableDbType="NOT_IN_DB">
<columns>
<column columnName="RowId">
<isKeyField>true</isKeyField>
</column>
</columns>
</table>
</tables>

Show/Hide Columns on Insert and/or Update

You can control the visibility of columns when users insert and/or update into your queries using syntax like the following:

<tables xmlns="http://labkey.org/data/xml">
<table tableName="MyTableName" tableDbType="NOT_IN_DB">
<columns>
<column columnName="ReadOnlyColumn">
<shownInInsertView>false</shownInInsertView>
<shownInUpdateView>false</shownInUpdateView>
</column>
</columns>
</table>
</tables>

Be aware that if you don't show a required field to users during insert, they will not actually be able to insert data unless the value is otherwise provided.

Two other XML parameters related to hiding columns and making values editable are:

<isHidden>true</isHidden>
<isUserEditable>false</isUserEditable>

URL to Open in New Tab

A field can use a URL property to open a target URL when the value is clicked. To make this target open in a new tab, use the following syntax to set the target="_blank" attribute. In this example, the link URL has been set using the UI for the "City" column:

<column columnName="City">
<urlTarget>_blank</urlTarget>
</column>

URL Thumbnail Display

Using query metadata provided in a module, you can display images with optional custom thumbnails in a data grid.

For example, if you have a store of PNG images and want to display them in a grid, you can put them in a fixed location and refer to them from the grid by URL. The column type in the grid would be a text string, but the grid would display the image. Further, you could provide separate custom thumbnails instead of the miniature version LabKey would generate by default.

The module code would use a displayColumnFactory similar to this example, where the properties are defined below.

<column columnName="image"> 
<displayColumnFactory>
<className>org.labkey.api.data.URLDisplayColumn$Factory</className>
<properties>
<property name="thumbnailImageWidth">55px</property>
<property name="thumbnailImageUrl">_webdav/my%20studies/%40files/${image}</property>
<property name="popupImageUrl">_webdav/my%20studies/%40files/${popup}</property>
<property name="popupImageWidth">250px</property>
</properties>
</displayColumnFactory>
<url>/labkey/_webdav/my%20studies/%40files/${popup}</url>
</column>

Properties available in a displayColumnFactory:

  • thumbnailImageUrl (optional) - The URL string expression for the image that will be displayed inline in the grid. If the string expression evaluates to null, the column URL will be used instead. Defaults to the column URL.
  • thumbnailImageWidth (optional) - The size of the image that will be displayed in the grid. If the property is present but the value is empty (or "native"), no css scaling will be applied to the image. Defaults to "32px".
  • thumbnailImageHeight (optional) - The image height.
  • popupImageHeight (optional) - The popup image height.
  • popupImageUrl (optional) - The URL string expression for the popup image that is displayed on mouse over. If not provided, the thumbnail image URL will be used if the thumbnail image is not displayed at native resolution; otherwise the column URL will be used.
  • popupImageWidth (optional) - The size of the popup image that will be displayed. If the property is present but the value is empty (or "native"), no css scaling will be applied to the image. Defaults to "300px".
Using the above example, two images would exist in the 'my studies' file store: a small blue car image provided as the thumbnail and a white car image provided as the popup.

Join a Query to a Table

If you have a query providing calculations for a table, you can add a field alias to the table, then use that to join in your query. In this example, the query "BMI Query" is providing calculations. We added an alias of the "lsid" field, named it "BMI" and then added this metadata XML making that field a foreign key (fk) to the "BMI Query".

<tables xmlns="http://labkey.org/data/xml">
<table tableName="Physical Exam" tableDbType="NOT_IN_DB">
<columns>
<column columnName="BMI" wrappedColumnName="lsid">
<fk>
<fkDbSchema>study</fkDbSchema>
<fkTable>BMI Query</fkTable>
<fkColumnName>LSID</fkColumnName>
</fk>
</column>
</columns>
</table>
</tables>

Once Physical Exam is linked to BMI Query via this foreign key, all of the columns in BMI Query are made available in the grid customizer. Click Show Hidden Fields to see your wrapped field (or include <isHidden>false</isHidden>), and expand it to show the fields you want.

Because these fields are calculated from other data in the system, you will not see them when inserting (or editing) your table.


Premium Resource Available

Subscribers to premium editions of LabKey Server can learn more with a more detailed example in this topic:


Learn more about premium editions

Lookup to a Custom SQL Query

If you want to create a lookup to a SQL Query that you have written, first annotate a column in the query as the primary key field, for example:

<tables xmlns="http://labkey.org/data/xml">
<table tableName="MyQuery" tableDbType="NOT_IN_DB">
<columns>
<column columnName="MyIdColumn">
<isKeyField>true</isKeyField>
</column>
</columns>
</table>
</tables>

You can then add a field in a table and create a lookup into that query. This can be used both for controlling user input (the user will see a dropdown if you include this field in the insert/update view) as well as for surfacing related columns from the query.

Lookup Display Column

By default, a lookup will display the first non-primary-key text column in the lookup target. This may result in unexpected results, depending on the structure of your lookup target. To control which display column is used, supply the fkDisplayColumnName directly in the XML metadata for the table where you want it displayed:

<tables xmlns="http://labkey.org/data/xml">
<table tableName="Demographics" tableDbType="NOT_IN_DB">
<columns>
<column columnName="Gender">
<columnTitle>Gender Code</columnTitle>
<fk>
<fkDisplayColumnName>GenderCode</fkDisplayColumnName>
<fkTable>Genders</fkTable>
<fkDbSchema>lists</fkDbSchema>
</fk>
</column>
</columns>
</table>
</tables>

Filtered Lookups

If you want to filter a lookup column during an insert or update operation, you can do so using XML metadata.

For example, consider a list of instruments shared across many departments, each one associated with a specific type of assay (such as elispot) and marked as either in use or not. If your department wants an input form to draw from this shared list, but to avoid an endless menu of irrelevant instruments, you can use a filterGroup to limit insert choices for an "Instrument" column to only those instruments for the 'elispot' assay that are currently in use:

<column columnName="Instrument">
<fk>
<fkColumnName>key</fkColumnName>
<fkTable>sharedInstrumentList</fkTable>
<fkDbSchema>lists</fkDbSchema>
<filters>
<filterGroup>
<filter column="assay" value="elispot" operator="eq"/>
<filter column="inuse" value="true" operator="eq"/>
</filterGroup>
</filters>
</fk>
</column>

In the above example, the same filter is applied to both insert and update. If you wanted to have different behavior for these two operations, you could separate them like this:

<column columnName="Instrument">
<fk>
<fkColumnName>key</fkColumnName>
<fkTable>sharedInstrumentList</fkTable>
<fkDbSchema>lists</fkDbSchema>
<filters>
<filterGroup operation="insert">
<filter column="assay" value="elispot" operator="eq"/>
<filter column="inuse" value="true" operator="eq"/>
</filterGroup>
<filterGroup operation="update">
<filter column="inuse" value="true" operator="eq"/>
</filterGroup>
</filters>
</fk>
</column>

Filter groups like this can be applied to a list, dataset, or any other table that can accept an XML metadata override.

Multi-Value Foreign Keys (Lookups)

LabKey has basic support for multi-value foreign keys. The multi-valued column renders as a comma separated list in the grid and as a JSON array when using the client API. To set up a multi-value foreign key you need a source table, a junction table, and a target value table. The junction table needs to have a foreign key pointing at the source table and another pointing at the target value table.

For example, you can create a Reagent DataClass table as the source, a ReagentSpecies List as the junction table, and a Species List as the lookup target table. Once the tables have been created, edit the metadata XML of the source table to add a new wrapped column over the primary key of the source table and create a lookup that targets the junction table. The wrapped column has a foreign key with these characteristics:

  1. Annotated as multi-valued (<fkMultiValued>junction</fkMultiValued>).
  2. Points to the junction table's column with a foreign key back to the source table (fkColumnName).
  3. Indicates which foreign key of the junction table should be followed to the target value table (fkJunctionLookup).
  4. This example shows the wrappedColumnName "Key", which is the default used for a list. If you are wrapping a column in a DataClass, the wrappedColumnName will be "RowId".
<tables xmlns="http://labkey.org/data/xml">
<table tableName="Book" tableDbType="NOT_IN_DB">
<columns>
<column columnName="Authors" wrappedColumnName="Key">
<isHidden>false</isHidden>
<isUserEditable>true</isUserEditable>
<shownInInsertView>true</shownInInsertView>
<shownInUpdateView>true</shownInUpdateView>
<fk>
<fkDbSchema>lists</fkDbSchema>
<fkTable>BookAuthorJunction</fkTable>
<fkColumnName>BookId</fkColumnName>
<fkMultiValued>junction</fkMultiValued>
<fkJunctionLookup>AuthorId</fkJunctionLookup>
</fk>
</column>
</columns>
</table>
</tables>

When viewing the multi-value foreign key in the grid, the values will be separated by commas. When using the query APIs such as LABKEY.Query.selectRows, set requiredVersion to 16.2 or greater:

LABKEY.Query.selectRows({
schemaName: 'lists',
queryName: 'Book',
requiredVersion: 17.1
});

The response format will indicate the lookup is multi-valued in the metaData section and render the values as a JSON array. The example below has been trimmed for brevity:

{
"schemaName" : [ "lists" ],
"queryName" : "Book",
"metaData" : {
"id" : "Key",
"fields" : [ {
"fieldKey" : [ "Key" ]
}, {
"fieldKey" : [ "Name" ]
}, {
"fieldKey" : [ "Authors" ],
"lookup" : {
"keyColumn" : "Key",
"displayColumn" : "AuthorId/Name",
"junctionLookup" : "AuthorId",
"queryName" : "Author",
"schemaName" : [ "lists" ],
"multiValued" : "junction"
}
} ]
},
"rows" : [ {
"data" : {
"Key" : { "value" : 1 },
"Authors" : [ {
"value" : 1, "displayValue" : "Terry Pratchett"
}, {
"value" : 2, "displayValue" : "Neil Gaiman"
} ],
"Name" : { "value" : "Good Omens" }
}
}, {
"data" : {
"Key" : { "value" : 2 },
"Authors" : [ {
"value" : 3, "displayValue" : "Stephen King"
}, {
"value" : 4, "displayValue" : "Peter Straub"
} ],
"Name" : { "value" : "The Talisman" }
}
} ],
"rowCount" : 2
}

There are some limitations to multi-value foreign keys, however. Inserts, updates, and deletes must be performed on the junction table separately from the insert into either the source or target table. In future versions of LabKey Server, it may be possible to perform an insert on the source table with a list of values and have the junction table populated for you. Another limitation is that lookups on the target value are not supported. For example, if Author had a lookup column named "Nationality", you would not be able to traverse the Book.Author.Nationality lookup.

Downloadable Examples

  • kinship.query.xml
    • Disables the standard insert, update, and delete buttons/links with the empty <insertUrl /> and other tags.
    • Configures lookups on a couple of columns and hides the RowId column in some views.
    • Adds a custom button "More Actions" with a child menu item "Limit To Animals In Selection" that calls a JavaScript method provided in a referenced .js file.
  • Data.query.xml
    • Configures columns with custom formatting for some numeric columns, and color coding for the QCFlag column.
    • Adds multiple menu options under the "More Actions" button at the end of the button bar.
  • Formulations.query.xml
    • Sends users to custom URLs for the insert, update, and grid views.
    • Retains some of the default buttons on the grid, and adds a "Delete Formulations" button between the "Paging" and "Print" buttons.
  • encounter_participants.query.xml
  • AssignmentOverlaps.query.xml
  • Aliquots.query.xml & performCellSort.html
    • Adds a button to the Sample Types web part. When the user selects samples and clicks the button, the page performCellSort.html is shown, where the user can review the selected records before exporting them to an Excel file.
    • To use this sample, place Aliquots.query.xml in a module's ./resources/queries/Samples directory. Rename Aliquots.query.xml to match your sample type's name. Edit the tableName attribute in the file to match your sample type's name. Replace the MODULE_NAME placeholder with the name of your module. Place the HTML file in your module's ./resources/views directory. Edit the queryName config parameter to match the sample type's name.

Related Topics




Edit Query Properties


Custom SQL queries give you the ability to flexibly present the data in a table in any way you wish. To edit a query's name, description, or visibility properties, follow the instructions in this topic.

Permissions: To edit a custom SQL query, you must have both the "Editor" role (or higher) and one of the developer roles "Platform Developer" or "Trusted Analyst".

Edit Properties

  • Go to the Schema Browser: (Admin) > Go To Module > Query
  • Select the schema and query/table of interest and then click Edit Properties.
  • The Edit query properties page will appear.

Available Query Properties

Name: This value appears as the title of the query/table in the Schema Browser, data grids, etc.

Description: Text entered here will be shown with your query in the Schema Browser.

Available in child folders: Queries live in a particular folder on LabKey Server, but can be marked as inheritable by setting this property to 'yes'. Note that the query will only be available in child folders containing the matching base schema and table.

Hidden from the user: If you set this field to 'yes', then the query will no longer appear in the Schema Browser, or other lists of available queries. It will still appear on the list of available base queries when you create a new query.

Related Topics




Trace Query Dependencies


Premium Feature — Available in the Professional and Enterprise Editions of LabKey Server. Learn more or contact LabKey.

View Query Dependencies

Query dependencies are traced and noted in the schema browser in a Dependency Report, below the lists of columns and indices. Particularly for user-defined queries derived from built-in queries and tables, it can be helpful to track sources and dependents when you make any query changes. For example, a query "AverageTemp + Physical Exam" might depend on both the table "Physical Exam" and another query, "AverageTempPerParticipant", and in turn be used as the basis for another query, "Physical Exam + TempDelta", listed under Dependents.

If there are no dependencies for the query, this section will not be shown.

Trace Cross-Folder Query Dependencies

  • Select (Admin) > Go To Module > Query.
  • Click Cross Folder Dependencies.
  • Select the scope of the analysis:
    • Site Level: Perform analysis across all folders and projects on the server.
    • Current Project Level: Perform analysis within the current project and all subfolders of it.
  • Click Start Analysis.

Once the analysis begins, the entire schema browser will be masked and a progress indicator will be shown. Folder names will be displayed as they are analyzed. When the analysis is complete, a message box will appear. Click OK to close.

All query detail tabs that are currently open will have their dependency report refreshed to display any information that was added from the cross folder analysis.

Related Topics




Query Web Part


The Query web part can be created by a page administrator and used to display information of interest to users, in a flexible interface panel that can show either:
  • A list of all queries in a particular schema
  • A specific query or grid view

Add a Query Web Part

  • Navigate to where you want to display the query.
  • Enter > Page Admin Mode.
  • Click the Select Web Part drop-down menu at the bottom left of the page, select "Query", and click Add.

Provide the necessary information:

  • Web Part Title: Enter the title for the web part, which need not match the query name.
  • Schema: Pull down to select from available schema.
  • Query and View: Choose whether to show the list of all queries in the schema, or a specific query and grid view.
  • Allow user to choose query?: If you select "Yes", the web part will allow the user to change which query is displayed. Only queries the user has permission to see will be available.
  • Allow user to choose view?: If you select "Yes", the web part will allow the user to change which grid view is displayed. Only grid views the user has permission to see will be available.
  • Button bar position: Select whether to display web part buttons at the top of the grid. Options: Top, None.
Once you've customized the options:
  • Click Submit.
  • Click Exit Admin Mode.

List of All Queries

When you select the list of queries in the schema, the web part will provide a clickable list of all queries the user has access to view.

Show Specific Query and View

The web part for a specific query and view will show the grid in a web part similar to the following. If you selected the option to allow the user to select other queries (or views) you will see additional menus:

Related Topics




SQL Synonyms


Premium Feature — Available in the Professional and Enterprise Editions of LabKey Server. Learn more or contact LabKey.

SQL synonyms provide a way to connect to a database with alternate names/aliases for database objects such as tables, views, procedures, etc. The alternate names form a layer of abstraction between LabKey Server and the underlying database, providing the following benefits:

  • Easier integration with external databases. Naming differences between the client (LabKey Server) and the resource (the database) are no longer a barrier to connection.
  • Insulation from changes in the underlying database. If the names of database resources change, you can maintain the connection without changing core client code.
  • Hides the underlying database. You can interact with the database without knowing its exact underlying structure.
SQL synonyms are currently supported for MS SQL Server and for table names only.
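
For illustration, here is a minimal sketch of how a synonym might be defined on the MS SQL Server side; the table and synonym names are hypothetical, and once the schema is loaded, LabKey Server queries the synonym like any other table:

-- Hypothetical example, run on the MS SQL Server side:
-- expose the table ops.tblSubjects under the simpler name dbo.Subjects.
CREATE SYNONYM dbo.Subjects FOR ops.tblSubjects;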

Set Up

To set up SQL synonyms, first add a new datasource to the labkey.xml file. For details see External Microsoft SQL Server Data Sources.

Related Topics




LabKey SQL Examples





JOIN Queries


This topic covers using JOIN in LabKey SQL to bring data together. All data is represented in tables, also called queries; lists, datasets, and user-defined queries are all examples. Queries are always defined in a specific folder (container) and schema, but can access information from other queries, schemas, and folders using JOINs.

JOIN Columns from Different Tables

Use JOIN to combine columns from different tables.

The following query combines columns from the Physical Exam and Demographics tables.

SELECT PhysicalExam.ParticipantId,
PhysicalExam.weight_kg,
Demographics.Gender,
Demographics.Height
FROM PhysicalExam INNER JOIN Demographics ON PhysicalExam.ParticipantId = Demographics.ParticipantId

Note that when working with datasets in a study, every dataset also has an implied "Datasets" column that can be used in select statements instead of join. See an example in this topic: More LabKey SQL Examples

Handle Query/Table Names with Spaces

When a table or query name contains spaces, surround it with "double quotes" in your query. For example, if the table were named "Physical Exam" (with a space), the above example would look like:

SELECT "Physical Exam".ParticipantId,
"Physical Exam".weight_kg,
Demographics.Gender,
Demographics.Height
FROM "Physical Exam" INNER JOIN Demographics ON "Physical Exam".ParticipantId = Demographics.ParticipantId

Handle Duplicate Column Names

When joining two tables that have some column names in common, the duplicates will be disambiguated by appending "_1", "_2" to the joined column names as needed. The first time the column name is seen in the results, no number is appended.

See an example here: More LabKey SQL Examples

Combine Tables with FULL OUTER JOIN

The following example shows how to join all rows from two datasets (ViralLoad and ImmuneScore), so that the columns RNACopies and Measure1 can be compared or visualized. In this example, we handle the duplicate column names (ParticipantId and VisitID) using CASE/WHEN statements: because either side of a FULL OUTER JOIN can be NULL for unmatched rows, we take the value from whichever table has one.

SELECT 
CASE
WHEN ViralLoad.ParticipantId IS NULL THEN ImmuneScore.ParticipantId
WHEN ViralLoad.ParticipantId IS NOT NULL THEN ViralLoad.ParticipantId
END AS
ParticipantId,
CASE
WHEN ViralLoad.VisitID IS NULL THEN ImmuneScore.VisitID
WHEN ViralLoad.VisitID IS NOT NULL THEN ViralLoad.VisitID
END AS
VisitID,
ViralLoad.RNACopies,
ImmuneScore.Measure1,
FROM ViralLoad
FULL OUTER JOIN ImmuneScore ON
ViralLoad.ParticipantId = ImmuneScore.ParticipantId AND ViralLoad.VisitID = ImmuneScore.VisitID
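
A more compact equivalent uses COALESCE to take the first non-null value from either side of the join; this is a sketch against the same ViralLoad and ImmuneScore datasets:

SELECT
COALESCE(ViralLoad.ParticipantId, ImmuneScore.ParticipantId) AS ParticipantId,
COALESCE(ViralLoad.VisitID, ImmuneScore.VisitID) AS VisitID,
ViralLoad.RNACopies,
ImmuneScore.Measure1
FROM ViralLoad
FULL OUTER JOIN ImmuneScore ON
ViralLoad.ParticipantId = ImmuneScore.ParticipantId AND ViralLoad.VisitID = ImmuneScore.VisitID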

CROSS JOIN

Cross joins give you all possible combinations between two different tables (or queries).

For example, suppose you have two tables, Things and Items, each with a Name column:

Things:
Name
A
B

Items:
Name
1
2

Then a cross join provides the set of all possible combinations between the two tables:

Thing   Item
A       1
A       2
B       1
B       2

SELECT Things.Name AS Thing,
Items.Name AS Item
FROM Items
CROSS JOIN Things

JOIN Queries Across Schema

You can join tables from two different schemas in the same data source by using the syntax "JOIN <SCHEMA>.<TABLE>" to refer to the table in another schema.

Note: you cannot create joins across data sources. To join native tables to data from an external schema, you must first ETL the external data into the local database.

For example, suppose you have the following tables in your local database, one in the study schema, the other in the lists schema:

study.Demographics

ParticipantId   Language   Gender
PT-101          English    M
PT-102          French     F
PT-103          German     F

lists.Languages

LanguageName   TranslatorName    TranslatorPhone
English        Monica Payson     (206) 234-4321
French         Etiene Anson      206 898-4444
German         Gundula Prokst    (444) 333-5555

The following SQL JOIN creates a table that combines the data from both source tables.

SELECT Demographics.ParticipantId,
Demographics.Language,
Languages.TranslatorName,
Languages.TranslatorPhone,
Demographics.Gender
FROM Demographics
JOIN lists.Languages ON Demographics.Language=Languages.LanguageName

The data returned:

ParticipantId   Language   TranslatorName    TranslatorPhone   Gender
PT-101          English    Monica Payson     (206) 234-4321    M
PT-102          French     Etiene Anson      206 898-4444      F
PT-103          German     Gundula Prokst    (444) 333-5555    F

JOIN Queries Across Folders

You can join queries from two different containers by adding the container path to your JOIN statement. For example, if you had a project named "Other" containing a folder named "Folder", you could use a list in that folder from elsewhere by following this example:

SELECT Demographics.ParticipantId,
Demographics.Language,
Languages.TranslatorName,
Languages."PhoneNumber",
FROM Demographics
JOIN "/Other/Folder".lists.Languages ON Demographics.Language=Languages.Language

Learn more about queries across folders in the topic: Queries Across Folders

Related Topics




Calculated Columns


This topic explains how to include a calculated column in a query using SQL expressions. To use the examples on this page, install the example study downloadable from this topic.

Example #1: Pulse Pressure

In this example we use SQL to add a column to a query based on the Physical Exam dataset. The column will display "Pulse Pressure" -- the change in blood pressure between contractions of the heart muscle, calculated as the difference between systolic and diastolic blood pressures.

Create a Query

  • Navigate to (Admin) > Go To Module > Query.
  • Select a schema to base the query on. (For this example, select the study schema.)
  • Click Create New Query.
  • Create a query with the name of your choice, based on some dataset. (For this example, select the Physical Exam dataset.)
  • Click Create and Edit Source.

Modify the SQL Source

The default query provided simply selects all the columns from the chosen dataset. Adding the following SQL will create a column with the calculated value we seek. Note that if your query name included a space, you would put double quotes around it, like "Physical Exam". Subtract the diastolic from the systolic to calculate pulse pressure:

PhysicalExam.systolicBP-PhysicalExam.diastolicBP as PulsePressure

The final SQL source should look like the following, with the new line added at the bottom. Remember to add a comma on the previous line:

SELECT PhysicalExam.ParticipantId,
PhysicalExam.date,
PhysicalExam.weight_kg,
PhysicalExam.temperature_C,
PhysicalExam.systolicBP,
PhysicalExam.diastolicBP,
PhysicalExam.pulse,
PhysicalExam.systolicBP-PhysicalExam.diastolicBP as PulsePressure
FROM PhysicalExam
  • Click Save and Finish.
  • Notice that LabKey Server has made a best guess at the correct column label, adding a space to "Pulse Pressure".
To reopen and further edit your query, return to the schema browser, either using the study Schema link near the top of the page or the menu. Your new query will be listed in the study schema alphabetically.

Example #2: BMI

The following query pair shows how to use intermediate and related queries to calculate the Body Mass Index (BMI). In this scenario, we have a study where height is stored in a demographic dataset (it does not change over time) and weight is recorded in a physical exam dataset. Before adding the BMI calculation, we create an intermediate query "Height and Weight" that uses the study "Datasets" special column to provide weight in kilograms and height in meters for our calculation.

Height and Weight query:

SELECT PhysicalExam.ParticipantId,
PhysicalExam.date,
PhysicalExam.weight_kg,
TRUNCATE(CAST(Datasets.Demographics.height AS DECIMAL)/100, 2) as height_m,
FROM PhysicalExam

BMI query:

SELECT "Height and Weight".ParticipantId,
"Height and Weight".Date,
"Height and Weight".weight_kg,
"Height and Weight".height_m,
ROUND(weight_kg / (height_m * height_m), 2) AS BMI
FROM "Height and Weight"

Display the Calculated Column in the Original Table

Premium Resource Available

Subscribers to premium editions of LabKey Server can learn how to display a calculated column from a query in the original table in the topic: Premium Resource: Display Calculated Columns from Queries


Learn more about premium editions

Related Topics




Premium Resource: Display Calculated Columns from Queries





Pivot Queries


A pivot query helps you summarize and revisualize data in a table. Data can be grouped or aggregated to help you focus on a particular aspect of your data. For example, a pivot query can help you see how many data points of a particular kind are present, or it can represent your data by aggregating it into different categories.

Note that PIVOT queries cannot be used with the compliance module's logging of all query access. For details, see Compliance: Logging.

To try this yourself, you can download this sample file and import it as a Standard assay type. Remember your assay design name; we use "PivotDemo" below.

Create a new SQL query and edit its source.

  • Select (Admin) > Go To Module > Query.
  • Select a schema. In this example, we use the assay schema, specifically "assay.General.PivotDemo".
  • Click Create New Query.
  • Name it and confirm the correct query/table is selected. For our assay, we chose "Data" (the assay results).
  • Click Create and Edit Source.

Syntax for PIVOT query

A PIVOT query is essentially a SELECT specifying which columns you want and how to PIVOT and GROUP them. To write a pivot query, follow these steps.

(1) Start with a base SELECT query.

SELECT ParticipantID, Date, RunName, M1
FROM Data

(2) Identify the data cells you want to pivot and how. In this example, we focus on the values in the RunName column, separating out the M1 values for each run.

(3) Select an aggregating function to handle any non-unique data, even if you do not expect to need it. MAX, MIN, AVG, and SUM are possibilities. Here we have only one row for each participant/date/run combination, so all would produce the same result, but other data might include multiple rows per combination. When aggregating, we can also give the column a new name, here MaxM1.

(4) Identify columns which remain the same to determine the GROUP BY clause.

SELECT ParticipantId, Date, RunName, Max(M1) as MaxM1
FROM Data
GROUP BY ParticipantId, Date, RunName

(5) Finally, pivot the cells.

SELECT ParticipantId, Date, RunName, Max(M1) as MaxM1
FROM Data
GROUP BY ParticipantId, Date, RunName
PIVOT MaxM1 by RunName

(6) You can focus on particular values using IN. In our example, perhaps we want to see only two runs:

SELECT ParticipantId, Date, RunName, Max(M1) as MaxM1
FROM Data
GROUP BY ParticipantId, Date, RunName
PIVOT MaxM1 by RunName IN ('Run1', 'Run2')

Note that pivot column names are case-sensitive. If you have two values that differ only by letter case, you may need to use lower() or upper() in your query to work around this issue.
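
For example, if the RunName column contained both 'Run1' and 'RUN1', a sketch like the following folds the values together before pivoting (assuming the same Data table as above):

SELECT ParticipantId, Date, upper(RunName) AS RunFolded, Max(M1) AS MaxM1
FROM Data
GROUP BY ParticipantId, Date, upper(RunName)
PIVOT MaxM1 BY RunFolded IN ('RUN1', 'RUN2')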

Grouped Headers

You can add additional aggregators and present two pivoted columns; here we show both a minimum and a maximum M1 value for each participant/date/run combination. In this case they are the same, but the pattern can be applied to varying data.

SELECT ParticipantId, Date, RunName, Min(M1) as MinM1, Max(M1) as MaxM1
FROM Data
GROUP BY ParticipantId, Date, RunName
PIVOT MinM1, MaxM1 by RunName IN ('Run1', 'Run2')

"Summary" Columns

"Summary" columns are those columns not included in the group-by or pivoted list. When generating the pivot, these summary columns are aggregated as follows:

  • a COUNT or SUM aggregate summary column is wrapped with SUM
  • a MIN or MAX is wrapped with a MIN or MAX
For example:

SELECT
-- group by columns
AssignedTo,
Type,

-- summary columns. Turned into a SUM over the COUNT(*)
COUNT(*) AS Total,

-- pivoted columns
SUM(CASE WHEN Status = 'open' THEN 1 ELSE 0 END) AS "Open",
SUM(CASE WHEN Status = 'resolved' THEN 1 ELSE 0 END) AS Resolved

FROM issues.Issues
WHERE Status != 'closed'
GROUP BY AssignedTo, Type
PIVOT "Open", Resolved BY Type IN ('Defect', 'Performance', 'To Do')

Examples

Another practical application of a pivot query is to display a list of how many issues of each priority are open for each area.

Example 1

See the result of this pivot query: Pivot query on the Issues table

SELECT Issues.Area, Issues.Priority, Count(Issues.IssueId) AS CountOfIssues
FROM Issues
GROUP BY Issues.Area, Issues.Priority
PIVOT CountOfIssues BY Priority IN (1,2,3,4)

Example 2

See the result of this pivot query (identical to the summary-column example shown above): Pivot query on the Issues table

Example 3

SELECT ParticipantId, date, Analyte, 
Count(Analyte) AS NumberOfValues,
AVG(FI) AS AverageFI,
MAX(FI) AS MaxFI
FROM "Luminex Assay 100"
GROUP BY ParticipantID, date, Analyte
PIVOT AverageFI, MaxFI BY Analyte

Example 4

SELECT SupportTickets.Client AS Client, SupportTickets.Status AS Status,
COUNT(CASE WHEN SupportTickets.Priority = 1 THEN SupportTickets.Status END) AS Pri1,
COUNT(CASE WHEN SupportTickets.Priority = 2 THEN SupportTickets.Status END) AS Pri2,
COUNT(CASE WHEN SupportTickets.Priority = 3 THEN SupportTickets.Status END) AS Pri3,
COUNT(CASE WHEN SupportTickets.Priority = 4 THEN SupportTickets.Status END) AS Pri4,
FROM SupportTickets
WHERE SupportTickets.Created >= (curdate() - 7)
GROUP BY SupportTickets.Client, SupportTickets.Status
PIVOT Pri1, Pri2, Pri3, Pri4 BY Status



Queries Across Folders


Cross-Folder Queries

You can perform cross-folder queries by identifying the folder that contains the data of interest when specifying the dataset. The path of the dataset is composed of the following components, strung together with a period between each item:

  • Project - This is literally the word Project, which is a virtual schema and resolves to the current folder's project.
  • Path to the folder containing the dataset, surrounded by quotes. This path is relative to the current project. So a dataset located in the Home > Tutorials > Demo subfolder would be referenced using "Tutorials/Demo/".
  • Schema name - In the example below, this is study
  • Dataset name - Surrounded by quotes if there are spaces in the name. In the example below, this is "Physical Exam"
Note that LabKey Server enforces a user's security roles when they view a cross-folder query. To view a cross-folder/container query, the user must have the Reader role in each of the folders/containers the query draws from.

Example

The Edit SQL Query Source topic includes a simple query on the "Physical Exam" dataset, which looks like this when the dataset is in the local folder:

SELECT "Physical Exam".ParticipantID, ROUND(AVG("Physical Exam".Temp_C), 1) AS AverageTemp
FROM "Physical Exam"
GROUP BY "Physical Exam".ParticipantID

This query could reference the same dataset from a sibling folder within the same project. To do so, you would replace the string used to identify the dataset in the FROM clause ("Physical Exam" in the query used in this topic) with a fully-specified path. For this dataset, you would use:

Project."Tutorials/Demo/".study."Physical Exam"

Using the query from the example topic as a base, the cross-folder version would read:

SELECT "Physical Exam".ParticipantID, ROUND(AVG("Physical Exam".Temp_C), 1) AS AverageTemp
FROM Project."Tutorials/Demo/".study."Physical Exam"
GROUP BY "Physical Exam".ParticipantID

Cross-Project Queries

You can perform cross-project queries using the full path for the project and folders that contain the dataset of interest. To indicate that a query is going across projects, use a full path starting with a slash. The syntax is "/<FULL FOLDER PATH>".<SCHEMA>.<QUERY>

  • Full path to the folder containing the dataset, surrounded by quotes. This lets you access an arbitrary folder, not just a folder in the current project. You can use a leading / slash in the quoted path to indicate starting at the site level. You can also use the virtual schema "Site." to lead your path, as shown below in the module property example. For example, a dataset located in the Home > Tutorials > Demo subfolder would be referenced from another project using "/Home/Tutorials/Demo/".
  • Schema name - In the example below, this is study
  • Dataset name - Surrounded by quotes if there are spaces in the name. In the example below, this is "Physical Exam"
Example 1

The example shown above can be rewritten using cross-project syntax by including the entire path to the dataset of interest in the FROM clause, preceded by a slash.

"/Home/Tutorials/Demo/".study."Physical Exam"

Using the query from the example topic as a base, the cross-project version would read:

SELECT "Physical Exam".ParticipantID, ROUND(AVG("Physical Exam".Temp_C), 1) AS AverageTemp
FROM "/Home/Tutorials/Demo/".study."Physical Exam"
GROUP BY "Physical Exam".ParticipantID

Example 2

SELECT Demographics.ParticipantId,
Demographics.Language,
Languages.TranslatorName,
Languages."PhoneNumber",
FROM Demographics
JOIN "/Other/Folder".lists.Languages ON Demographics.Language=Languages.Language

Use ContainerFilter in SQL

You can annotate individual tables in the FROM clause with an optional container filter if you want to restrict or expand the scope of an individual table in a larger query. For example, this would allow an issues report to specify that the query should use 'CurrentAndSubfolders' without having to create a corresponding custom view. Further, a container filter defined this way cannot be overridden by a custom view.

The syntax for this is:

SELECT * FROM Issues [ContainerFilter='CurrentAndSubfolders']

Find the list of possible values for the containerFilter here. The value must be surrounded by single quotes and capitalized exactly as expected.

This option can be used in combination with other SQL syntax, including the cross-folder and cross-project options above.
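
For instance, here is a sketch combining the annotation with the cross-folder syntax shown earlier (using the same Tutorials/Demo path as above):

SELECT "Physical Exam".ParticipantID
FROM Project."Tutorials/Demo/".study."Physical Exam" [ContainerFilter='CurrentAndSubfolders']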

Use Module Properties

You can specify the path for a cross-folder or cross-project query using a module property, allowing you to define a query to apply to a different path or dataset target in each container. For example, if you define the module property "DatasetPathModuleProp" in the module "MyCustomModule", the syntax to use would be:

SELECT
source.ParticipantId
FROM Site.{moduleProperty('MyCustomModule', 'DatasetPathModuleProp')}.study.source

In each container, you would define this module property to be the full path to the desired location.

Rules about interpreting schemas and paths:

"Site" and "Project" are virtual schemas that let you resolve folder names in the same way as schema names. It is inferred that no schema will start with /, so if a string starts with / it is assumed to mean the site folder and the path will be resolved from there.

While resolving a path, if we have a folder schema in hand, and the next schema contains "/", it will be split and treated as a path.
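
For example, assuming the Tutorials/Demo path from the examples above, these two FROM clauses should resolve to the same dataset (a sketch; exact quoting of the path segments may vary with your folder structure):

FROM Site."Home/Tutorials/Demo/".study."Physical Exam"
FROM "/Home/Tutorials/Demo/".study."Physical Exam"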

Fields with Dependencies

A few LabKey fields/columns have dependencies. To use a field with dependencies in a custom SQL query, you must explicitly include supporting fields.

To use Assay ID in a query, you must include the run's RowId and Protocol columns. You must also use these exact names for the dependent fields. RowId and Protocol provide the Assay ID column with data for building its URL.

If you do not include the RowId and Protocol columns, you will see an error for the Run Assay ID field. The error looks something like this:

"KF-07-15: Error: no protocol or run found in result set."

Related Topics




Parameterized SQL Queries


LabKey Server lets you add parameters to your SQL queries, using the PARAMETERS keyword. When a user views the query, they will first assign values to the parameters (seeing any default you provide) and then execute the query using those values.

Example Parameterized SQL Query

The following SQL query defines two parameters, MinTemp and MinWeight:

PARAMETERS
(
MinTemp DECIMAL DEFAULT 37,
MinWeight DECIMAL DEFAULT 90
)


SELECT PhysicalExam.ParticipantId,
PhysicalExam.date,
PhysicalExam.weight_kg,
PhysicalExam.temperature_C
FROM PhysicalExam
WHERE temperature_C >= MinTemp and weight_kg >= MinWeight

By default, parameterized queries are hidden in the Schema Browser. To view them, go to the Schema Browser and select Show Hidden Schemas and Queries in the far lower left. For details, see SQL Query Browser.

Note: Only constants are supported for default parameter values.

Example Using the URL to Pass Parameter Values

To pass a parameter value to the query on the URL use the syntax:

&query.param.PARAMETER_NAME=VALUE

For example, the following URL sets the parameter '_testid' to 'Iron'. Following the link executes the query.

https://training-01.labkey.com/Demos/SQL%20Examples/query-executeQuery.view?schemaName=lists&query.queryName=PARAM_Zika23Chemistry&query.param._testid=Iron

Example API Call to the Parameterized Query

You can pass in parameter values via the JavaScript API, as shown below:

<div id="div1"></div>
<script type="text/javascript">

function init() {
    // Render the parameterized query into the target div,
    // passing values for the MinTemp and MinWeight parameters.
    var qwp1 = new LABKEY.QueryWebPart({
        renderTo: 'div1',
        title: "Parameterized Query Example",
        schemaName: 'study',
        queryName: 'ParameterizedQuery',
        parameters: {'MinTemp': '36', 'MinWeight': '90'}
    });
}

// Call init so the web part actually renders when the page loads.
init();
</script>

The parameters are written into the request URL as follows:

query.param.MinTemp=36&query.param.MinWeight=90

User Interface for Parameterized SQL Queries

You can also pass in values using a built-in user interface. When you view a parameterized query in LabKey Server, a form is automatically generated, where you can enter values for each parameter.

  • Go to the Schema Browser: (Admin) > Go To Module > Query.
  • On the lower left corner, select Show Hidden Schemas and Queries. (Parameterized queries are hidden by default.)
  • Locate and select the parameterized query.
  • Click View Data.
  • You will be presented with a form, where you can enter values for the parameters:

ETLs and Parameterized SQL Queries

You can also use parameterized SQL queries as the source queries for ETLs. Pass parameter values from the ETL into the source query from inside the ETL's config XML file. For details see ETL: Examples.

Related Topics




More LabKey SQL Examples


This topic contains additional examples using more features of LabKey SQL.

Query Columns with Duplicate Names

When joining two tables that have some column names in common, the duplicates will be disambiguated by appending "_1", "_2" to the joined column names as needed. The first time the column name is seen in the results, no number is appended.

For example, the results for this query would include columns named "EntityId", etc. from c1, and "EntityId_1", etc. from c2:

SELECT * FROM core.Containers c1 INNER JOIN core.Containers c2 ON
c1.parent = c2.entityid

Note that the numbers are appended to the field key, not the caption of a column. The user interface displays the caption, and thus may omit the underscore and number. If you hover over the column, you will see the field key, and if you export data with "Column headers=Field key" the numbered column names will be included.
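
If you prefer explicit names over the automatic suffixes, you can alias the duplicated columns directly. A minimal sketch based on the core.Containers query above (the alias names are illustrative):

SELECT c1.EntityId AS ChildEntityId,
c2.EntityId AS ParentEntityId
FROM core.Containers c1 INNER JOIN core.Containers c2 ON
c1.parent = c2.entityid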

"Datasets" Special Column in Studies

SQL queries based on study datasets can make use of the special column "Datasets", which gives access to all datasets in the current study.

For example, the following query on the PhysicalExam dataset uses the "Datasets" special column to pull in data from the Demographics dataset.

SELECT PhysicalExam.ParticipantId,
PhysicalExam.Weight_kg,
PhysicalExam.Datasets.Demographics.Height_cm,
PhysicalExam.Datasets.Demographics.Gender
FROM PhysicalExam

"Datasets" provides a shortcut for queries that would otherwise use a JOIN to pull in data from another dataset. The query above can be used instead of the JOIN style query below:

SELECT PhysicalExam.ParticipantId,
PhysicalExam.Weight_kg,
Demographics.Height_cm,
Demographics.Gender
FROM PhysicalExam
JOIN Demographics
ON PhysicalExam.ParticipantId = Demographics.ParticipantId

Use "GROUP BY" for Calculations

The GROUP BY function is useful when you wish to perform a calculation on a table that contains many types of items, but keep the calculations separate for each type of item. You can use GROUP BY to perform an average such that only rows that are marked as the same type are grouped together for the average.

For example, suppose you wish to determine an average for each participant in a large study dataset that spans many participants and many visits. Simply averaging a column of interest across the entire dataset would produce a single mean for all participants, not one for each participant. Using GROUP BY allows you to determine a mean for each participant individually.

GROUP BY Example: Average Temp Per Participant

The GROUP BY function can be used on the PhysicalExam dataset to determine the average temperature for each participant across all of his/her visits.

To set up this query, follow the basic steps described in the Create a SQL Query example to create a new query based on the PhysicalExam table in the study schema. Name this new query "AverageTempPerParticipant."

If you are working with the LabKey demo study, these queries may be predefined, so you can view and edit them in place, or create new queries with different names.

Within the SQL Source editor, delete the SQL created there by default for this query and paste in the following SQL:

SELECT PhysicalExam.ParticipantID, 
ROUND(AVG(PhysicalExam.temperature_c), 1) AS AverageTemp,
FROM PhysicalExam
GROUP BY PhysicalExam.ParticipantID

For each ParticipantID, this query finds all rows for that ParticipantID and calculates the average temperature for these rows, rounded to one decimal place. In other words, we calculate the participant's average temperature across all visits and store that value in a new column called "AverageTemp."

See similar results in our interactive example.

See also the Efficient Use of "GROUP BY" section below.

JOIN a Calculated Column to Another Query

The JOIN function can be used to combine data in multiple queries. In our example, we can use JOIN to append our newly-calculated, per-participant averages to the PhysicalExam dataset and create a new, combined query.

First, create a new query based on the PhysicalExam table in the study schema. Call this query "PhysicalExam + AverageTemp" and choose to edit it in the SQL Source Editor. Now edit the SQL so that it looks as follows.

SELECT PhysicalExam.ParticipantId,
PhysicalExam.SequenceNum,
PhysicalExam.Date,
PhysicalExam.Day,
PhysicalExam.Weight_kg,
PhysicalExam.Temp_C,
PhysicalExam.SystolicBloodPressure,
PhysicalExam.DiastolicBloodPressure,
PhysicalExam.Pulse,
AverageTempPerParticipant.AverageTemp
FROM PhysicalExam
INNER JOIN AverageTempPerParticipant
ON PhysicalExam.ParticipantID=AverageTempPerParticipant.ParticipantID

You have added one line before the FROM clause to add the AverageTemp column from the AverageTempPerParticipant query. You have also added a JOIN clause after the FROM statement to explain how data in AverageTempPerParticipant map to columns in the PhysicalExam table. The ParticipantID column is used for mapping between the tables.

See similar results in the interactive example.


Premium Resource Available

Subscribers to premium editions of LabKey Server can learn about displaying a calculated column in the original table by following the example in the topic: Premium Resource: Display Calculated Columns from Queries


Learn more about premium editions

Calculate a Column Using Other Calculated Columns

We next use our calculated columns as the basis for creating yet another calculated column that provides greater insight into our dataset.

This column will be the difference between a participant's temperature at a particular visit and the average temperature for all of his/her visits. This "TempDelta" statistic will let us look at deviations from the mean and identify outlier visits for further investigation.

Steps:

  • Create a new query named "Physical Exam + TempDelta" and base it on the "Physical Exam + AverageTemp" query we just created above.
  • Add the following SQL expression in the Query Designer, following the last column selected. Don't forget to add a comma on that previous line:
ROUND(("Physical Exam + AverageTemp".Temp_C-
"Physical Exam + AverageTemp".AverageTemp), 1) AS TempDelta
  • Provide a longer display name for the new column by pasting this content on the XML Metadata tab.
<tables xmlns="http://labkey.org/data/xml">
<table tableName="Physical Exam + TempDelta" tableDbType="NOT_IN_DB">
<columns>
<column columnName="TempDelta">
<columnTitle>Temperature Difference From Average</columnTitle>
</column>
</columns>
</table>
</tables>

See similar results in the interactive example.

Filter Calculated Column to Make Outliers Stand Out

It can be handy to filter your results such that outlying values stand out. This is simple to do in a LabKey grid using the column header filtering options.

Using the query above ("Physical Exam + TempDelta"), we want to show the visits in which a participant's temperature was unusually high for them, possibly indicating a fever. We filter the calculated "Temperature Difference From Average" column for all values greater than 1.5. Just click on the column header, select Filter. Choose "Is Greater Than" and type "1.5" in the popup, then click OK.

This leaves us with a list of all visits where a participant's temperature was more than 1.5 degrees C above the participant's mean temperature at all his/her visits. Notice the total number of filtered records is displayed above the grid.

See similar results in the Demo Study.

Use "SIMILAR_TO"

String pattern matching can be done using similar_to. The syntax is similar_to(A,B,C): A similar to B escape C. The escape clause is optional.

  • 'A' is the string (or field name) to compare
  • 'B' is the pattern to match against
  • 'C' is the escape character (typically a backslash) used before characters that would otherwise be parsed as statement syntax, including but not limited to "%", "(", ",".
To return all the names on a list that started with AB, you might use:
SELECT Name from MyList
WHERE similar_to (Name, 'AB%')

If you wanted to return names on the list that started with '%B', you would use the following, which uses a backslash to escape the first % in the string:

WHERE similar_to (Name, '\%B%', '\')

Learn more about SQL matching functions here.

Match Unknown Number of Spaces

If you have data where there might be an arbitrary number of spaces between terms that you want to match in a pattern string, use SIMILAR_TO with a space followed by + to match an unknown number of spaces. For example, suppose you wanted to find entries in a list like this, but didn't know how many spaces were between the term and the open parentheses "(".

  • Thing (one)
  • Thing (two - with two spaces)
  • Thing to ignore
  • Thing (three)
Your pattern would need to both include " +" to match one or more spaces and escape the "(":
WHERE similar_to(value, 'Thing +\(%', '\')

Round a Number and Cast to Text

If your query involves both rounding a numeric value (such as to one or two digits after the decimal point) and casting that value to text, a CAST to VARCHAR will "undo" the rounding, adding extra digits to your text value. Instead, use to_char. Where 'myNumber' is a numeric value, pass the formatting to use as the second parameter; to_char will also round the value to the specified number of digits:

to_char(myNumber,'99999D0')

The above format example means up to 5 digits (no leading zeros included), a decimal point, then one digit (zero or other). More formats for numeric values are available here.
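
For instance, a minimal sketch using the PhysicalExam dataset from the examples above, producing each participant's average temperature as rounded text:

SELECT PhysicalExam.ParticipantId,
to_char(AVG(PhysicalExam.temperature_C), '999D9') AS AvgTempText
FROM PhysicalExam
GROUP BY PhysicalExam.ParticipantId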

Efficient Use of "GROUP BY"

In some use cases, it can seem logical to have a long GROUP BY list. This can be problematic when the tables are large, and there may be a better way to write such queries both for performance and for readability. As a general rule, keeping GROUP BY clauses short is best practice for more efficient queries.

As an example, imagine you have customer and order tables. You want to get all of the customer's info plus the date for their most recent order.

You could write something like:

SELECT c.FirstName, c.LastName, c.Address1, c.Address2, c.City, c.State, c.Zip, MAX(o.Date) AS MostRecentOrder
FROM
Customer c LEFT JOIN Order o ON c.Id = o.CustomerId
GROUP BY c.FirstName, c.LastName, c.Address1, c.Address2, c.City, c.State, c.Zip

...but that makes the database do the whole join (which may multiply the total number of rows) and then sort it by all of the individual columns so that it can de-duplicate them again for the GROUP BY.

Instead, you could get the most recent order date via a correlated subquery, like:

SELECT c.FirstName, c.LastName, c.Address1, c.Address2, c.City, c.State, c.Zip, (SELECT MAX(o.Date) FROM Order o WHERE c.Id = o.CustomerId) AS MostRecentOrder
FROM
Customer c

Or, if you want more than one value from the Order table, change the JOIN so that it does the aggregate work:

SELECT c.FirstName, c.LastName, c.Address1, c.Address2, c.City, c.State, c.Zip, o.MostRecentOrder, o.FirstOrder
FROM
Customer c LEFT JOIN (SELECT CustomerId, MAX(Date) AS MostRecentOrder, MIN(Date) AS FirstOrder FROM Order GROUP BY CustomerId) o ON c.Id = o.CustomerId

Related Topics




External Schemas and Data Sources


An externally-defined schema can provide access to tables that are managed on any PostgreSQL database, or with Premium Editions of LabKey Server, any Microsoft SQL Server, SAS, Oracle, or MySQL database server available at your institution.

Site Administrators can make externally-defined schemas accessible within the LabKey interface, limiting access to authorized users and, if desired, a subset of tables within each schema. Once a schema is accessible, externally-defined tables become visible as tables within LabKey and LabKey applications can be built using these tables.

This document explains how to configure external data sources and load schemas from those data sources.

Overview

Externally defined schemas give administrators access to edit external tables within the LabKey interface, if the schema has been marked editable and the table has a primary key. XML metadata can also be added to specify formatting or lookups. Folder-level security is enforced for the display and editing of data contained in external schemas.

You can also pull data from an existing LabKey schema in a different folder by creating a "linked schema". You can choose to expose some or all of the tables from the original schema. The linked tables and queries may be filtered such that only a subset of the rows are shown. For details see Linked Schemas and Tables.

Note: you cannot create joins across data sources, including joins between external and internal schemas on LabKey Server. As a workaround, use an ETL to copy the data from the external data source(s) into the main internal data source. Once all of the data is in the main data source, you can create joins on the data.

Usage Scenarios

  • Display, analyze, and report on any data stored on any database server within your institution.
  • Build LabKey applications using external data without relocating the data.
  • Create custom queries that join data from standard LabKey tables with user-defined tables in the same database.
  • Publish SAS data sets to LabKey Server, allowing secure, dynamic access to data sets residing in a SAS repository.
Changes to the data are reflected automatically in both directions. Data rows that are added, deleted, or updated from either the LabKey Server interface or through external routes (for example, external tools, scripts, or processes) are automatically reflected in both places. Changes to the table schema are not immediately reflected; see below.

Avoid: LabKey strongly recommends that you avoid defining the core LabKey Server schemas as external schemas. There should be no reason to use a LabKey schema as an external schema and doing so invites problems during upgrades and can be a source of security issues.

Data Source Configuration

Before you define an external schema, you must first configure a new data source resource in LabKey Server. Typically this is done by editing the labkey.xml configuration file and, in some cases, performing other steps. See the following topics for the preliminary configuration steps, depending on the type of external data source you are using:

Load an External Schema/Data Source

When an externally defined schema (i.e., a set of data tables) is added to LabKey Server, the tables are surfaced in the Query Schema Browser.

To load an externally-defined schema into LabKey Server:

  • Click on the folder/project where you would like to place the schema.
  • Select (Admin) > Developer Links > Schema Browser.
  • Click Schema Administration.
  • Click New External Schema.

Fill out the following fields:

  • Schema Name – Required.
    • Name of the schema as you want to see it within LabKey Server.
    • When you select the Database Schema Name two items below this, this field will default to match that name, but you can specify an alternate name here if needed.
  • Data Source - JNDI name of the DataSource associated with this schema.
    • This name is typically included as the first parameter of the <Resource> element with a "DataSource" suffix that is not shown in the dropdown here.
    • Data source names must be unique.
    • All external data sources identified in the labkey.xml/ROOT.xml file are listed as options in this drop-down.
  • Database Schema Name – Required. Name of the physical schema within the external database.
    • This dropdown will offer all the schemas accessible to the user account under which the external schema was defined. If you don't see a schema you expect here, check the external database directly using the same credentials as the external data source connection uses.
    • Show System Schemas - Check the box to show system schemas which are filtered out of this dropdown.
  • Editable - Check to allow insert/update/delete operations on the external schema. This option currently only works on MSSQL and Postgres databases, and only for tables with a single primary key.
  • Index Schema Meta Data - Determines whether the schema should be indexed for full-text search.
  • Fast Cache Refresh - Whether or not this external schema is set to refresh its cache often. Intended for use during development.
  • Tables - Allows you to expose or hide selected tables within the schema.
    • Click the + to expand the table list.
    • Checked tables are shown in the Query Schema Browser; unchecked tables are hidden.
    • If you don't see a table you expect here, check the external database directly using the same credentials as the external data source connection uses.
  • Meta Data – You can use a specialized XML format to specify how columns are displayed in LabKey.
    • For example, you can specify data formats, column titles, and URL links. This field accepts instance documents of the TableInfo XML schema.
When you are finished, click the Create button at the bottom of the form.

Edit a Previously Defined External Schema

The Schema Administration page displays the external and linked schemas that have been defined in the current folder, where you can view, edit, reload, or delete them.

  • To navigate to the Schema Administration page, go to the Query Schema Browser for the current folder, and click the Schema Administration button.

Reload an External Schema

External schema metadata is not automatically reloaded. It is cached within LabKey Server for an hour, meaning changes, such as to the number of tables or columns, are not immediately reflected. If you make changes to external schema metadata, you may explicitly reload your external schema immediately using the reload link on the Schema Administration page.

Manage External Schemas Site-Wide

Find a comprehensive list of all external data sources and external schemas on the admin console:

  • Select (Admin) > Site > Admin Console.
  • Under Diagnostics, click Data Sources.
You'll see links to all containers (beginning with /) followed by all the external schemas defined therein. Click to directly access each external schema definition.

Configure for Connection Validation

If there is a network failure or if a database server is restarted, the connection to the data source is broken and must be reestablished. Tomcat can be configured to test each connection and attempt reconnection by specifying a simple validation query. If a broken connection is found, Tomcat will attempt to create a new one. The validation query is specified in your DataSource resource in labkey.xml.

For a PostgreSQL or Microsoft SQL Server datasource, add this parameter:

validationQuery="SELECT 1"

For a SAS data source, add this parameter:

validationQuery="SELECT 1 FROM sashelp.table"

For a MySQL data source, add this parameter:

validationQuery="/* ping */"

For an Oracle data source, add this parameter:

validationQuery="SELECT 1 FROM dual"

Troubleshoot External Schemas

If you encounter problems defining and using external schema, start by checking the following:

  • Does the <Resource> tag for the external schema in your labkey.xml conform to the database-specific templates we provide? Over time, both expected parameters and URL formats may change.
    • Check that the version of the JDBC driver for your external database is compatible, and compare <Resource> tag parameters to driver expectations. Learn more in the database-specific pages.
    • Check both the current documentation and the documentation archive for the specific version of LabKey you are using.
    • Compare the resource tag carefully to the template provided, including the syntax of a connection validation query if present.
  • Is the name in the <Resource> tag unique, and does it exactly match what you are entering when you define the external schema?
  • Is your external database of a supported version? The database specific pages provide additional information.
  • If you don't see the expected tables when defining a new external schema:
    • Try directly accessing the external database with the credentials provided in the <Resource> tag to confirm access to the "Database Schema Name" you specified to see if you see the expected tables.
    • Check whether a schema of that name already exists on your server. Check the box to show hidden schemas to confirm.
    • If you can directly access the LabKey database, check whether there is already an entry in the query.externalschema table. This table stores these definitions but is not surfaced within the LabKey schema browser.
    • If there is an unexpected duplicate external schema, you will not be able to add a new one. One possible cause could be a "broken view" on that schema. Learn more in this topic.
  • Check the raw schema metadata for the external schema by going to the container where it is defined and accessing the URL of this pattern (substitute your SCHEMA_NAME):
    /query-rawSchemaMetaData.view?schemaName=SCHEMA_NAME
  • Try enabling additional logging. Follow the instructions in this topic to temporarily change the logging level from "INFO" to "DEBUG", refresh the cache, and then retry the definition of the external schema to obtain debugging information.
    • Loggers of interest:
      • org.labkey.api.data.SchemaTableInfoCache - for loading of tables from an external schema
      • org.labkey.api.data.ConnectionWrapper - for JDBC metadata, including attempts to query external databases for tables
      • org.labkey.audit.event.ContainerAuditEvent - at the INFO level, will report creation of external schema

Related Topics




External MySQL Data Sources


Premium Feature — Available with all Premium Editions of LabKey Server. Learn more or contact LabKey.

This topic explains how to configure a MySQL database as an external data source. We strongly recommend using MySQL 5.6 or higher; the MySQL JDBC driver that ships with LabKey Server (Connector/J 8.0) claims compatibility "with all MySQL versions starting with MySQL 5.6", so older server versions may not work correctly.

Configure the MySQL Data Source

Add a <Resource> element to your installation's labkey.xml configuration file. Use the configuration template below as a starting point. Replace USERNAME and PASSWORD with the correct credentials. If you are running LabKey Server against a remote installation of MySQL, change the url attribute to point to the remote server.

If you will be connecting to more than one data source, be sure that all have unique name parameters, i.e. mySqlDataSource1, mySqlDataSource2, or similar. This value will be used whenever you connect an external schema to this datasource so it must be unambiguous.

    <Resource name="jdbc/mySqlDataSource" auth="Container"
type="javax.sql.DataSource"
username="USERNAME"
password="PASSWORD"
driverClassName="com.mysql.cj.jdbc.Driver"
url="jdbc:mysql://localhost:3306/?autoReconnect=true&amp;useUnicode=true&amp;characterEncoding=utf8&amp;zeroDateTimeBehavior=CONVERT_TO_NULL"
maxTotal="15"
maxIdle="7"
accessToUnderlyingConnectionAllowed="true"
validationQuery="/* ping */"
/>

Note: the "zeroDataTimeBehavior=CONVERT_TO_NULL" parameter on the url above converts MySQL's special representation of invalid dates ("00-00-0000") into null. See the MySQL documentation for other options.

Define a New Schema

Now define a new schema from the MySQL data source. For details see Set Up an External Schema.

Related Topics




External Oracle Data Sources


Premium Feature — Available with all Premium Editions of LabKey Server. Learn more or contact LabKey.

This topic explains how to configure LabKey server to retrieve and display data from an Oracle database as an external data source.

Oracle JDBC Driver

LabKey Server requires the Oracle JDBC driver to connect to Oracle databases. The driver is included with all Premium Edition distributions.

If you previously downloaded an ojdbc#.jar file, you must delete it from the CATALINA_HOME/lib directory to avoid a version conflict.

Configure the Oracle Data Source

Add a <Resource> element to your installation's labkey.xml configuration file. Use the template below as a general starting point, replacing the words in capitals with their appropriate values.

If you will be connecting to more than one data source, be sure that all have unique name parameters, i.e. oracleDataSource1, oracleDataSource2, or similar. This value will be used whenever you connect an external schema to this datasource so it must be unambiguous.

<Resource name="jdbc/oracleDataSource" 
auth="Container"
type="javax.sql.DataSource"
driverClassName="oracle.jdbc.driver.OracleDriver"
url="jdbc:oracle:thin:@SERVER:PORT:SID"
username="USERNAME"
password="PASSWORD"
maxTotal="8"
maxIdle="4"
accessToUnderlyingConnectionAllowed="true"
/>

Note: You can alternatively include the username and password directly in the connection URL (jdbc:oracle:thin:USERNAME/PASSWORD@SERVER:PORT:SID) for debugging purposes. If you keep them in their own fields, as shown above, any debug information will only have the SID name and not the actual schema name. Refer to Oracle FAQs: JDBC for other Oracle JDBC URL syntax.

Define a New Schema

Now define a new schema from the Oracle data source. For details see Set Up an External Schema

Troubleshooting

Connection Validation Query

If desired, you can add a parameter to the resource tag to perform connection validation. For Oracle, the syntax is:

validationQuery="SELECT 1 FROM dual"

The error raised in case of a syntax error in this statement might be similar to:

Caused by: java.sql.SQLSyntaxErrorException: ORA-00923: FROM keyword not found where expected

Recovery File Size

Below are steps to take if you see an error on the Oracle console (or see an error in Event Viewer -> Windows Logs -> Application) that says one of the following:
  • "ORA-19809: limit exceeded for recovery files"
  • "ORA-03113: end-of-file on communication channel"
  • "ORA-27101: shared memory realm does not exist"
Try:
  1. Update the RMAN value for "db_recovery_file_dest_size" to be a bit higher (i.e. from 16G to 20G).
  2. Restart your VM/machine.
  3. Start Oracle again.

Number (Decimal/Double) Data Truncates to Integer

If you have a column of type NUMBER and its value is being shown as an integer instead of a decimal, the source of the problem may be a bug in the Oracle JDBC driver.

Try:
  1. Set the following Java system property in the arguments used to launch the JVM:
    -Doracle.jdbc.J2EE13Compliant=true

Open Cursor Limits

Oracle's thin JDBC driver has a long-standing bug that causes it to leak cursors. This presents a problem because LabKey uses a connection pooling approach.

LabKey has worked around this problem by estimating the number of cursors that are still available for the connection and discarding it when it approaches the limit. To accomplish this, the server issues the following query to Oracle:

SELECT VALUE FROM V$PARAMETER WHERE Name = 'open_cursors'

If the query cannot be executed, typically because the account doesn't have permission, LabKey will log a warning indicating it was unable to determine the max open cursors, and assume that Oracle is configured to use its default limit, 50. If Oracle's limit is set to 50 or higher, the warning can be safely ignored.

To avoid the warning message, grant the account permission to run the query against V$PARAMETER.
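
A sketch of such a grant, run by a DBA; LABKEY_USER is a placeholder for the account used in your <Resource> definition. Note that in Oracle the grant typically targets the underlying view V_$PARAMETER, for which V$PARAMETER is a public synonym:

-- LABKEY_USER is a placeholder account name
GRANT SELECT ON V_$PARAMETER TO LABKEY_USER;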

Related Documents




External Microsoft SQL Server Data Sources


Premium Feature — Available with all Premium Editions of LabKey Server. Learn more or contact LabKey.

This topic explains how to configure LabKey Server to retrieve and display data from Microsoft SQL Server as an external data source.

Microsoft SQL Server can also be used as LabKey Server's primary data source.

For either use case, check to confirm that your version of SQL Server is compatible with your version of LabKey Server.

The Microsoft SQL Server JDBC Driver

We recommend using Microsoft's JDBC driver, which is included with LabKey Server Premium Editions as part of the BigIron module. Support for the jTDS driver is also included, but will be removed in a future release.

This topic provides configuration information for using the Microsoft JDBC driver. You can find configuration information for using the jTDS driver in the documentation archives.

Configure the MS SQL Server Data Source

Add a <Resource> element to your installation's labkey.xml configuration file. Use the template below as a general starting point, replacing the words in capitals with their appropriate values.

If you will be connecting to more than one data source, be sure that all have unique name parameters, i.e. mssqlDataSource1, mssqlDataSource2, or similar. This value will be used whenever you connect an external schema to this datasource so it must be unambiguous.

<Resource name="jdbc/mssqlDataSource" 
auth="Container"
type="javax.sql.DataSource"
username="USERNAME"
password="PASSWORD"
driverClassName="com.microsoft.sqlserver.jdbc.SQLServerDriver"
url="jdbc:sqlserver://SERVER_NAME:1433;databaseName=DATABASE_NAME;trustServerCertificate=true;applicationName=LabKey Server;"
maxTotal="20"
maxIdle="10"
maxWaitMillis="120000"
accessToUnderlyingConnectionAllowed="true"
validationQuery="SELECT 1"/>

Notes:

  • In the url parameter, "trustServerCertificate=true" is needed if SQL Server is using a self-signed cert that Java hasn't been configured to trust. The applicationName isn't required but identifies the client to SQL Server to show in connection usage and other reports.
  • If the port is not 1433 (the default), then remember to edit this part of the url parameter as well.
  • The maxWaitMillis parameter is provided to prevent server deadlocks. Waiting threads will time out when no connections are available rather than hang the server indefinitely.

Define a New Schema

Now define a new schema from the SQL Server data source. For details see Set Up an External Schema

Related Topics




External PostgreSQL Data Sources


This topic explains how to configure a PostgreSQL database as an external data source.

Configure the PostgreSQL Data Source

Add a <Resource> element to your installation's labkey.xml configuration file. Use the configuration template below as a starting point. Replace USERNAME, PASSWORD, and DBNAME with the correct values for your environment. If you are running LabKey Server against a remote installation of PostgreSQL, change the url attribute to point to the remote server.

If you will be connecting to more than one data source, be sure that all have unique name parameters, i.e. pgDataSource1, pgDataSource2, or similar. This value will be used whenever you connect an external schema to this datasource so it must be unambiguous.

   <Resource name="jdbc/pgDataSource" auth="Container"
type="javax.sql.DataSource"
username="USERNAME"
password="PASSWORD"
driverClassName="org.postgresql.Driver"
url="jdbc:postgresql://localhost:5432/DBNAME"
maxTotal="20"
maxIdle="10"
accessToUnderlyingConnectionAllowed="true"
/>

Define a New Schema

To define a new schema from the PostgreSQL data source see Set Up an External Schema.

Related Topics

PostgreSQL JDBC Driver




External SAS Data Sources


Premium Feature — Available with all Premium Editions of LabKey Server. Learn more or contact LabKey.

This topic explains how to configure a SAS/SHARE data repository as an external data source.

Configure SAS/SHARE Data Source

1. Add a line to the file named "services" (For Windows, check in c:\windows\system32\drivers\etc; for Linux and OSX, check in /etc/services) for SAS/SHARE; for example:

sasshare    5010/tcp    #SAS/SHARE server

2. Run SAS

3. Execute a script that specifies one or more libnames and starts the SAS/SHARE server. For example:

libname airline 'C:\Program Files\SAS\SAS 9.1\reporter\demodata\airline';
proc server authenticate=optional id=sasshare; run;

4. Add a new SAS DataSource element to your labkey.xml file, for example:

<Resource name="jdbc/sasDataSource" 
auth="Container"
type="javax.sql.DataSource"
driverClassName="com.sas.net.sharenet.ShareNetDriver"
url="jdbc:sharenet://localhost:5010?appname=LabKey"
maxTotal="8"
maxIdle="4"
accessToUnderlyingConnectionAllowed="true"
/>

Note that if you will be connecting to more than one external data source, be sure that each has a unique name parameter, e.g. sasDataSource1, sasDataSource2, or similar. This value is used whenever you connect an external schema to the data source, so it must be unambiguous.

5. Copy the JDBC driver jars sas.core.jar and sas.intrnet.javatools.jar to your tomcat/lib directory.

Note: We recommend using the latest version of the SAS Driver 9.2 for JDBC; this driver works against both SAS 9.1 and 9.2. At this time, the most recent 9.2 version is called 9.22-m2; the driver lists a version number of 9.2.902200.2.0.20090722190000_v920m2. Earlier versions of the 9.2 driver are not recommended because they had major problems with column data types.

6. Start LabKey Server.

7. Visit the Query Schema Browser in the folder where you wish to expose your SAS library.

8. Define a new External Schema. Choose your SAS data source, pick a library, and specify a schema name to use within this folder (this name can be different from the SAS library name).

The data sets in the library are now available as queries in this folder. You can browse them via the Schema Browser, configure them in a query web part, create custom queries using them, etc.
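
For example, if you chose the schema name "airline" in step 8 and the library contains a data set named "flights" (the data set and column names here are hypothetical), a custom query could reference it like any other table:

SELECT Flight, Origin, Dest
FROM airline.flights
WHERE Origin = 'SFO'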

For more information on configuring the SAS JDBC Driver, see Introduction to the SAS Drivers 9.2 for JDBC.

Note: This procedure will run SAS/SHARE in a completely open, unsecured manner. It is intended for development purposes only. Production installations should put appropriate authentication in place.

Related Documentation




External Redshift Data Sources


Premium Feature — Available in all Premium Editions of LabKey Server. Learn more or contact LabKey.

This topic explains how to configure an Amazon Redshift database as an external data source.

The Redshift Driver

The Redshift JDBC driver is bundled with the LabKey Redshift module.

Driver Note: If you are upgrading from any version prior to 21.3 (March 2021), you must delete any Redshift drivers and dependencies previously installed in the CATALINA_HOME/lib directory to avoid conflicts with the bundled driver. LabKey Server will fail to start if it detects an old Redshift driver in CATALINA_HOME/lib.

Tomcat Configuration

Add a <Resource> element to your installation's labkey.xml configuration file. Use the configuration template below as a starting point.

If you will be connecting to more than one data source, be sure that each has a unique name parameter, e.g. redshiftDataSource1, redshiftDataSource2, or similar, where you see DATASOURCE_NAME in the template below. This value is used whenever you connect an external schema to the data source, so it must be unambiguous. Replace USERNAME and PASSWORD with the correct credentials, and complete the url parameter with your own URL, PORT, and DATABASE_NAME values.

<Resource name="jdbc/DATASOURCE_NAME" auth="Container"
type="javax.sql.DataSource"
username="USERNAME"
password="PASSWORD"
driverClassName="com.amazon.redshift.jdbc42.Driver"
url="jdbc:redshift://$URL:$PORT/$DATABASE_NAME"
maxTotal="20"
maxIdle="10"
accessToUnderlyingConnectionAllowed="true"
validationQuery="SELECT 1"/>

Define a New Schema

Now define a new schema from the Redshift data source. For details see Set Up an External Schema.

Supported Functionality

Most queries that would work against a PostgreSQL data source will also work against a Redshift data source.

Most of the known differences are due to limitations of Redshift rather than of the LabKey SQL dialect, including:

  • GROUP_CONCAT is not supported (Redshift does not support arrays).
  • Recursive CTEs are not supported; non-recursive CTEs are supported (see the sketch below).
  • Most PostgreSQL pass-through methods that LabKey SQL supports will work, but any involving binary types, such as encode(), will not work.
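
For example, a non-recursive CTE like the following will run against a Redshift data source, while a recursive one (using WITH RECURSIVE) would fail. The schema name "rs" and table "Visits" are hypothetical stand-ins for an external schema you have defined:

WITH latestVisits AS (
    SELECT SubjectId, MAX(VisitDate) AS LastVisit
    FROM rs.Visits
    GROUP BY SubjectId
)
SELECT SubjectId, LastVisit
FROM latestVisits
WHERE LastVisit >= '2021-01-01'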

Related Topics




Linked Schemas and Tables


Linked schemas allow you to access data from another folder by linking to a schema in that folder. Linked schemas are useful for providing read access to selected data in another folder that is otherwise not allowed by that folder's overall permissions. They provide a way to apply security settings at a finer granularity than whole folders.

Usage Scenario

Linked schemas are especially useful when you want to reveal some data in a folder without granting access to the whole folder. For example, suppose you have the following data in a single folder A. Some data is intended to be private, some is intended for the general public, and some is intended for specific audiences:

  • Private data to be shown only to members of your team.
  • Client data and tailored views to be shown to individual clients.
  • Public data to be shown on a portal page.
You want to reveal each part of the data as is appropriate for each audience, but you don't want to give out access to folder A. To do this, you can create a linked schema in another folder B that exposes selected items in folder A. The linked schema may expose some or all of the tables and queries from the original schema. Furthermore, the linked tables and queries may be additionally filtered to create more refined views, tailored for specific audiences.

Security Considerations

The nature of a linked schema is to provide access to tables and queries in another folder which likely has different permissions than the current folder.

Permissions: A linked schema always grants the standard "Reader" role in the target folder to every user who can access the schema. Site Administrators who configure a linked schema must be extremely careful, since they are effectively bypassing folder and dataset security for the tables/queries that are linked. In addition, since the "Reader" role is granted only in the target folder itself, a linked schema can't be used to execute a "cross-container" query, i.e. a custom query that references other folders. If multiple containers must be accessed, configure multiple linked schemas or use an alternative mechanism such as an ETL. Linked schemas also can't be used to access data that is protected with PHI annotations.

Lookup behavior: Lookups are removed from the source tables and queries when they are exposed in the linked schema. This prevents traversing a table in a linked schema beyond what has been explicitly allowed by the linked schema creator.

URLs: URLs are also removed from the source tables. The insert, update, and delete URLs are removed because the linked schema is considered read-only. The details URL and URL properties on columns are removed because the URL would rarely work in the linked schema container. If desired, the lookups and URLs can be added back in the linked schema metadata xml. To carry over any attachment field links in the source table, first copy the metadata that enables the links in the source table and paste it into the analogous field in the linked schema table. See below for an example.

Create a Linked Schema

A user must have the "Site Administrator" role to create a linked schema.

To create a linked schema in folder B that reveals data from folder A:

  • Navigate to folder B.
  • Select (Admin) > Developer Links > Schema Browser.
  • Click Schema Administration.
  • Under Linked Schemas, click New Linked Schema and specify the schema properties:
    • Schema Name: Provide a name for the new schema.
    • Source Container: Select the source folder that holds the originating schema (folder A).
    • Schema Template: Select a named schema template in a module. (Optional.)
    • Source Schema: Select the name of the originating schema in folder A.
    • Published Tables: To link/publish all of the tables and queries, make no selection. To link/publish a subset of tables, use checkboxes in the multi-select dropdown.
    • Metadata: Provide metadata filters for additional refinement. (Optional.)

Metadata Filters

You can add metadata xml that filters the data or modifies how it is displayed on the page.

In the following example, a filter is applied to the table Location such that a record is shown only when InUse is true.

<tables xmlns="http://labkey.org/data/xml" xmlns:cv="http://labkey.org/data/xml/queryCustomView">
<filters name="public-filter">
<cv:filter column="InUse" operator="eq" value="true"/>
</filters>
<table tableName="Location" tableDbType="NOT_IN_DB">
<filters ref="public-filter"/>
</table>
</tables>

Handling Attachment Fields

Attachment fields in the source table are not automatically carried over into the target schema, but you can activate them by providing a metadata override. For example, the XML below activates an attachment field called "AttachedDoc" in the list called "SourceList", which lives in the folder "SourceFolder" (with LabKey deployed at the context path "/labkey"). The URL must contain the listId (shown here = 1), the entityId, and the name of the file.

<tables xmlns="http://labkey.org/data/xml">
<table tableName="SourceList" tableDbType="NOT_IN_DB">
<columns>
<column columnName="AttachedDoc">
<url>/labkey/SourceFolder/list-download.view?listId=1&amp;entityId=${EntityId}&amp;name=${AttachedDoc}</url>
</column>
</columns>
</table>
</tables>

The entityId is a hidden field that you can expose on the original list. To see it, open (Grid Views) > Customize Grid, check the box for "Show Hidden Fields", and scroll to locate the checkbox for adding "Entity Id" to your grid. Click View Grid to see it. To include it in the URL carried to the target schema, use the syntax "${EntityId}" as shown above.

For more information about metadata xml, see Query Metadata.

Schema Template

Default values can be saved as a "schema template". By overriding parts of the template, you can change:

  • the source schema (for example, while keeping the tables and metadata the same).
  • the metadata (for example, to set up different filters for each client).
Set up a template by placing a .template.xml file in the resources/schemas directory of a module:

myModuleA/resources/schemas/ClientA.template.xml

The example .template.xml file below provides a default linked schema and a default filter xml for Client A:

ClientA.template.xml

<templateSchema xmlns="http://labkey.org/data/xml/externalSchema"
        xmlns:dat="http://labkey.org/data/xml"
        xmlns:cv="http://labkey.org/data/xml/queryCustomView"
        sourceSchemaName="assay.General.Custom Assay">
  <tables>
    <tableName>Data</tableName>
  </tables>
  <metadata>
    <dat:tables>
      <dat:filters name="client-filter">
        <cv:filter column="Client" operator="eq" value="A Client"/>
      </dat:filters>
      <dat:table tableName="Data" tableDbType="NOT_IN_DB">
        <dat:filters ref="client-filter"/>
      </dat:table>
    </dat:tables>
  </metadata>
</templateSchema>

To use the module, you must enable it in the source folder (folder A):
  • Go to the source folder and select (Admin) > Folder > Management.
  • Select the Folder Type tab.
  • Under Modules, check the box next to your module.
  • Click Update Folder.

You can override any of the template values by clicking Override template value for the appropriate region.

For example, you can create a schema for Client B by (1) creating a new linked schema based on the template for Client A and (2) overriding the metadata xml.

Once you've clicked Override template value, the link changes to read Revert to template value, giving you the option to revert.

Troubleshooting Linked Schemas

Linked schemas are evaluated with the permissions of the user who originally created them. If that user is later deleted from the system, or their permissions change, the schema may no longer work correctly. Symptoms may include errors in queries, the disappearance of the linked schema from the schema browser (while it still appears on the schema administration page), and other issues.

There are two ways to fix this:

  • Recreate the linked schema, with the same name, while logged in as a valid user. An easy way to do this is to rename the 'broken' schema, then create a new one with the original name, using the 'broken' one as a guide.
  • Directly access the database and find the row in the query.ExternalSchema table for this specific linked schema. Change the 'CreatedBy' value from the deleted user to a valid user. Clear Caches and Refresh after making this change.
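
For the second option, the fix amounts to a single-row update. A minimal sketch follows; the user id and the WHERE clause are hypothetical, and the exact column layout of query.ExternalSchema may vary by LabKey version, so inspect the row first:

UPDATE query.ExternalSchema
SET CreatedBy = 1005   -- hypothetical userId of a valid, active user
WHERE ExternalSchemaId = 42;   -- hypothetical id identifying the broken schema's row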

Related Topics




Controlling Data Scope


Data within LabKey is ordinarily contained in, and restricted to, a specific folder. This topic summarizes options for making data more broadly available when you want to provide wider access across a site.

Queries

SQL queries have access to every container, allowing you to pull in data from anywhere on the server. For details see Queries Across Folders.
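
For example, LabKey SQL can qualify a table name with a container path. A minimal sketch, using a hypothetical folder "FolderB" and list "Reagents" (see Queries Across Folders for the full syntax):

SELECT *
FROM Project."FolderB".lists.Reagents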

Queries can also be made available in child folders: see Edit Query Properties.

Custom Grids based on queries can also be made available in child folders: see Customize Grid Views.

Folder Filters can expand the scope of a query to range over different containers: see Query Scope: Filter by Folder.

Linked Schemas

Linked schemas let you pull in data from an existing LabKey schema in a different folder. The linked schema may expose some or all of the tables and queries from the source schema. The linked tables and queries may be filtered such that only a subset of the rows and/or columns are available. Linked schemas are unique in that they can give users access to data they would not otherwise have permission to see. All the other approaches described here require the user to have permission to the container in which the data really lives. For details see Linked Schemas and Tables.

Custom Grids based on linked schema queries can also be made available in child folders: see Customize Grid Views.

Lookups and Reports

Lookup columns let you pull data from one table into another. For example, setting up a lookup to a list or dataset as the common source for shared content can simplify data entry and improve consistency. Lookups can target any container on the site. See Lookup Columns.

R Reports can be made available in child folders. For details see: Saved R Reports.

Studies

When you create a published or ancillary study, LabKey Server creates a new folder (a child folder of the original study folder) and copies snapshots of selected data into it, assembling the data in the form of a new study.

Publishing a study is primarily intended for presenting your results to a new audience. For details see Publish a Study.

Ancillary studies are primarily intended for new investigations of the original study's data. For details see Ancillary Studies.

Sharing datasets and timepoints across multiple studies is an experimental feature. See Shared Datasets and Timepoints.

Assay Data

To give an assay design project-wide scope, create the design at the project level instead of specifying the subfolder. Project-level assay designs are available in every folder and subfolder of the project.

For site-wide access, create the assay design in the Shared project.

Assay data can also be made available in child folders: see Customize Grid Views.

Assay runs and results can be linked to any study folder on the site (to which the linking user has permission). See Link Assay Data into a Study.

Sample Types

Create a Sample Type in the Shared project to have access to it site-wide, or create it in the project for access in that project's subfolders. See Samples.

Sample data can be linked to any study folder on the site (to which the linking user has permission). See Link Sample Data to Study.

Collaboration Tools

Wiki pages defined in a given container can be reused and displayed in any other container on the server. For details see: Wiki Admin Guide.

If you create an Issue Tracker definition in the Shared project, it will be available site wide. Created in the project, it will be available in all folders in the project.

Files

Folders store and access files on the file system via the file root. The default root is specific to the container, but if desired, multiple containers can be configured to use the same set of files from the file system using a custom pipeline root. See File Root Options.

Using a Premium Edition of LabKey Server you can also configure one or more file watchers to monitor such a shared location for data files of interest.

Security

Scoping access to data through security groups and permissions is another way to control availability. Setting up site or project groups for your users, and choosing when to inherit permissions in subfolders, gives you many options for managing access.

You can also expose only a selected subset of your data broadly, while protecting PHI, using row- and column-level permissions as outlined in Protecting PHI Data.

Related Topics




Ontology Integration


Premium Feature — Available in the Enterprise Edition of LabKey Server. Learn more or contact LabKey.

Ontologies help research scientists in many specialties reconcile data with common and controlled vocabularies. Aligning terms, hierarchies, and the meaning of specific data columns and values is important for consistent analysis, reporting, and cross-organization integration. An ontology system will standardize concepts, understand their hierarchies, and align synonyms which describe the same entities. You can think of an ontology like a language; when your data is all speaking the same language, greater meaning emerges.

Ontology Module

The ontology module enables reporting and using controlled vocabularies. Many such vocabularies are in active use in different research communities. Some examples:

  • NCIT (NCI Thesaurus): A reference terminology and biomedical ontology
  • LOINC (Logical Observation Identifiers Names and Codes): Codes for each test, measurement, or observation that has a clinically different meaning
  • SNOMED_CT (Systematized Nomenclature of Medicine - Clinical Terms): A standardized multilingual vocabulary of clinical terminology used for electronic exchange of health information
  • MSH (Medical Subject Headings): Used for indexing, cataloging, and searching for health related information
  • Find more examples in the National Library of Medicine metathesaurus

Generate Ontology Archive

The first step is to generate an ontology archive that can be loaded into LabKey. A set of Python scripts is provided on GitHub to help you accomplish this.

The scripts are designed to accept OWL, NCI, and GO files, but support is not universal for all files of these types. Contact your Account Manager to get started and for help with individual files.

Once generated, the archive will contain individual text files:

  • concepts.txt: (Required) The preferred terms and their codes for this ontology. The standard vocabulary.
  • hierarchy.txt: (Recommended) The hierarchy among these standard terms, expressed in levels and paths to the codes. This can be used to group related items.
  • synonyms.txt: An expected set of local or reported terms that might be used in addition to the preferred term, mapping them to the code so that they are treated as the same concept.
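
As a rough illustration only (the exact file layout is produced by the generation scripts, and the prefix, codes, and terms below are hypothetical), concepts.txt maps each code to its preferred term along these lines:

DEMO:C001	Acetaminophen
DEMO:C002	Ibuprofen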

Load and Use Ontologies

To learn about loading ontologies onto your LabKey Server and enabling their use in folders, see this topic: Load Ontologies

Once an ontology has been loaded and the module enabled in your folder, the field editor will include new options and you can use ontology information in your SQL queries.

Concept Annotations

Concept annotations let you link columns in your data to concepts in your ontology. The field editor will include an option to associate the column with a specific concept. For example, a "medication" column might map to a "NCIT:ST1000016" concept code used in a centralized invoicing system.

Learn more in this topic: Concept Annotations

Ontology Lookup Fields

The field editor also includes an Ontology Lookup data type. This field type encompasses three related fields, helping you take an input value and look up both the preferred term and the code used in the ontology.

Learn more in this topic: Ontology Lookup

Ontologies in SQL

New LabKey SQL syntax allows you to incorporate ontology hierarchy and synonyms into your queries.

Learn more in this topic: Ontology SQL

Related Topics




Load Ontologies


Premium Feature — Available in the Enterprise Edition of LabKey Server. Learn more or contact LabKey.

This topic covers how to load ontology vocabularies onto your server and enable their use in individual folders. It assumes you have obtained or generated an ontology archive in the expected format. You also must have the ontology module deployed on your server.

Load Ontologies

One or more ontology vocabularies can be loaded onto your server. Ontologies are stored in the "Shared" folder where they are accessible site-wide. You may load ontologies from any location on the server.

  • Select (Admin) > Go To Module > More Modules > Ontology.
  • You will see any ontologies already loaded.
  • Click Add LabKey Archive (.Zip) to add a new one.
  • Enter:
    • Name: (Required)
    • Abbreviation: (Required) This should be a short unique string used to identify this ontology. This value can't be changed later.
    • Description: (Optional)
  • Click Create.
  • On the next page, use Browse or Choose File to locate your archive.
    • Ontology zip archives include the files: concepts.txt, hierarchy.txt, synonyms.txt
  • Click Upload.

You will see the pipeline task status as the ontology is loaded. Depending on the size of the archive, this could take considerable time, and will continue in the background if you navigate away from this page.

Once the upload is complete, return to (Admin) > Go To Module > Ontology (you may need to click "More Modules" to find it).

  • Note that if you access the Ontology module from a folder other than /Shared, you will see "defined in folder /SHARED" instead of the manage links shown below. Click /Shared in that message to go to the manage UI described in the next section.

Manage Ontologies

In the Ontology module in the /Shared project, you will see all available ontologies listed.

  • Click Browse to see the concepts loaded for this ontology. See Concept Annotations.
  • Click Re-Import to upload an archive for the ontology in that row.
  • Click Delete to remove this ontology.
  • Click Browse Concepts to see the concepts loaded for any ontology.
  • Click Add LabKey Archive (.Zip) to add another.
Learn about browsing the concepts in your ontologies in this topic: Concept Annotations

Enable Ontologies in Folders

To use the controlled vocabularies in your data, enable the Ontology module in the folders where you want to be able to use them.

  • Navigate to the container where you want to use the ontology.
  • Select (Admin) > Folder > Management and click the Folder Type tab.
  • Check the box to enable the Ontology module.
  • Click Update Folder.

Related Topics




Concept Annotations


Premium Feature — Available in the Enterprise Edition of LabKey Server. Learn more or contact LabKey.

Once ontologies have been loaded and enabled in your folder, you can use Concept Annotations to link fields in your data with their concepts in the ontology vocabulary. A "concept picker" interface makes it easy for users to find desired annotations.

Browse Concepts

Reach the grid of ontologies available by selecting (Admin) > Go To Module > More Modules > Ontology.

  • Click Browse Concepts below the grid to see the concepts, codes, and synonyms loaded for any ontology.
  • On the next page, select the ontology to browse.
    • Note that you can shortcut this step by viewing ontologies in the "Shared" project, then clicking Browse for a specific row in the grid.
  • Type into the search bar to immediately locate terms. See details below.
  • Scroll to find terms of interest, click to expand them.
  • Details about the selected item on the left are shown to the right.
    • The Code is in a shaded box, including the ontology prefix.
    • Any Synonyms will be listed below.
  • Click Show Path or the Path Information tab to see the hierarchy of concepts that lead to the selection. See details below