Table of Contents

2025-05-28
     Ontology Integration
       Load Ontologies
       Concept Annotations
       Ontology Column Filtering
       Ontology Lookup
       Ontology SQL

Ontology Integration


Premium Feature — Available in the Enterprise Edition of LabKey Server. Learn more or contact LabKey.

Ontologies help research scientists in many specialties reconcile data with common and controlled vocabularies. Aligning terms, hierarchies, and the meaning of specific data columns and values is important for consistent analysis, reporting, and cross-organization integration. An ontology system will standardize concepts, understand their hierarchies, and align synonyms which describe the same entities. You can think of an ontology like a language; when your data is all speaking the same language, greater meaning emerges.

Ontology Module

The ontology module enables reporting and using controlled vocabularies. Many such vocabularies are in active use in different research communities. Some examples:

  • NCIT (NCI Thesaurus): A reference terminology and biomedical ontology
  • LOINC (Logical Observation Identifiers Names and Codes): Codes for each test, measurement, or observation that has a clinically different meaning
  • SNOMED_CT (Systematized Nomenclature of Medicine - Clinical Terms): A standardized multilingual vocabulary of clinical terminology used for electronic exchange of health information
  • MSH (Medical Subject Headings): Used for indexing, cataloging, and searching for health related information
  • Find more examples in the National Library of Medicine metathesaurus

Generate Ontology Archive

The first step is to generate an ontology archive that can be loaded into LabKey. A set of Python scripts is provided on GitHub to help you accomplish this.

The Python scripts are designed to accept OWL, NCI, and GO files, but support is not universal for all files of these types. Contact your Account Manager to get started and for help with individual files.

Once generated, the archive will contain individual text files:

  • concepts.txt: (Required) The preferred terms and their codes for this ontology. The standard vocabulary.
  • hierarchy.txt: (Recommended) The hierarchy among these standard terms, expressed in levels and paths to the codes. This can be used to group related items.
  • synonyms.txt: An expected set of local or reported terms that might be used in addition to the preferred term, mapping them to the code so that they are treated as the same concept.
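
Before uploading, it can be handy to confirm that an archive contains the expected files. The following is a minimal sketch in Python and is not part of the LabKey scripts: the function name check_archive is hypothetical, and it compares file names only, without validating file contents.

```python
import io
import zipfile

REQUIRED = {"concepts.txt"}
RECOMMENDED = {"hierarchy.txt"}


def check_archive(source):
    """Report which expected ontology files are present in a zip archive.

    `source` may be a path or a file-like object. Only base file names are
    compared; the contents of the files are not inspected.
    """
    with zipfile.ZipFile(source) as archive:
        names = {name.rsplit("/", 1)[-1] for name in archive.namelist()}
    return {
        "missing_required": sorted(REQUIRED - names),
        "missing_recommended": sorted(RECOMMENDED - names),
        "has_synonyms": "synonyms.txt" in names,
    }


# Usage: build a tiny in-memory archive holding only concepts.txt.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("concepts.txt", "placeholder content\n")
buffer.seek(0)
report = check_archive(buffer)
```

In this usage example the report flags hierarchy.txt as a missing recommended file, while the required concepts.txt is present.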

Load and Use Ontologies

To learn about loading ontologies onto your LabKey Server and enabling their use in folders, see this topic: Load Ontologies

Once an ontology has been loaded and the module enabled in your folder, the field editor will include new options and you can use ontology information in your SQL queries.

Concept Annotations

Concept annotations let you link columns in your data to concepts in your ontology. The field editor will include an option to associate the column with a specific concept. For example, a "medication" column might map to an "NCIT:ST1000016" concept code used in a centralized invoicing system.

Learn more in this topic: Concept Annotations

Ontology Lookup Fields

The field editor also includes an Ontology Lookup data type. This field type encompasses three related fields, helping you take an input value and look up both the preferred term and the code used in the ontology.

Learn more in this topic: Ontology Lookup

Ontologies in SQL

New LabKey SQL syntax allows you to incorporate ontology hierarchy and synonyms into your queries.

Learn more in this topic: Ontology SQL





Load Ontologies



This topic covers how to load ontology vocabularies onto your server and enable their use in individual folders. It assumes you have obtained or generated an ontology archive in the expected format. You also must have the ontology module deployed on your server.

Load Ontologies

One or more ontology vocabularies can be loaded onto your server. Ontologies are stored in the "Shared" folder where they are accessible site-wide. You may load ontologies from any location on the server.

  • Select (Admin) > Go To Module > More Modules > Ontology.
  • You will see any ontologies already loaded.
  • Click Add LabKey Archive (.Zip) to add a new one.
  • Enter:
    • Name: (Required)
    • Abbreviation: (Required) This should be a short unique string used to identify this ontology. This value can't be changed later.
    • Description: (Optional)
  • Click Create.
  • On the next page, use Browse or Choose File to locate your archive.
    • Ontology zip archives include the files: concepts.txt, hierarchy.txt, synonyms.txt
  • Click Upload.

You will see the pipeline task status as the ontology is loaded. Depending on the size of the archive, this could take considerable time, and will continue in the background if you navigate away from this page.

Once the upload is complete, return to (Admin) > Go To Module > Ontology (you may need to click "More Modules" to find it).

  • Note that if you access the Ontology module from a folder other than /Shared, you will see "defined in folder /SHARED" instead of the manage links shown below. Click /Shared in that message to go to the manage UI described in the next section.

Manage Ontologies

In the Ontology module in the /Shared project, you will see all available ontologies in the list.

  • Click Browse to see the concepts loaded for this ontology. See Concept Annotations.
  • Click Re-Import to upload an archive for the ontology in that row.
  • Click Delete to remove this ontology.
  • Click Browse Concepts to see the concepts loaded for any ontology.
  • Click Add LabKey Archive (.Zip) to add another.

Learn about browsing the concepts in your ontologies in this topic: Concept Annotations

Enable Ontologies in Folders

To use the controlled vocabularies in your data, enable the Ontology module in the folders where you want to be able to use them.

  • Navigate to the container where you want to use the ontology.
  • Select (Admin) > Folder > Management and click the Folder Type tab.
  • Check the box to enable the Ontology module.
  • Click Update Folder.





Concept Annotations



Once ontologies have been loaded and enabled in your folder, you can use Concept Annotations to link fields in your data with their concepts in the ontology vocabulary. A "concept picker" interface makes it easy for users to find desired annotations.

Browse Concepts

Reach the grid of available ontologies by selecting (Admin) > Go To Module > More Modules > Ontology.

  • Click Browse Concepts below the grid to see the concepts, codes, and synonyms loaded for any ontology.
  • On the next page, select the ontology to browse.
    • Note that you can shortcut this step by viewing ontologies in the "Shared" project, then clicking Browse for a specific row in the grid.
  • Type into the search bar to immediately locate terms. See details below.
  • Scroll to find terms of interest, click to expand them.
  • Details about the selected item on the left are shown to the right.
    • The Code is in a shaded box, including the ontology prefix.
    • Any Synonyms will be listed below.
  • Click Show Path or the Path Information tab to see the hierarchy of concepts that lead to the selection. See details below.

Search Concepts

Instead of manually scrolling and expanding the ontology hierarchy, you can type into the search box to immediately locate and jump to concepts containing that term. The search is specific to the current ontology; you will not see results from other ontologies.

As soon as you have typed a term of at least three characters, the search results will populate in a clickable dropdown. Only full word matches are included. You'll see both concepts and their codes. Click to see the details for any search result. Note that search results will disappear if you move the cursor (focus) outside the search box, but will return when you focus there again.

Search terms will not autocomplete any suggestions as you type or detect any 'stem' words. For example, searching for "foot" will not find "feet".
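
The matching behavior described above can be approximated with a short Python sketch. This is an illustration only, not LabKey's search implementation: the function name search_concepts and the sample codes and labels are hypothetical, and matching on labels only and ignoring case are assumptions of this sketch.

```python
import re


def search_concepts(concepts, term):
    """Return (code, label) pairs whose label contains `term` as a whole word.

    Nothing is returned until at least three characters are typed, and
    there is no stemming or autocomplete. Case-insensitive matching is an
    assumption of this sketch.
    """
    term = term.strip()
    if len(term) < 3:
        return []
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    return [(code, label) for code, label in concepts.items() if pattern.search(label)]


# Hypothetical concepts, for illustration only.
concepts = {
    "ONT:1": "Foot pain",
    "ONT:2": "Swollen feet",
    "ONT:3": "Football injury",
}
foot_matches = search_concepts(concepts, "foot")
```

Here "foot" matches only "Foot pain": "Football injury" is not a full word match, and "Swollen feet" would require stemming.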

Path Information

When you click Show Path you will see the hierarchy that leads to your current selection.

Click the Path Information tab for a more complete picture of the same concept, including any Alternate Paths that may exist to the selection.

Add Concept Annotation

  • Open the field editor where you want to use concept annotations. This might mean editing the design of a list or the definition of a dataset.
  • Expand the field of interest.
  • Under Name and Linking Options, click Select Concept.
  • In the popup, select the ontology to use. If only one is loaded, you will skip this step.
  • In the popup, browse the ontology to find the concept to use.
  • Click Apply.
  • You'll see the concept annotation setting in the field details.
  • Save your changes.

View Concept Annotations

In the data grid, hovering over a column header will now show the Concept Annotation set for this field.

Edit Concept Annotation

To change the concept annotation for a field, reopen the field in the field editor, click Concept Annotation, make a different selection, and click Apply.





Ontology Column Filtering



Ontology lookup columns support filtering based on concept and path within the ontology. Filtering based on whether a given concept is in an expected subtree (or not in an unexpected one) can isolate desired data using knowledge of the concept hierarchy.

Supported Filtering Expressions

Fields of type "Ontology Lookup" can be filtered using the following set of filtering expressions:

Find By Tree

To use the ontology tree browser, click the header for a column of type "Ontology Lookup" and select Filter.... You cannot use this filtering on the "import" or "label" fields related to your lookup, only the "code" value, shown here as "Medication Code".

Select the Choose Filters tab, then select a concept using the Find <Column Name> By Tree link.

The browser is similar to the concept browser. You can scroll or type into the "Search" bar to find the concept you want. Click the concept to see it within the hierarchy, with parent and first children expanded.

When you locate the concept you want, hover to reveal a filter icon. Click it to place the concept, with code, in the filter box. When using the subtree filter expressions you'll see the path to the selected concept. Shown below, we'll filter for concepts in the subtree under "Analgesic Agent".

Click Close Browser when finished. If needed, you can add another filter before saving by clicking OK.

Subtree Filter Expressions

The example above shows how a subtree filter value is displayed. Notice the slashes indicating the hierarchy path to the selected concept.

  • Is In Subtree: The filter will return values from the column that are in the subtree below the selected concept.
  • Is Not In Subtree: The filter will return values from the column that are not in the subtree below the selected concept.

These subtree-based expressions can be combined to produce compound filters, such as medications that are "analgesic agents" but not "adjuvant analgesics".
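
The logic of such a compound subtree filter can be sketched in Python. This is an illustrative model, not how the server evaluates filters: the slash-delimited path strings, the row data, and the function name is_in_subtree are all hypothetical. The text does not say whether the selected concept itself counts as part of its subtree, so this sketch assumes it does not.

```python
def is_in_subtree(concept_paths, subtree_path):
    """True if any path to the concept lies strictly below `subtree_path`.

    Paths are slash-delimited strings that end with a slash.
    """
    prefix = subtree_path.rstrip("/") + "/"
    return any(path.startswith(prefix) and path != prefix for path in concept_paths)


# Hypothetical rows: each medication carries the hierarchy path(s) of its code.
rows = [
    {"medication": "Ibuprofen", "paths": ["/PHARMA/ANALGESIC/NSAID/IBUPROFEN/"]},
    {"medication": "Gabapentin", "paths": ["/PHARMA/ANALGESIC/ADJUVANT/GABAPENTIN/"]},
    {"medication": "Amoxicillin", "paths": ["/PHARMA/ANTIBIOTIC/AMOXICILLIN/"]},
]

analgesic = "/PHARMA/ANALGESIC/"
adjuvant = "/PHARMA/ANALGESIC/ADJUVANT/"

# "Is In Subtree" analgesic AND "Is Not In Subtree" adjuvant analgesic.
selected = [
    row["medication"]
    for row in rows
    if is_in_subtree(row["paths"], analgesic) and not is_in_subtree(row["paths"], adjuvant)
]
```

With this sample data, only the Ibuprofen row passes both conditions.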

Single Concept Filter Expressions

When using the Equals or Does Not Equal filtering expressions, browse the tree as above and click the filter icon. The code will be shown in the box.

Set of Concepts Filter Expressions

The filter expressions Equals One Of and Does Not Equal Any Of support multiselection of as many ontology concepts as necessary. Click the filter icons to build up a set in the box; the appropriate separator will be inserted.





Ontology Lookup



Ontology Lookup Data Type

Once an ontology has been loaded and the module enabled in your folder, the field editor will include an Ontology Lookup data type for all column types. Any column may be defined to use the standard vocabularies available in a selected ontology source.

There can be up to three related fields in the data structure. The naming does not need to match these conventions, but it can be helpful to clarify which columns contain which elements:

  • columnName_code: The ontology standard code, a string which can either be provided or retrieved from the ontology. This column is configured as an Ontology Lookup and may have one or both of the following related columns:
  • columnName_import: (Optional) The reported, or locally used term that could be imported with the data. If no code is provided, the ontology lookup mechanism will seek a unique code value (and label) matching this term.
  • columnName_label: (Optional) The preferred or standard term retrieved from the ontology. This field is read-only and set by the ontology lookup mechanism as a user-readable supplement to the code value.

Lookup Rules

On data import (or update), the following ontology lookup actions will occur:

  • If only an "*_import" value is provided, the "*_label" and "*_code" will be populated automatically if found in the ontology as a concept or synonym.
    • Note that a code value can be provided as the "*_import" value and will be a successful lookup, provided it is included in the ontology as a synonym for itself.
  • If the provided "*_import" happens to match a concept "*_label", it will be shown in both columns: retained as the "*_import" and populated by the lookup as the "*_label".
  • If only a code value is provided directly in the "*_code" column, the label will be populated from the ontology. (No "*_import" value will be populated.)
    • Note that when the code value is provided in the "*_code" column, it must be prefixed with the ontology abbreviation. Ex: "NCI:c17113", not just "c17113".
  • If both an "*_import" and "*_code" value are provided (even if null), the "*_code" is controlling. Note that this means that if there is a "*_code" column in your data, it must contain a valid code value, otherwise the "null" value of this column will "overrule" the lookup of any "*_import" value provided. This means:
    • For data insert or bulk update, you should include either the "*_code" column or the "*_import" column, but not both, unless the "*_code" is always populated.
    • For data update in the grid, providing an "*_import" value will not look up the corresponding code; the "*_code" column is present (and null during the update), so it will override the lookup.
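
The precedence among these rules can be modeled with a small Python sketch. It is a simplified illustration under stated assumptions, not the server's implementation: the function name resolve_lookup, the dictionaries by_code and by_term, and the code "ONT:900" are hypothetical, and terms are matched by exact string comparison.

```python
def resolve_lookup(row, by_code, by_term):
    """Apply the lookup rules to one incoming row (a dict).

    by_code maps prefixed codes to preferred labels.
    by_term maps concept labels and synonyms to codes.
    Returns a dict with "code" and "label"; either may be None.
    """
    if "code" in row:
        # The code column is controlling whenever it is present, even if null.
        code = row["code"]
    else:
        # Only an import value: look it up as a concept label or synonym.
        term = row.get("import")
        code = by_term.get(term) if term else None
    label = by_code.get(code) if code else None
    return {"code": code, "label": label}


# Hypothetical ontology content, for illustration only.
by_code = {"ONT:900": "Viral infection"}
by_term = {
    "Viral infection": "ONT:900",  # preferred label
    "virus infection": "ONT:900",  # synonym
    "ONT:900": "ONT:900",          # code listed as a synonym for itself
}

from_import = resolve_lookup({"import": "virus infection"}, by_code, by_term)
from_code = resolve_lookup({"code": "ONT:900"}, by_code, by_term)
null_code = resolve_lookup({"import": "virus infection", "code": None}, by_code, by_term)
```

The last call shows the pitfall described above: because a null code column is present, the import value is never looked up and both code and label stay empty.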

Usage Example

For example, if you had a dataset containing a disease diagnosis, and wanted to map to the preferred label and code values from a loaded ontology, you could follow these steps:

  • Create three fields for your "columnName", in this case "Disease", with the field suffixes and types as follows:
Field Name     | Data Type       | Description
Disease_import | Text            | The term imported for this disease, which might be a synonym or could be the preferred term.
Disease_label  | Text            | The standard term retrieved from the ontology.
Disease_code   | Ontology Lookup | Displays the standard code from the ontology and provides the interconnection.

  • Expand the "columnName_code" field.
  • Under Ontology Lookup Options:
    • Choose an Ontology: The loaded ontologies are all available here. Select which to use for this field.
    • Choose an Import Field: Select the "columnName_import" field you defined, here "Disease_import"
    • Choose a Label Field: Select the "columnName_label" field you defined.
  • Save.

When data is imported, it can include either the "*_import" field or the "*_code" field, but not both. When using the "*_import" field, if different terms for the same disease are imported, the preferred term and code fields will standardize them for you. Control which columns are visible using the grid customizer.

Shown here, a variety of methods for entering a COVID-19 diagnosis were provided, including the preferred term for patient PT-105 and the code number itself for PT-104. All rows can be easily grouped and aggregated using the label and code columns. The reported "*_import" value is retained for reference.

Concept Picker for Insert/Update with Type-ahead

When an Ontology Lookup field, i.e. the "*_code" field, is included in the insert or update form, you will see a "Search [Ontology Name]" placeholder. Type ahead (at least three characters, and full words for narrower lists) to quickly browse for codes that match a desired term. You'll see both the terms and codes in a dropdown menu. Click to select the desired concept from the list.

You'll see the code in the entry box and a tooltip with the preferred label on the right.

Alternately, you can use the Find [column name] By Tree link to browse the full Ontology to find a desired code. Shown below, "Medication Code" is the Ontology Lookup, so the link to click reads Find Medication Code By Tree.

Use the search bar or scroll to find the concept to insert. As you browse the ontology concepts you can see the paths and synonyms for a selected concept to help you make the correct selection. If the field has been initialized to an expected concept, the browser will auto scroll to it.

Click Apply to make your selection. You'll see the code in the entry box and a tooltip with the preferred label on the right as in the typeahead case above.

Initialize Expected Vocabulary

For the Ontology Lookup field, you choose an Ontology to reference, plus an optional Import Field and Label Field. In addition, you can initialize the lookup field with an Expected Vocabulary, making it easier for users to enter the expected value(s).

  • Open the field editor, then expand the Ontology Lookup field.
  • Select the Ontology to use (if it is not already selected).
  • Click Expected Vocabulary to open the concept picker.
  • Search or scroll to find the concept of interest. Shown here "Facial Nerve Palsy" or "NCIT:C26769".
  • Click Apply.
  • The selected concept is now shown in the expanded field details.
  • Save these changes.
  • Now when you use the concept picker for insert/update, it will select your concept of choice by default.
  • Note that this does not restrict the ultimate selection made by the user, it just starts their browsing in the section of the ontology you chose.





Ontology SQL



Once ontologies are loaded and enabled, you can also make direct use of them in SQL queries. The syntax described in this topic helps you access the preferred vocabularies and wealth of meaning contained in your ontologies.

Ontology Usage in SQL

The following functions and annotations are available:

  • Functions:
    • IsSubClassOf(conceptX, conceptParent): Is X a subclass of Parent?
    • IsInSubtree(conceptX, ConceptPath(..)): Is X in the subtree specified?
    • ConceptPath([conceptHierarchy], conceptParent): Select this unique hierarchy path so that I can find X in it using "IsInSubtree".
    • table.findColumn([columnProperties]): Find a desired column by concept, conceptURI, name, propertyURI, obsolete name.
  • Annotation:
    • @concept=[CODE]: Used to reference ontology concepts in column metadata.

IsSubClassOf

Usage:

IsSubClassOf(conceptX, conceptParent)
  • Returns true if conceptX is a direct or indirect subclass of conceptParent.

IsInSubtree

Usage:

IsInSubtree(conceptX, ConceptPath(conceptA,conceptB,conceptParent))
  • Returns true if conceptX is contained in a subtree rooted at the unique path that contains .../conceptA/conceptB/conceptParent/.
  • If there is no such unique path the method returns false.
  • If there is such a unique path, but conceptX is not in this tree the method returns false.

ConceptPath

Usage:

ConceptPath(conceptA,conceptB,...,conceptParent)
  • This method takes one or more arguments, and returns the unique hierarchy path that contains the provided concepts in sequence (no gaps). The return value comes from the ontology.hierarchy.path column.
  • This method will return null if no such path is found, or if more than one path is found.
  • Note that the hierarchy paths may not be user readable at all, and may be more likely to change than the concept codes, which are very stable. So this usage is preferable to trying to use hierarchy paths directly.

Performance note: It is faster to ask if conceptX belongs to a subtree ending in conceptParent than it is to ask whether conceptX is a subclass of conceptParent.

For performance, we store all possible paths in the "subclass" hierarchy to create a pure tree, rather than a graph of subclass relations. This makes it much easier to answer questions like "select all rows containing a 'cancer finding'". This schema means that internally we are really querying the relationship between paths, not directly querying concepts. Therefore IsSubClassOf() is more complicated to answer than IsInSubtree().

@concept

The @concept annotation can be used to override the metadata of the column with a concept annotation.

Usage:

SELECT 'Something' as "Finding" @concept=C3367 FROM T WHERE ...

table.findColumn

To find a column in a given table by concept, conceptURI, name, propertyURI, or obsolete name, use findColumn on your table:

table.findColumn([columnProperties])

For example, if you've annotated a column with the concept coded "ONT:123", use the following to return the column with that concept:

SELECT
MyTable.findColumn(@concept='ONT:123')
FROM Project.MyStudy.study.MyTable

Examples

In these examples, we use a fictional ontology nicknamed "ONT". A value like "ONT:123" might be one of the codes in the ontology meaning "Pharma", for example. All SQL values are just string literals of the concept code; the readable name included in a comment in these examples is only for clarity.

Here, "ONT:123" (Pharma) appears in the hierarchy tree once as a root term; "ONT:382" (Ibuprofen, for example) appears twice in the hierarchy 'below' Pharma, and has a further child: "ONT:350" (Oral form ibuprofen). Other codes are omitted for readability:

ONT:123 (Pharma) / biologic product / Analgesic / Anti-inflammatory preparations / Non-steroidal anti-inflammatory agent / ONT:382 (Ibuprofen) / ONT:350 (Oral form ibuprofen)

ONT:123 (Pharma) / biologic product / Analgesic / Non-opioid analgesics / Non-steroidal anti-inflammatory agent / ONT:382 (Ibuprofen) / ONT:350 (Oral form ibuprofen)

The two expressions below are not semantically the same, but return the same result. The second version is preferred because it ensures that there is only one path being evaluated and might be faster.

IsSubClassOf('ONT:382' /* Ibuprofen */, 'ONT:123' /* Pharma */)

IsInSubtree('ONT:382' /* Ibuprofen */, ConceptPath('ONT:123' /* Pharma */))

The next two expressions do not return the same result. The first works as expected:

IsSubClassOf('ONT:350' /* Oral form ibuprofen */, 'ONT:382' /* Ibuprofen */)

IsInSubtree('ONT:350' /* Oral form ibuprofen */, ConceptPath('ONT:382' /* Ibuprofen */))

Since there is not a unique concept path containing 'ONT:382' (Ibuprofen), the value of ConceptPath('ONT:382' /* Ibuprofen */) is NULL in the second row. Instead, the following expression would work as expected, clarifying which "branch" of the path to use:

IsInSubtree('ONT:350' /* Oral form ibuprofen */, ConceptPath('ONT:322' /* Non-opioid analgesics */, 'ONT:164' /* Non-steroidal anti-inflammatory agent */, 'ONT:382' /* Ibuprofen */))
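
To see why these expressions behave differently, the semantics can be modeled in a few lines of Python. This is a toy model of the behavior described in this topic, not the server's implementation. The codes ONT:201, ONT:202, and ONT:203 are made up here for the concepts whose codes are omitted above, and the function names are hypothetical stand-ins for the SQL functions. Whether a concept counts as being in its own subtree is not stated in the text; this sketch assumes it does not.

```python
# Two full paths from the example. ONT:201 (biologic product), ONT:202
# (Analgesic), and ONT:203 (Anti-inflammatory preparations) are made-up codes.
FULL_PATHS = [
    ("ONT:123", "ONT:201", "ONT:202", "ONT:203", "ONT:164", "ONT:382", "ONT:350"),
    ("ONT:123", "ONT:201", "ONT:202", "ONT:322", "ONT:164", "ONT:382", "ONT:350"),
]

# Every distinct prefix is one hierarchy row: one row per place a concept
# appears in the tree.
HIERARCHY = sorted({path[:i] for path in FULL_PATHS for i in range(1, len(path) + 1)})


def concept_path(*codes):
    """Return the unique hierarchy path ending in `codes`, else None."""
    matches = [row for row in HIERARCHY if row[-len(codes):] == codes]
    return matches[0] if len(matches) == 1 else None


def is_in_subtree(concept, path):
    """True if `concept` appears below `path`; False if `path` is None."""
    if path is None:
        return False
    return any(
        row[-1] == concept and len(row) > len(path) and row[:len(path)] == path
        for row in HIERARCHY
    )


def is_subclass_of(concept, parent):
    """True if `parent` is an ancestor of `concept` on any path."""
    return any(row[-1] == concept and parent in row[:-1] for row in HIERARCHY)


ambiguous = concept_path("ONT:382")  # Ibuprofen appears on two paths
qualified = concept_path("ONT:322", "ONT:164", "ONT:382")
```

In this model, concept_path("ONT:382") yields None because Ibuprofen sits on two paths, so the subtree test against it fails, while the qualified three-code path is unique and the test succeeds.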
