In LabKey Server, ETLs get data from some query and copy it to another query. In this step will create an ETL that queries a source table, retrieves records, and copies the records to a target query.

Create a Data Source Query

Suppose you want to assemble a database focused specifically on patients with a certain condition or diagnosis. In this step, we will add a new ETL that extracts selected patients from a source table (the Demographics table) and loads them into a separate target table (the Patients table). First we'll add a SQL query that returns selected records, then we will configure the ETL to load these records into the target table.

First we add a SQL query the returns a selection of records from the Demographics table, in particular all Female participants who are receiving anti-retroviral treatment (ARV):

  • Go to Admin > Developer Links > Schema Browser.
  • Select the study schema (in the left hand panel). Then select the Demographics table.
  • Click Create New Query.
    • In the field What do you want to call the new query, enter "Females-ARV".
    • In the field Which query/table do you want this new query to be based on?, confirm that "Demographics" is selected.
    • Click Create and Edit Source.
  • You are taken to the SQL query editor, and shown a basic SQL query on the Demographics table.
  • Add this final line to the SQL query:
WHERE Demographics.Gender = 'f' AND Demographics.TreatmentGroup = 'ARV'
  • Confirm that the SQL query looks like the following:
SELECT Demographics.ParticipantId,
Demographics.StartDate,
Demographics.Gender,
Demographics.PrimaryLanguage,
Demographics.Country,
Demographics.Cohort,
Demographics.TreatmentGroup
FROM Demographics
WHERE Demographics.Gender = 'f' AND Demographics.TreatmentGroup = 'ARV'
  • Click Save and Finish.

Create New Module

ETLs are created by adding a new "module" to LabKey Server. Modules are packets of functionality that you add to the server, functionality which is then surfaced in the web-based user interface in some way. Modules can add a wide variety of new functionality to the server, ranging from simple functionality like static web pages, to complex functionality, like new assay types, scripts, or even Java code. The module you create in this step is very simple, and incomplete: it is intended to get you started developing modules for the server, but it does not provide the full story behind developing new modules. For the full story behind developing modules see Develop Modules.

Note that there are different instructions for different LabKey Server environments:

  • "Production" environments: these are server instances created using either the graphical installer, or manually installed by an operations manager. Production servers are typically modified by administrators via the web-based user interface to build new applications. Once these applications have been assembled, they are ready for use by the general public or by individual institutions.
  • "Development" environments: these are server instances that were built directly from the LabKey Server source, using an integrated development environment, such as IntelliJ. Development environments are used by developers to extend the server's functionality in some way, by modifying the source code, code which is then compiled and tested before it is rolled out as a production server.
If you don't know which environment you are working with, you are probably working with a Production instance of LabKey Server.

Instructions for Production Instances of LabKey Server

  • If you are working with Production instance of LabKey Server, follow these instruction below. We do not recommend that you complete these steps on a public-facing server. Instead use a "test" server, or a server confined to a single local machine.
  • Using your operating system's file explorer, go to <LABKEY_HOME>/externalModules/, where <LABKEY_HOME> is your installation directory for LabKey Server. A typical value for <LABKEY_HOME> on Windows is C:/Program Files/LabKey Server/.
  • In the <LABKEY_HOME>/externalModules directory, create a new directory called "mymodule". Then, inside the mymodule directory, create another directory called "etls". The resulting directory structure should be:
<LABKEY_HOME>
externalModules
mymodule
etls

Instructions for Development Instances of LabKey Server

  • Using your operating system's file explorer, go to server/externalModules/
  • In the server/externalModules/ directory, create a new directory called "mymodule". Then, inside the mymodule directory, create another directory called "etls". The resulting directory structure should be:
server
externalModules
mymodule
etls
  • Run "ant build" to refresh the available modules on the server.

Add an ETL Configuration File

ETLs are defined by XML configuration files that specify the data source, the data target, and other properties.

  • In the etls directory, create new XML file called females-ARV.xml.
  • Copy the following into females-ARV.xml, and save.
<?xml version="1.0" encoding="UTF-8"?>
<etl xmlns="http://labkey.org/etl/xml">
<name>Demographics >>> Patients (Females)</name>
<description>Update data for study on female patients.</description>
<transforms>
<transform id="females">
<source schemaName="study" queryName="Females-ARV"/>
<destination schemaName="study" queryName="Patients" targetOption="merge"/>
</transform>
</transforms>
<schedule>
<poll interval="1h"/>
</schedule>
</etl>
  • Notice that this configuration file has our query (Females-ARV) as its source, and the Patients query as its target.
  • Refresh the browser window. Notice that the system will add your new module to the server. Click Next if necessary to complete the update.
  • Finally, enable the new module in your folder:
    • Go to Admin > Folder > Management and click the Folder Type tab.
    • Under Modules, place a checkmark next to mymodule. (If you don't see 'mymodule' listed, refresh the cache. Go to Admin > Site > Admin Console. Click Caches and then click Clear Caches and Refresh. Return to the Folder Management page and select mymodule.)
    • Click Update Folder.

Test the New ETL

  • Click the ETL tab.
  • When the tab reloads, our new ETL will appear in the Data Transforms pane: Demographics >>> Patients (Females).
  • Click Run Now to transfer the data to the Patients table.
  • You will be taken to the ETL Job page, which provides updates on the status of the running job.
  • Refresh your browser until you see the Status field shows the value COMPLETE
  • Click the ETLs tab to see the records that have been added to the Patients table. Notice that 36 records have been copied into the Patients query.
  • If you wish to run the ETL again, do either of the following:
    • Delete the records from the Patients table, and click Reset State button in the Data Transforms pane.
    • Edit the data in the source query Demographics. When you run the ETL again, it will merge your changes into the Patients query.


previousnext
 
expand allcollapse all