This topic shows how to create, edit, and delete ETL XML definitions directly in the LabKey Server user interface, which obviates the need to deploy an ETL inside a custom module. You can also use existing ETLs as templates, streamlining creation steps.
Enable the Data Integration Module
- In the folder where you want to create an ETL, go to (Admin) > Folder > Management.
- On the Folder Management page, click the Folder Type tab.
- Under Modules, place a checkmark next to Data Integration.
- Click Update Folder to save your changes.
- Go back to the Folder Management page and notice that the ETLs tab has been added.
Create a New ETL Definition
- Go to (Admin) > Folder > Management.
- On the Folder Management page, click the ETLs tab.
- On the Custom ETL Definitions panel, click the (Insert new row) button.
- You will be provided with template XML for a new ETL definition.
- Edit the provided XML to fit your use case:
  - Provide a name.
  - Provide a description.
  - Uncomment the transform element.
  - Replace the default values in the source and destination elements.
- An example ETL that copies data from some external table to a list:
<etl xmlns="http://labkey.org/etl/xml">
  <name>Populate List</name>
  <description>Updates the List data from the external data source.</description>
  <transforms>
    <transform id="step1" type="org.labkey.di.pipeline.TransformTask">
      <description>Copy data to the List</description>
      <source schemaName="external" queryName="SourceTable" />
      <destination schemaName="lists" queryName="MyList" />
    </transform>
  </transforms>
  <incrementalFilter className="ModifiedSinceFilterStrategy" timestampColumnName="modified"/>
  <schedule>
    <poll interval="1h" />
  </schedule>
</etl>
Autocomplete
While using the editor, autocomplete using CodeMirror makes it easier to enter XML syntax correctly and remember valid parameter names.
Type a '<' to see XML syntax options for that point in the code.
Type a space to see the list of valid parameters.
Note that the autocomplete menu for the <destination> element offers "targetOptionType" and "marge", which are not valid parameters. Use "targetOption" and "merge" instead.
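As an illustration of the valid spelling, a destination element that uses the merge option might look like the following minimal sketch. The schemaName and queryName values are simply reused from the example ETL above and are not requirements; whether merge is appropriate for a given target depends on your data.

```xml
<!-- Sketch only: schemaName and queryName reuse the sample values from the example ETL above. -->
<!-- Note the attribute is "targetOption" (not "targetOptionType") and the value is "merge" (not "marge"). -->
<destination schemaName="lists" queryName="MyList" targetOption="merge" />
```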
Change ETL Names/Save As
When you edit an existing ETL, change the name field, and then click Save, the name is first checked against all existing ETL names in the folder. If it is not unique, you will see a popup warning: "This definition name is already in use in the current folder. Please specify a different name."
Once you click Save with a unique new name, you will be asked whether you want to update the existing definition or save as a new definition.
If you click Update Existing, there will be only a single ETL definition after the save, renamed and including all changes made.
If you click Save as New, there will be two ETL definitions after the save: the original, with its content as of the previous save point, and the new one, with the new name and the most recent changes.
Use an Existing ETL as a Template
To use an existing ETL as a template for creating a new one, click the Copy From Existing button in the ETL Definition editor.
Choose the location (project or folder) to populate the dropdown for Select ETL Definition. Choose a definition, then click Apply.
The XML definition you chose will be shown in the ETL Definition editor, where you can make further changes before saving. The name of each ETL definition must be unique in the folder, so the name copied from the template must always be changed. This name change does not prompt the option to update the existing template. An ETL defined using a template always saves as a new ETL definition.
Note that the ETL used as a template is not linked to the new one you have created. Using a template copies the XML as it exists at the time of use. If edits are made later to the "template," they will not be reflected in the ETLs that were created from it.
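For instance, if the "Populate List" definition shown earlier were chosen as the template, the copied XML would need at least a new name before it could be saved. The sketch below assumes that scenario; the name "Populate List Copy" is a hypothetical value chosen only for illustration, and the rest of the definition is unchanged from the earlier example.

```xml
<etl xmlns="http://labkey.org/etl/xml">
  <!-- Hypothetical new name: must differ from every other ETL name in the folder. -->
  <name>Populate List Copy</name>
  <description>Updates the List data from the external data source.</description>
  <transforms>
    <transform id="step1" type="org.labkey.di.pipeline.TransformTask">
      <description>Copy data to the List</description>
      <!-- Source and destination are carried over from the template; edit them as needed. -->
      <source schemaName="external" queryName="SourceTable" />
      <destination schemaName="lists" queryName="MyList" />
    </transform>
  </transforms>
  <incrementalFilter className="ModifiedSinceFilterStrategy" timestampColumnName="modified"/>
  <schedule>
    <poll interval="1h" />
  </schedule>
</etl>
```

Because the copy is independent of its source, later edits to either definition leave the other untouched.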
Run the ETL