This topic shows how to create, edit, and delete ETL XML definitions directly in the LabKey Server user interface. You can use existing ETLs as templates, streamlining creation steps. ETLs defined in the user interface are available in the container where they are defined. If you want an ETL to be available more widely, you will either need to copy it or deploy your ETL inside a custom module.
Prerequisites: Before you can use the interface shown here, you must enable the Data Integration Module in the folder where you want to define an ETL. If you will be extracting data from (or loading data to) an external datasource, set up the necessary schemas following this topic: External Schemas and Data Sources.
Create a New ETL Definition
- Go to (Admin) > Folder > Management.
- On the Folder Management page, click the ETLs tab.
- On the Custom ETL Definitions panel, click the (Insert new row) button.
- You will be provided with template XML for a new basic ETL definition. Learn more about this starter template below.
- Edit the provided XML template to fit your needs. Review the basic XML syntax in the next section.
Basic Transform Syntax
The default XML provided when you create a new ETL contains all the basic elements of an ETL. Edit these sections to suit your needs:
- Provide a name in place of "Add name". This will be shown to the user.
- Provide a description in place of "Add description".
- The transforms element is the heart of the ETL and describes the work that will be done. ETLs may include multiple transforms, but at least one is required.
  - Uncomment and customize the transform element.
    - The id gives the transform a unique name within this ETL.
    - The type must be one of the transform types defined in the dataintegration module.
    - A description of this transform step can help readers understand its actions.
    - The source element identifies the schemaName and queryName from which data will be pulled. Learn about referencing outside sources here: External Schemas and Data Sources.
    - The destination element identifies the schemaName and queryName to which data will be pushed. Learn about referencing outside destinations here: External Schemas and Data Sources. Learn more about options for the destination here: ETL: Target Options.
- The incrementalFilter element is where you can filter which rows the transforms are applied to.
- The schedule element indicates the interval between runs if scheduled running is enabled for this ETL.
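As a rough sketch of how the elements described above fit together, a minimal definition might look like the following. The schema and query names ("etltest", "source", "target"), the step id, the filter class, and the one-hour interval are placeholder assumptions for illustration only. The starter template shown in your own editor is the authoritative starting point, so verify the namespace and transform type against it before saving.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Illustrative sketch only: schema, query, and id values are hypothetical. -->
<etl xmlns="http://labkey.org/etl/xml">
    <!-- Shown to the user in the ETL list -->
    <name>Copy source rows to target</name>
    <description>Copies rows from a source query to a target query.</description>
    <transforms>
        <!-- At least one transform is required; the id must be unique within this ETL -->
        <transform id="step1" type="org.labkey.di.pipeline.TransformTask">
            <description>Copy to target</description>
            <!-- Where data is pulled from -->
            <source schemaName="etltest" queryName="source" />
            <!-- Where data is pushed to -->
            <destination schemaName="etltest" queryName="target" />
        </transform>
    </transforms>
    <!-- Assumed filter choice: process all source rows on every run -->
    <incrementalFilter className="SelectAllFilter" />
    <!-- Interval between runs, used only if scheduled running is enabled -->
    <schedule>
        <poll interval="1h" />
    </schedule>
</etl>
```

To include more than one step, add further transform elements inside transforms, each with its own id.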
While using the editor, autocomplete using CodeMirror makes it easier to enter XML syntax correctly and remember valid parameter names.
Type a '<' to see XML syntax options for that point in the code.
Type a space to see the list of valid parameters or options.
Once you've customized the XML in the panel, click Save.
Change ETL Names/Save As
When you edit an existing ETL, change the name field, and then click Save, the name is first checked against all existing ETL names in the folder. If it is not unique, you will see a popup warning: "This definition name is already in use in the current folder. Please specify a different name."
Once you click Save with a unique new name, you will be asked if you want to update the existing definition or save as a new definition.
If you click Update Existing, only a single ETL definition will exist after the save, and it will include all changes made.
If you click Save as New, there will be two ETL definitions after the save: the original, with its content as of the previous save point, and the new one, with the new name and the most recent changes.
Use an Existing ETL as a Template
To use an existing ETL as a template for creating a new one, click the Copy From Existing button in the ETL Definition editor.
Choose the location (project or folder) to populate the dropdown for Select ETL Definition. Choose a definition, then click Apply.
The XML definition you chose will be shown in the ETL Definition editor, where you can make further changes before saving. ETL definition names must be unique in the folder, so the name copied from the template must always be changed. This name change does not prompt the option to update the existing template: an ETL defined using a template always saves as a new ETL definition.
Note that the ETL used as a template is not linked to the new one you have created. Using a template copies the XML as it exists at the time of template use. If edits are made later to the "template," they will not be reflected in the ETLs that used it.
Run the ETL