Premium Feature — Available with all Premium Editions of LabKey Server. Learn more or contact LabKey.

This topic describes how to use existing ETL (Extract Transform Load) processes from within the user interface.

ETL User Interface

The Data Transforms web part lists all of the ETL processes that are available in the current folder.

  • You can also access this interface via (Admin) > Go To Module > Data Integration.
  • If you don't know how to add a web part, learn how in this topic: Add Web Parts.
Columns:
  • Name - This column displays the name of the process.
  • Source Module - This column shows the module where the configuration file resides.
  • Schedule - This column shows you the reload schedule.
  • Enabled - This checkbox controls whether the automated schedule is enabled: when unchecked, the ETL process must be run manually.
  • Last Status, Successful Run, Checked - These columns record the latest run of the ETL process.
  • Run: Click Run Now to run this ETL.
  • Set Range (available only in devMode): This column is intended for testing purposes during ETL module development.
    • The Run button in this column is displayed only for ETL processes with a filter strategy of RunFilterStrategy or ModifiedSinceFilterStrategy; it is not displayed for the filter strategy SelectAllFilterStrategy.
    • Click Run to set a date or row version window range to use for incremental ETL filters, overriding any persisted or initial values.
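  • Reset: Use the options on the Reset State button to return the ETL process to its original state, deleting its internal history of which records are, and are not, up to date. There are two options:
    • Reset: Returns the ETL to its initial state, as if it had never been run.
    • Truncate and Reset: Rolls back the run, deleting the rows added to the target by the previous run.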
  • Last Transform Run Log Error - Shows the latest error logged, if any exists.
At the bottom of the ETL listing, the View Processed Jobs button shows you a log of all previously run ETL jobs, and their status.
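
The incremental filtering that the Set Range column overrides can be illustrated with a small Python sketch. This is a toy model under stated assumptions, not LabKey's implementation: the function `select_rows`, its parameters `last_run` and `window`, the `modified` field, and the inclusive window bounds are all hypothetical choices made for this example.

```python
from datetime import datetime


def select_rows(source, last_run=None, window=None):
    """Pick the source rows for one incremental run (hypothetical model).

    last_run: persisted timestamp from the previous run, or None on a first run.
    window:   optional (start, end) pair. When given, it overrides last_run,
              which is the role this sketch assigns to Set Range.
              Inclusive bounds are an assumption of this sketch.
    """
    if window is not None:
        start, end = window
        return [r for r in source if start <= r["modified"] <= end]
    if last_run is None:
        return list(source)  # first run: nothing has been processed yet
    return [r for r in source if r["modified"] > last_run]


source = [
    {"id": 1, "modified": datetime(2024, 1, 1)},
    {"id": 2, "modified": datetime(2024, 2, 1)},
    {"id": 3, "modified": datetime(2024, 3, 1)},
]

# First run: every row is selected.
first = select_rows(source)

# Later run: only rows modified after the persisted timestamp.
incremental = select_rows(source, last_run=datetime(2024, 1, 15))

# Range override: the window wins even though last_run says all rows are done.
ranged = select_rows(
    source,
    last_run=datetime(2024, 3, 1),
    window=(datetime(2024, 1, 1), datetime(2024, 2, 1)),
)
```

In this model a select-all strategy would simply return every row on every run, so there is no persisted value for a range to override, which is consistent with the Run button being hidden for SelectAllFilterStrategy.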

Run an ETL Process Manually

From the Data Transforms web part, you can:

  • Run jobs manually, even if a schedule is also configured: Click Run Now.
  • Reset state: Select Reset State > Reset to return an ETL transform to its initial state, as if it had never been run.

Enable/Disable a Scheduled ETL

In the Data Transforms web part, you can see in the Schedule column whether an ETL is configured to run on a schedule, and if so how frequently.

Use the checkbox in the Enabled column to control whether that ETL will be run on that schedule or not. When the box is unchecked, the ETL will not be run on a schedule, but may still be run manually if needed.

Learn more about scheduling ETLs in this topic: ETL: Schedules.

Cancel and Roll Back Jobs

While a job is running, you can cancel it and roll back the changes made by the current step by clicking the Cancel button.

The Cancel button is available on the Job Status panel for a particular job.

To roll back a run and delete the rows added to the target by the previous run, view the Data Transforms web part, then select Reset State > Truncate and Reset. Note that rolling back an ETL that outputs to a file has no effect: the file is not deleted or changed.
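
The difference between Reset and Truncate and Reset can be sketched as a small Python model. It is illustrative only and makes several assumptions: the class `EtlState`, its `target` and `last_run` attributes, and the integer `version` field are hypothetical names, and the target is modeled as a simple in-memory list rather than a database table.

```python
class EtlState:
    """Toy model of an ETL's target rows and persisted filter state (hypothetical)."""

    def __init__(self):
        self.target = []      # rows written to the destination
        self.last_run = None  # marker of what has already been processed

    def run(self, source):
        """Copy rows newer than the persisted marker; return how many were copied."""
        new = [r for r in source
               if self.last_run is None or r["version"] > self.last_run]
        self.target.extend(new)
        if new:
            self.last_run = max(r["version"] for r in new)
        return len(new)

    def reset(self):
        """Forget the processing history; leave the target rows in place."""
        self.last_run = None

    def truncate_and_reset(self):
        """Delete the target rows, then forget the processing history."""
        self.target.clear()
        self.reset()


source = [{"id": 1, "version": 1}, {"id": 2, "version": 2}]

etl = EtlState()
copied_first = etl.run(source)    # both rows are new
copied_second = etl.run(source)   # nothing is new on the second run

etl.reset()
rows_after_reset = len(etl.target)  # target rows survive a plain reset

etl.truncate_and_reset()
rows_after_truncate = len(etl.target)  # target is emptied
```

In this sketch, both options clear the record of what is up to date, so the next run starts from scratch; only Truncate and Reset also removes rows from the target.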

See Run History

The Data Transform Jobs web part provides a detailed history of all executed ETL runs, including the job name, the date and time it was executed, the number of records processed, the execution time, and links to the log files.

To add this web part to your page, select (Admin) > Page Admin Mode, then scroll down to the bottom of the page, select Data Transform Jobs from the <Select Web Part> dropdown, and click Add. When added to the page, the web part appears with a different title: "Processed Data Transforms". Click Exit Admin Mode.

Click Run Details for fine-grained details about each run, including a graphical representation of the run.

History of ETL Jobs

To view a history of all ETL jobs ever run across the whole site:

  • Select (Admin) > Site > Admin Console.
  • Under Management, click ETL- All Job Histories.
The history includes the name of the job, the folder it was run in, the date and time it was run, and other information. Links to detailed views of each job and run are provided.
