ETL: Stored Procedures: /Documentation/Archive/22.3

ETL: Stored Procedures

Premium Feature — Available with all Premium Editions of LabKey Server. Learn more or contact LabKey.

The ETL itself runs on your LabKey Server. It can also call Stored Procedures; scripts that will run on either the source or target database. These scripts live outside the ETL framework, but are referenced and sequenced within it, generally as part of a multistep ETL. Using stored procedures may be of interest if you are already using a legacy set of business rules or other SQL scripts. It also allows you to write in the native SQL for your data source, if that is of interest.

This topic describes how to use stored procedures in your etl. Companion topics cover special behavior for specific database types.

Stored Procedures as Source Queries
Use Transaction for Procedure
Pass Parameters to Stored Procedures
Behavior/Examples for Different Database Implementations

Stored Procedures as Source Queries

Instead of extracting data directly from a source query and loading it into a target query, an ETL process can call one or more stored procedures that themselves move data from the source to the target (or the procedures can transform the data in some other way). For example, the following ETL process runs a stored procedure to populate the Patients table. If you are familiar with the tutorial example, notice that here, the actual action, source, and destination are not visible in the ETL XML. Those details are assumed to be in the procedure named "PopulateExtendedPatients". It is good practice to name your stored procedures in an identifying way and use <description> elements to clarify the actions they will take.

<?xml version="1.0" encoding="UTF-8"?>
<etl xmlns="http://labkey.org/etl/xml">
    <name>Populate Patient Table</name>
    <description>Populate Patients table with calculated and converted values.</description>
    <transforms>
        <transform id="ExtendedPatients" type="StoredProcedure">
            <description>Calculates date of death or last contact for a patient, and patient ages at events of interest</description>
            <procedure schemaName="patient" procedureName="PopulateExtendedPatients">
            </procedure>
        </transform>
    </transforms>
    <!-- run at 3:30am every day -->
    <schedule><cron expression="0 30 3 * * ?"/></schedule>
</etl>

Use Transaction for Procedure

Wrap the call of the procedure in a transaction by setting the attribute useTransaction to true. This is also the default setting, so that if any part of the stored procedure fails, any partially-completed changes would be rolled back.

...
  <transform id="ExtendedPatients" type="StoredProcedure">
    <procedure schemaName="patient" procedureName="PopulateExtendedPatients" useTransaction="true">
    </procedure>
  </transform>
...

For debugging purposes, you can turn off the transaction wrapper setting useTransaction to "false".

Pass Parameters to Stored Procedures

You can include <parameters> to stored procedures within the <procedure> element. The value you provide could be hard coded in the ETL, as shown here,

...
  <transform id="checkToRun" type="StoredProcedure">
    <procedure schemaName="patient" procedureName="someTestWork" useTransaction="false">
        <parameter name="MyParam" value="4"/>
        <parameter name="AnotherParam" value="before"/>
    </procedure>
  </transform>
...

You can include a setting for noWorkValue on a parameter to block running of the procedure when the given parameter has that value. Learn more in this topic: ETL: Check For Work From a Stored Procedure

LabKey Support

LabKey Support