In this step, we begin to learn to create new ETLs for other operations. Suppose you wanted to expand the Patients
dataset to also include male participants who are "Natural Controllers" of HIV.
To do this, we use another SQL query
that returns a selection of records from the Demographics table, in particular all Male participants who are Natural Controllers.
We'll create a new ETL from scratch, drawing on that SQL query.
Define Source Query
The archive has predefined the query we will use. To review it and see how you could add a new one, follow these steps:
- Select (Admin) > Go To Module > Query.
- Click study to open the study schema. If you were going to define your own new query, you could click Create New Query here.
- Click MaleNC to open the predefined one.
- Click Edit Source to see that the source code for this query looks like this:
WHERE Demographics.Gender = 'm' AND Demographics.TreatmentGroup = 'Natural Controller'
- Click the Data tab to see that 6 participants are returned by this query:
Create a New ETL Process
ETL processes are defined using XML to specify the data source, the data target, and other properties. You can install these XML files in a custom module, or define the ETL directly using the user interface. Here we create a new configuration that draws from the query we just created above.
- Select (Admin) > Folder > Management.
- Click the ETLs tab.
- Above the Custom ETL Definitions grid, click (Insert new row).
- Copy and paste the following instead of the default shown in the window:
<name>Demographics >>> Patients (Males)</name>
<description>Update data for study on male patients.</description>
<source schemaName="study" queryName="MaleNC"/>
<destination schemaName="study" queryName="Patients" targetOption="merge"/>
- Click Save.
- Click the ETL Workspace tab.
- Notice this new ETL is now listed in the Data Transforms web part.
Run the ETL Process
- Click Run Now next to the new process name.
- Refresh in the pipeline window until the job completes, then click the ETL Workspace tab.
- New records will have been copied to the Patients table, making a total of 43 records (42 if you skipped the step of changing the gender of a participant in the source data during the previous tutorial step).
Congratulations! You've completed the tutorial and created a basic ETL for extracting, transforming, and loading data. Learn more in the ETL Documentation