Data exported from LabKey Server can be protected by:
- Randomizing participant ids so that the original participant ids are obscured.
- Shifting date values, such as clinic visits and specimen draw dates. (Note that dates are shifted per participant, leaving their relative relationships as a series intact, thereby retaining much of the scientific value of the data.)
- Holding back data that has been marked as 'protected'.
In this step we will export data from the study, modifying and obscuring it in the ways described above.
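To see why per-participant date shifting keeps the data scientifically useful, here is a minimal Python sketch of the idea. It is not LabKey Server's implementation: the function name, the participant ids, the offset range, and the choice to shift dates backward are all assumptions made for illustration.

```python
import random
from datetime import date, timedelta


def shift_dates_per_participant(rows, max_shift_days=365, seed=None):
    """Shift every date for a participant by the same random offset.

    Hypothetical helper: the offset range and direction are assumptions,
    not LabKey Server's documented behavior.
    """
    rng = random.Random(seed)
    offsets = {}  # participant id -> that participant's offset
    shifted = []
    for participant_id, visit_date in rows:
        if participant_id not in offsets:
            # One offset per participant, reused for all of their dates.
            offsets[participant_id] = timedelta(days=rng.randint(1, max_shift_days))
        shifted.append((participant_id, visit_date - offsets[participant_id]))
    return shifted


# Made-up rows: (participant id, visit date)
rows = [
    ("P1", date(2008, 4, 17)),
    ("P1", date(2008, 4, 24)),
    ("P2", date(2008, 4, 20)),
]
shifted = shift_dates_per_participant(rows, seed=0)

# The two P1 visits are still exactly one week apart after shifting.
print((shifted[1][1] - shifted[0][1]).days)  # prints 7
```

Because each participant gets a single offset, the interval between any two of that participant's dates is unchanged, while the calendar dates themselves no longer match the originals.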
Examine Study Data
First look at the data to be exported.
- Navigate to the Study folder (in Security Tutorial).
- Click the Clinical and Assay Data tab. This tab shows the individual datasets in the study. There are currently two datasets: "Participants" and "Physical Exam".
- Click Physical Exam. Notice that the participant ids are six-digit numbers, starting with "110349". When we export this table, we will randomize these ids to make it more difficult to identify the subjects of the study.
- Return to the Clinical and Assay Data tab.
- Click Participants in the dataset list. Notice the dates in the table are almost all from the last two weeks of April 2008. When we export this table, we will randomly shift these dates, to make it more difficult to identify when subject data was collected.
- Notice the columns for Country and Gender. We will mark these as "protected" columns, so they are not exported. (As an example, given that there is exactly one male patient from Germany in our sample, he would be easy to identify with only this information.)
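The re-identification risk described in the last step can be checked by counting how many participants share each combination of values. The sketch below uses made-up rows and a hypothetical helper name; it is only meant to show the kind of check involved, not a LabKey Server feature.

```python
from collections import Counter


def unique_combinations(rows, columns):
    """Return the value combinations that occur for exactly one row."""
    counts = Counter(tuple(row[c] for c in columns) for row in rows)
    return [combo for combo, n in counts.items() if n == 1]


# Made-up participant rows for illustration only.
participants = [
    {"ParticipantId": "A", "Country": "Germany", "Gender": "male"},
    {"ParticipantId": "B", "Country": "Germany", "Gender": "female"},
    {"ParticipantId": "C", "Country": "Germany", "Gender": "female"},
    {"ParticipantId": "D", "Country": "Uganda", "Gender": "male"},
    {"ParticipantId": "E", "Country": "Uganda", "Gender": "male"},
]

# Any combination listed here points to a single, easily identified person.
print(unique_combinations(participants, ["Country", "Gender"]))
# prints [('Germany', 'male')]
```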
Mark Protected Columns
To prepare the data for export, we will mark two columns, "Gender" and "Country", as protected so that they can be held back on export.
- Click the Manage tab. Click Manage Datasets.
- Click Participants (the dataset, not the tab) and then Edit Definition.
- Under Dataset Fields select Gender.
- Click the Advanced tab and place a checkmark next to Protected.
- Repeat for the Country field.
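Conceptually, holding back protected columns means dropping those fields from every row before the data leaves the study. A minimal sketch, assuming rows are plain dictionaries; the helper name and the sample row are hypothetical, and LabKey Server performs this step for you during export.

```python
def remove_protected_columns(rows, protected):
    """Return copies of the rows without any column named in `protected`."""
    return [
        {name: value for name, value in row.items() if name not in protected}
        for row in rows
    ]


# Made-up row for illustration only.
rows = [
    {"ParticipantId": "P1", "StartDate": "2008-04-17",
     "Country": "Germany", "Gender": "male"},
]
published = remove_protected_columns(rows, {"Country", "Gender"})

print(list(published[0]))  # prints ['ParticipantId', 'StartDate']
```

The original rows are left untouched, which mirrors the tutorial: the source study keeps the Gender and Country values, and only the exported copy omits them.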
Set up Alternate Participant IDs
Next we will configure how participant ids are handled on export. We will specify that the ids are randomized using a given text and number pattern.
- Click the Manage tab.
- Click Manage Alternate Participant IDs and Aliases.
- For Prefix, enter "ABC".
- Click Change Alternate IDs. Click to confirm.
- Scroll down and click Done.
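The effect of these steps can be pictured as building a lookup table from each original id to a randomly generated alternate id with the chosen prefix. The sketch below is an illustration only: the function name, the number of digits, and the sample ids are assumptions, and it does not reproduce how LabKey Server generates its ids.

```python
import random


def assign_alternate_ids(participant_ids, prefix="ABC", digits=6, seed=None):
    """Map each original id to a unique random id of the form prefix + digits.

    Hypothetical helper; the digit count is an assumption for this sketch.
    """
    rng = random.Random(seed)
    mapping = {}
    used = set()
    for pid in participant_ids:
        if pid in mapping:
            continue  # the same participant always keeps the same alternate id
        while True:
            candidate = f"{prefix}{rng.randrange(10 ** digits):0{digits}d}"
            if candidate not in used:
                break  # retry on the rare collision so ids stay unique
        used.add(candidate)
        mapping[pid] = candidate
    return mapping


# Made-up original ids; the repeated id shows that the mapping is stable.
mapping = assign_alternate_ids(["100001", "100002", "100001"], seed=42)
for original, alternate in mapping.items():
    print(original, "->", alternate)
```

Keeping one alternate id per participant matters because the same participant appears in several datasets, and rows can only be joined across them if the replacement is consistent.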
Export/Publish Anonymized Data
Now we are ready to export this data with the extra data protections in place.
This procedure will "Publish" the study. That is, a new child folder will automatically be created and selected data from the study will be randomized and copied to the child folder. Once the child folder appears with the exported data, you can configure its security as fits your requirements.
- If necessary click the Manage tab.
- Scroll down and click Publish Study.
- Complete the wizard, selecting all participants, datasets, and timepoints in the study. For fields not mentioned here, enter anything you like.
- Under Publish Options, check the following options:
  - Use Alternate Participant IDs
  - Shift Participant Dates
  - Remove Protected Columns
- You could also check Mask Clinic Names, which protects any actual clinic names in the study by replacing them with the generic label "Clinic".
- Click Finish.
- Wait for the publishing process to finish.
- Navigate to the new folder (a child folder under Study).
- Look at the published datasets Physical Exam and Participants. Notice that the participant ids have been randomized and the dates shifted, and that the Gender and Country fields have been held back (not published).
Security for the New Folder
How should you configure the security on this new folder?
The answer depends on your requirements.
- If you want the general public to see this data, you would add Guests to the Reader role. This allows non-logged-in users to see the folder.
- If you want only members of the study team to have access, you would add Study Group to the Reader role, or a higher role.
For details on the different roles that are available, see Security Roles Reference.