Publish a Study: Protected Health Information / PHI

When publishing a study, you can randomize or hide specified protected health information (PHI) in the data, to make it more difficult to identify the persons enrolled in the study. You can alter published data in the following ways:

Replace all participant IDs with alternate, randomly generated participant IDs.
Apply random date shifts/offsets. Note that data for an individual participant is uniformly shifted, preserving relative dates in results.
Mask clinic names with a generic "Clinic" label to hide any identifying features in the original clinic name.
Exclude data fields (columns) you have marked as containing PHI.

Publish Options

The wizard used to publish a study includes these options:

Use Alternate Participant IDs
Shift Participant Dates
Mask Clinic Names
Include (or Exclude) PHI Fields

Use Alternate Participant IDs

Selecting this option replaces the participant IDs throughout the published data with alternate, randomly generated IDs. The alternate ID assigned to each participant is persisted in the source study and reused for each new published study (and for exported folder archives) whenever the "Use Alternate Participant IDs" option is selected. Admins can set the prefix and number of digits used in the alternate IDs if desired. Learn more in this topic:

Manage Participants
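The persistence behavior described above can be illustrated with a minimal sketch. This is not LabKey's implementation: the function name assign_alternate_ids, its parameters, and the sample participant IDs are hypothetical, chosen only to show how a stored mapping lets each participant keep the same alternate ID across repeated publications.

```python
import random

def assign_alternate_ids(participant_ids, existing=None, prefix="", digits=6, rng=None):
    """Map original participant IDs to alternate IDs, reusing stored assignments.

    `existing` stands in for the mapping persisted in the source study
    (hypothetical; the real storage mechanism is not described here).
    Assumes far fewer participants than 10**digits possible IDs.
    """
    rng = rng or random.Random()
    mapping = dict(existing or {})
    used = set(mapping.values())
    for pid in participant_ids:
        if pid in mapping:
            continue  # an alternate ID was already assigned; keep it
        while True:
            # Zero-padded random number with the configured prefix.
            candidate = f"{prefix}{rng.randrange(10 ** digits):0{digits}d}"
            if candidate not in used:
                break
        mapping[pid] = candidate
        used.add(candidate)
    return mapping

# First publication assigns IDs; a later one reuses them and adds only new ones.
first = assign_alternate_ids(["PT-101", "PT-102"], prefix="ALT", digits=6)
second = assign_alternate_ids(["PT-101", "PT-102", "PT-103"], existing=first,
                              prefix="ALT", digits=6)
```

Because the second call receives the earlier mapping, PT-101 and PT-102 keep their alternate IDs and only PT-103 receives a new one.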

Shift Participant Dates

Selecting this option shifts published dates for each participant by a random offset between 1 and 365 days. A separate offset is generated for each participant, and that offset is applied to all dates associated with that participant, except for date fields excluded as described below. This obscures the exact dates, protecting potentially identifying details, while maintaining the relative differences between them. Note that the date offset used for a given participant is persisted in the source study and reused for each new published study.

To exclude individual date/time fields from being randomly shifted on publication:

Go to the dataset that includes the date field.
Edit the dataset definition.
In the designer, open the Fields section.
Expand the date field of interest.
Click Advanced Settings.
Check the box to Exclude From "Participant Date Shifting" on export/publication.
Click Save.
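As an illustration of per-participant date shifting with excluded fields, here is a minimal sketch. It is not LabKey's code: the function name, the field names (ParticipantId, VisitDate, LabDate, EnrollDate), and the in-memory offsets dictionary are hypothetical. The direction of the shift (earlier rather than later) is also an assumption, since the text above specifies only the size of the offset.

```python
import random
from datetime import date, timedelta

def shift_participant_dates(rows, offsets, excluded_fields=(), rng=None):
    """Shift every date value in each row by that participant's offset.

    `offsets` maps participant ID -> days and is updated in place, standing in
    for the offsets persisted in the source study (hypothetical storage).
    """
    rng = rng or random.Random()
    shifted = []
    for row in rows:
        pid = row["ParticipantId"]
        if pid not in offsets:
            offsets[pid] = rng.randint(1, 365)  # inclusive on both ends
        delta = timedelta(days=offsets[pid])
        shifted.append({
            # Assumption: dates move earlier; excluded fields pass through unchanged.
            name: (value - delta
                   if isinstance(value, date) and name not in excluded_fields
                   else value)
            for name, value in row.items()
        })
    return shifted

rows = [{
    "ParticipantId": "PT-101",
    "VisitDate": date(2020, 3, 1),
    "LabDate": date(2020, 3, 15),
    "EnrollDate": date(2020, 1, 1),  # treated as excluded from date shifting
}]
offsets = {}
published = shift_participant_dates(rows, offsets, excluded_fields={"EnrollDate"})
```

The 14-day gap between VisitDate and LabDate is unchanged after shifting, and passing the same offsets dictionary to a later call reproduces the same shifted dates.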

Include (or Exclude) PHI Fields

Fields/columns in study datasets and lists can be tagged with a PHI (Protected Health Information) level. Once fields are tagged, you can use these PHI levels to exclude them from the published study, giving data access to users who are not permitted to see the PHI.

Learn more about tagging fields as containing PHI in this topic:

Protecting PHI Data

When you publish a study, use the radio buttons to indicate the PHI level to include. To exclude all PHI, select "Not PHI".

For example, if the study is published including Limited PHI, then any field tagged as "Full PHI" or "Restricted PHI" will be excluded from the published version of the study.
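The level-based exclusion can be sketched as a simple ordered comparison. This is an illustrative sketch, not LabKey's implementation: the function name and sample field names are hypothetical, and the ordering of "Full PHI" below "Restricted PHI" is an assumption that is consistent with, but not spelled out by, the example above.

```python
# Assumed ordering from least to most sensitive.
PHI_LEVELS = ["Not PHI", "Limited PHI", "Full PHI", "Restricted PHI"]

def fields_to_publish(field_levels, max_level):
    """Return the names of fields tagged at or below the selected PHI level."""
    limit = PHI_LEVELS.index(max_level)
    return [name for name, level in field_levels.items()
            if PHI_LEVELS.index(level) <= limit]

# Hypothetical dataset fields and their PHI tags.
fields = {
    "Weight": "Not PHI",
    "ZipCode": "Limited PHI",
    "BirthDate": "Full PHI",
    "NationalId": "Restricted PHI",
}

limited = fields_to_publish(fields, "Limited PHI")  # drops BirthDate and NationalId
no_phi = fields_to_publish(fields, "Not PHI")       # keeps only Weight
```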

Mask Clinic Names

When this option is selected, actual clinic names will be replaced with a generic label. This helps prevent revealing neighborhood or other details that might identify individuals. For example, "South Brooklyn Youth Clinic" is masked with the generic value "Clinic".

All locations that are marked as a clinic type (including those that are also marked with other types) will be masked in the published data. More precisely, both the Label and the Labware Lab Code will be masked. Location types are specified by directly editing the labs.tsv file. For details see Manage Locations.
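The masking rule can be sketched as follows. This is a hypothetical illustration rather than LabKey's code: the record layout and key names (label, labware_lab_code, types) are invented for the example, and replacing the lab code with the same "Clinic" value as the label is an assumption, since the text above says only that both values are masked.

```python
MASK = "Clinic"

def mask_clinic_names(locations):
    """Return copies of the location records with clinic-identifying values masked."""
    masked = []
    for location in locations:
        record = dict(location)  # copy so the source data is left untouched
        # A location is masked if clinic is among its types, even alongside others.
        if "clinic" in record["types"]:
            record["label"] = MASK
            record["labware_lab_code"] = MASK  # assumption: same generic value
        masked.append(record)
    return masked

locations = [
    {"label": "South Brooklyn Youth Clinic", "labware_lab_code": "SBYC",
     "types": {"clinic", "repository"}},
    {"label": "Central Repository", "labware_lab_code": "CR01",
     "types": {"repository"}},
]
published_locations = mask_clinic_names(locations)
```

The first location is masked even though it also has another type, while the second, which is not a clinic, is published unchanged.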

LabKey Support