Participant Aliases

A single study participant can be known by different names in different research contexts. One lab might refer to the subject using the name "LK002-234001", whereas another lab might use the name "Primate 44". It is often necessary (and even desirable) to let the different providers of data use their own names, rather than try to standardize them at the various sources. LabKey Server can align all the data for a given subject using Participant Aliases.

Overview
Provide Participant Alias Mapping Dataset

Example Alias Map

Import Data
Resolve Naming Conflicts
View Original IDs or Aliases Provided

Overview

When data is imported, a provided alias map will "translate" the aliases into the common participant ID for the study. In our example, the two labs might import data for "LK002-234001" and "Primate 44", with the study researcher seeing all these rows under the participant ID "PT-101". Note that you are not required to import all data using one of the aliases you provide; you can also use the "real" participant ID.

LabKey Server's aliasing system works by internally mapping different aliases to a central participant ID, while externally preserving the aliases known to the different data sources. This allows for:

Merging records with different IDs for the same study subject
Consolidating search results around a central ID
Retaining data as originally provided by a source lab
Presenting data back to a given audience using their alias

Provide Participant Alias Mapping Dataset

To set up participant aliases, point to a dataset that contains as many aliases as needed for each participant ID. The participantID column contains the 'central' IDs, another column contains the aliases, and a third column identifies the source organizations that use those aliases.

Create a dataset containing the alias and source organization information. See below for an example.
Select > Manage Study > Manage Participants.
Under Participant Aliases, select your Dataset Containing Aliases.
Select the Alias Column.
Select the Source Column, i.e. the source organization using that alias.

Click Save Changes.
Click Done.

Once an alias has been defined for a given participant, an automatic name translation is performed on any imported data that contains that alias. For example, if participant "PT-101" has a defined alias "Primate 44", then any data containing a reference to "Primate 44" will be imported as "PT-101".
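The translation described above can be pictured as a simple lookup. The following is a minimal Python sketch of that behavior, not LabKey Server's actual implementation; the dictionary, the function name translate_participant_id, and the Weight column are hypothetical and exist only for illustration.

```python
# Hypothetical in-memory alias map: alias -> central participant ID.
# The values mirror the example on this page.
ALIAS_TO_PARTICIPANT = {
    "Primate 44": "PT-101",
    "LK002-234001": "PT-101",
}


def translate_participant_id(incoming_id, alias_map):
    """Return the central ID for a known alias; otherwise return the ID unchanged."""
    return alias_map.get(incoming_id, incoming_id)


# Incoming rows may use an alias or the central ID itself.
incoming_rows = [
    {"ParticipantId": "Primate 44", "Weight": 7.2},
    {"ParticipantId": "LK002-234001", "Weight": 7.3},
    {"ParticipantId": "PT-101", "Weight": 7.4},
]

# Rewrite the participant ID on each row as it is "imported".
imported_rows = [
    {**row, "ParticipantId": translate_participant_id(row["ParticipantId"], ALIAS_TO_PARTICIPANT)}
    for row in incoming_rows
]

for row in imported_rows:
    print(row["ParticipantId"], row["Weight"])
```

In this sketch, all three rows end up under "PT-101", whether they arrived under an alias or under the central ID.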

Once you have an alias dataset in place, you can add more records to it manually (by inserting new rows) or in bulk, either directly from the grid or by returning to the Participant Aliases page and clicking Import Aliases.

To clear all alias mappings, but leave the alias dataset itself in place, click Clear All Alias Settings.

Example Alias Map

An example alias mapping file is shown below. Because it is a study dataset, the file must always contain a date (or visit) column for internal alignment. Note that while these date/visit columns are not used for applying aliases, you must have unique rows for each participantID and date. One way to manage this is to have a unique date for each source organization, such as is shown here:

ParticipantId	Aliases	SourceOrganization	Date
PT-101	Primate 44	ABC Labs	10/10/2010
PT-101	LK002-234001	Research Center A	11/11/2011
PT-102	Primate 45	ABC Labs	10/10/2010
PT-103	Primate 46	ABC Labs	10/10/2010

Another way to manage row uniqueness is to use a third dataset key. For example, if there might be many aliases from a given source organization, you could use the aliases column itself as the third key to ensure uniqueness. If an alias value appears twice in the aliases column, you will get an error when you try to import data using that alias.
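Before importing an alias map, it may help to check it for the two conditions mentioned above: repeated ParticipantId/Date pairs and repeated alias values. The sketch below does this with the Python standard library. It assumes a tab-separated file with the column names from the example and no third dataset key; the function name find_alias_map_problems is hypothetical, and the check runs outside LabKey Server.

```python
import csv
import io
from collections import Counter

# The example alias map from this page, as tab-separated text.
ALIAS_MAP_TSV = (
    "ParticipantId\tAliases\tSourceOrganization\tDate\n"
    "PT-101\tPrimate 44\tABC Labs\t10/10/2010\n"
    "PT-101\tLK002-234001\tResearch Center A\t11/11/2011\n"
    "PT-102\tPrimate 45\tABC Labs\t10/10/2010\n"
    "PT-103\tPrimate 46\tABC Labs\t10/10/2010\n"
)


def find_alias_map_problems(tsv_text):
    """Return a list of human-readable problems found in the alias map text."""
    rows = list(csv.DictReader(io.StringIO(tsv_text), delimiter="\t"))
    problems = []

    # Without a third key, each ParticipantId/Date pair must be unique.
    key_counts = Counter((row["ParticipantId"], row["Date"]) for row in rows)
    for (participant_id, date), count in key_counts.items():
        if count > 1:
            problems.append(f"{participant_id} has {count} rows dated {date}")

    # Each alias should map to exactly one participant.
    alias_counts = Counter(row["Aliases"] for row in rows)
    for alias, count in alias_counts.items():
        if count > 1:
            problems.append(f"alias '{alias}' appears {count} times")

    return problems


problems = find_alias_map_problems(ALIAS_MAP_TSV)
print(problems or "no problems found")
```

If you use the aliases column as a third key, the first check no longer applies and can be dropped.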

Import Data

Once the map is in place, you can import data as usual, either in bulk, by single row, or using an external reloading mechanism. Your data can use the 'common' participant IDs themselves (PT-### in this example) or any of the aliases. Any data added under one of the aliases in the map will be associated with the translated 'common' participantID. This means you will only see rows for "PT-###" when importing data for the aliases shown in our example.

If any data is imported using a participant ID that is not in the alias map, a new participant with the 'untranslated' alias will be created. Note that if you later add this alias to the map, the participant ID will not be automatically updated to use the 'translated' ID. See below for details about resolving these or other naming conflicts.
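To illustrate why a later change to the map does not affect rows that were already imported, here is a small Python model of the import step. It is a sketch of the behavior described above, not the server's code; import_rows, the study list, and the Weight column are hypothetical.

```python
def import_rows(study_rows, incoming_rows, alias_map):
    """Append incoming rows to the study, translating IDs at import time only."""
    for row in incoming_rows:
        participant_id = alias_map.get(row["ParticipantId"], row["ParticipantId"])
        # The translated (or untranslated) ID is written into the stored row.
        study_rows.append({**row, "ParticipantId": participant_id})


study = []
alias_map = {"Primate 44": "PT-101"}

import_rows(
    study,
    [
        {"ParticipantId": "Primate 44", "Weight": 7.2},  # alias in the map
        {"ParticipantId": "Primate 47", "Weight": 6.8},  # not in the map
    ],
    alias_map,
)

# Adding the alias afterwards does not rewrite rows that are already stored.
alias_map["Primate 47"] = "PT-104"

participants = sorted({row["ParticipantId"] for row in study})
print(participants)
```

The stored rows still list "Primate 47" as its own participant, because the lookup happened once, when the data was imported.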

Resolve Naming Conflicts

If incoming data contains an ID that is already used for a participant in the study AND is provided as an alias in the mapping table, the system will raise an error so that an administrator can correctly resolve the conflict.

For example, if you were using our example map shown above, and imported some data for "Primate 47", you would have the following participant IDs in your study: "PT-101, PT-102, PT-103, Primate 47". If you later added to the mapping "PT-104" with the alias "Primate 47", and then imported data for that participant "Primate 47", you would see the error:

There is a collision, the alias Primate 47 already exists as a Participant. You must remove either the alias or the participant.

The easiest way to reconcile records for participants under 'untranslated' aliases is to manually Change ParticipantID of the one with the untranslated alias to be the intended shared ID. This action will "merge" the data for the untranslated alias into the (possibly empty) new participant ID for the translated one.
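The collision rule and the merge that resolves it can be modeled in a few lines of Python. This is an illustrative sketch under the assumption that a collision means "the incoming ID is both an existing participant and a mapped alias"; AliasCollisionError, check_for_collision, and change_participant_id are hypothetical names, not LabKey APIs.

```python
class AliasCollisionError(Exception):
    """Raised when an ID is both an existing participant and a mapped alias."""


def participant_ids(study_rows):
    """Collect the distinct participant IDs currently stored in the study."""
    return {row["ParticipantId"] for row in study_rows}


def check_for_collision(incoming_id, existing_ids, alias_map):
    """Refuse an import whose ID is ambiguous between participant and alias."""
    if incoming_id in alias_map and incoming_id in existing_ids:
        raise AliasCollisionError(
            f"'{incoming_id}' is both a participant and an alias for {alias_map[incoming_id]}"
        )


def change_participant_id(study_rows, old_id, new_id):
    """Merge all rows stored under old_id into new_id."""
    return [
        {**row, "ParticipantId": new_id} if row["ParticipantId"] == old_id else row
        for row in study_rows
    ]


# "Primate 47" was imported before it was added to the alias map.
study_rows = [{"ParticipantId": "Primate 47", "Weight": 6.8}]
alias_map = {"Primate 47": "PT-104"}

try:
    check_for_collision("Primate 47", participant_ids(study_rows), alias_map)
    collided = False
except AliasCollisionError as error:
    collided = True
    print("collision:", error)

# Resolve the conflict by merging the untranslated participant into PT-104.
study_rows = change_participant_id(study_rows, "Primate 47", "PT-104")

# The same check now passes, so new data for "Primate 47" can be translated.
check_for_collision("Primate 47", participant_ids(study_rows), alias_map)
```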

View Original IDs or Aliases Provided

Note that clearing the alias settings in a study does not revert the participant IDs back to their original, non-aliased values. Aliased participant IDs are determined at import time and are written into the imported dataset. However, you can add a column to your datasets that shows the original ParticipantIds, which can be useful for displaying the dataset to the source organization that knows the participant by the non-aliased ID. To display the original ParticipantId values, add the Aliases column to the imported dataset as follows:

Navigate to the dataset where you want to display the original, non-alias ParticipantIds.
Select (Grid Views) > Customize Grid.
Expand the ParticipantID node.
Check the box for Aliases.
Expand the Linked IDs node and you will see a checkbox for each source organization. Check one (or more) to add the alias that specific source used to your grid.
Save the grid view, either as the default grid view, or as a named view.
The dataset now displays both the alias(es) and the original ID.

LabKey Support
