Each sample in the system must have a unique name (ID) within its Sample Type. Unique sample names can either be provided by the user or generated by the system. When you ask the system to generate names, you specify a naming pattern to use. For each Sample Type, you choose one of these two options. If you already use a unique naming structure outside the system, you will want to ensure those names are carried into LabKey Sample Manager.
Unique Sample Names are Provided
If your data already includes the unique sample names to use, identify the column name that contains them.
Naming Column is "SampleID" or "Name"
If the name of this column is "SampleID" or "Name", these default column names are automatically recognized as containing sample names. To confirm that they are used, be sure to delete the default naming pattern that is provided in the user interface (and ignore the grayed out placeholder text that remains).
Naming Column is Something Else
If the column containing unique sample names is named something else, you provide that column name using a simple naming pattern expression that specifies the name of the column to use, rather than an expression to generate one.
For example, if the sample names are in a column named "Identifier", you would enter the naming pattern:
${Identifier}
Note that while this is entered as a naming pattern, it does not generate any portion to make the sample names unique, so you are responsible for ensuring uniqueness.
Generate Names with Naming Patterns
If your data does not already contain unique names, the system can generate them upon import using a naming pattern that contains tokens, including counters to ensure names are unique. The system can build a unique name from syntax elements, such as:
- String constants
- Incrementing numbers
- Dates and partial dates
- Values from columns in the imported data, such as tissue types, lab names, subject ids, etc.
- Separators such as '_' underscores and '-' hyphens
- Note that if you use a hyphen '-', you will want to use double quotes when you later search for your samples. An unquoted search for Sample-11 would interpret the hyphen as a minus sign and seek pages with "Sample" without "11".
Default Naming Pattern
The default naming pattern in Sample Manager generates names from two elements: the prefix "S-" plus an incrementing integer:
S-${genId}
The first few samples would be S-1, S-2, S-3, and so on.
See and Set genId
genId is an incrementing counter which starts from 1 by default, and is maintained internally for each sample type (or source type) in a container. When you include ${genId} in a naming pattern, you will see a blue banner indicating the current value of genId.
If desired, click Edit genId to set it to a value higher than the current one. This action will reset the counter and cannot be undone.
Date Based Naming
Another possible naming pattern for samples is to incorporate the date of creation. For example:
S-${now:date}-${dailySampleCount}
This three-part pattern will generate an incrementing series of samples for each day.
- The S- prefix is simply a string constant with a separator dash. Using separators like "-" and "_" is optional but will help users parse sample names.
- The now:date token will be replaced by the date of sample creation.
- The dailySampleCount token will be replaced by an incrementing counter that resets daily.
In this example, samples added on November 25, 2019 would be named S-20191125-1, S-20191125-2, etc. Samples added on November 30 would be named S-20191130-1, S-20191130-2, etc.
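The daily reset can be illustrated with a short Python sketch. This is only a toy model of the pattern above, not LabKey's implementation; the class name DailyNamer and the per-day dictionary are assumptions made for illustration.

```python
from collections import defaultdict
from datetime import date


class DailyNamer:
    """Toy model of S-${now:date}-${dailySampleCount}; not LabKey code."""

    def __init__(self):
        # One counter per calendar day, so each new day starts again at 1.
        self._counts = defaultdict(int)

    def next_name(self, created: date) -> str:
        self._counts[created] += 1
        return f"S-{created:%Y%m%d}-{self._counts[created]}"


namer = DailyNamer()
names = [namer.next_name(date(2019, 11, 25)) for _ in range(2)]
names.append(namer.next_name(date(2019, 11, 30)))
print(names)  # ['S-20191125-1', 'S-20191125-2', 'S-20191130-1']
```

The counter is keyed on the date alone, which is why the sample added on November 30 starts again at 1.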
Incorporate Column Values
If you want to use a column from your data as part of the name, but it does not contain unique values for all samples, you can incorporate it in the pattern by using the column name in token brackets and also including an additional uniqueness element like a counter. For example, if you want to name many samples for each participant, and the participant identifier is in a "ParticipantID" column, you could use the pattern:
${ParticipantID}-${genId}
Multiple column names and other substitutions can be included in a naming pattern, for example:
${ParticipantID}-${CollectionDate}-${LabName}-${now:date}-${genId}
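To make the substitution concrete, here is a minimal Python sketch of how a pattern such as ${ParticipantID}-${genId} could be expanded for each imported row. The expand function and its regular expression are hypothetical helpers written for this example; they handle only plain ${token} elements without modifiers and are not how LabKey itself evaluates patterns.

```python
import re
from itertools import count

# Matches simple ${token} elements that have no ":modifier" part.
TOKEN = re.compile(r"\$\{([^{}:]+)\}")


def expand(pattern: str, row: dict, gen_id: int) -> str:
    """Replace each ${token} with genId or the matching column value."""

    def lookup(match):
        token = match.group(1)
        if token == "genId":
            return str(gen_id)
        return str(row[token])  # raises KeyError if the column is missing

    return TOKEN.sub(lookup, pattern)


rows = [
    {"ParticipantID": "PT-101"},
    {"ParticipantID": "PT-101"},
    {"ParticipantID": "PT-102"},
]
counter = count(1)  # stands in for the per-Sample-Type genId counter
names = [expand("${ParticipantID}-${genId}", row, next(counter)) for row in rows]
print(names)  # ['PT-101-1', 'PT-101-2', 'PT-102-3']
```

Note that the counter advances for every sample of the type rather than restarting for each participant, so the repeated ParticipantID values still yield unique names.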
Incorporate Lineage Lookups
More general syntax to include properties of sample sources or parents in sample names is also available by using lookups into the lineage of the sample.
- Specific data type inputs: MaterialInputs/SampleType1/propertyA, DataInputs/DataClass2/propertyA, etc.
- Import alias references: parentSampleA/propertyA, parentSourceA/propertyA, etc.
- In some scenarios, you may be able to use a shortened syntax referring to an unambiguous parent property: Inputs/propertyA, MaterialInputs/propertyA, DataInputs/propertyA
- This option is not recommended and can only be used when 'propertyA' only exists for a single type of parent. Using a property common to many parents, such as 'Name' will produce unexpected results.
For example, suppose a blood sample was derived from a mouse and you want to put the mouse strain in the sample name. To include this source metadata, the derived sample's naming expression might look like:
Blood-${DataInputs/Mouse/Strain}-${genId}
You can use the qualifier :first to select the first of a given set of inputs when there might be several.
If there might be multiple parent samples of a given type (like "Blood"), you could choose the first one in a naming pattern like this:
Blood-${parentBlood/SampleID:first}-${genId}
Include Grandparent/Ancestor Names
The above lineage lookup syntax applies to parent or source details, i.e. the "parent" generation. In order to incorporate lookups into the "Grandparent" of a sample, use the ".." syntax to indicate "walking up" the lineage tree. Here are a few examples of syntax for retrieving names from deeper ancestry of a sample. For simplicity, each of these examples is shown followed by a basic ${genId} counter, but you can incorporate this syntax with other elements. Note that the shortened syntax available for first generation lineage lookup is not supported here. You must specify both the sample type and the "/name" field to use.
To use the name from a specific grandparent sample type, use two levels:
${MaterialInputs/CurrentParent/..[MaterialInputs/GrandParentSampleType]/name}-${genId}
To use another propertyColumn from a specific grandparent sample type:
${MaterialInputs/CurrentParent/..[MaterialInputs/GrandParentSampleType]/propertyColumn}-${genId}
You can use a parent alias for the immediate parent level, but not for any grandparent sample types or fields:
${parentAlias/..[MaterialInputs/GrandParentSampleType]/name}-${genId}
Compound this syntax to further "walk" the lineage tree to use a great grand parent sample type and field:
${MaterialInputs/CurrentParent/..[MaterialInputs/GrandParentSampleType]/..[MaterialInputs/GreatGrandSampleType]/greatGrandFieldName}-${genId}
To define a naming pattern that uses the name of the grandparent of any type, you can omit the grandparent sample type name entirely. For example, if you had Plasma samples that might have any number of grandparent types, you could use the grandparent name using syntax like any of the following:
${plasmaParent/..[MaterialInputs]/name}-${genId}
${MaterialInputs/Plasma/..[MaterialInputs]/name}-${genId}
${MaterialInputs/..[MaterialInputs]/name}-${genId}
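The idea of "walking up" the lineage tree can be sketched in Python as follows. The PARENTS dictionary, the sample names, and the ancestors function are all hypothetical and exist only to illustrate the concept; the sketch does not parse the ".." syntax or reflect how LabKey stores lineage.

```python
# Hypothetical lineage: each sample lists its parents as (sample_type, name).
PARENTS = {
    "Plasma-1": [("Blood", "Blood-7")],
    "Blood-7": [("Tissue", "T-3")],
    "T-3": [],
}


def ancestors(name, levels, sample_type=None):
    """Names found `levels` generations up; optionally filter the last hop by type."""
    current = [name]
    for level in range(levels):
        last = level == levels - 1
        current = [
            parent
            for child in current
            for ptype, parent in PARENTS.get(child, [])
            # Only the final hop is restricted to the requested sample type.
            if not (last and sample_type) or ptype == sample_type
        ]
    return current


gen_id = 1
# Two levels up with a specific grandparent type, then append a counter.
name = f"{ancestors('Plasma-1', 2, 'Tissue')[0]}-{gen_id}"
print(name)  # T-3-1
```

Calling ancestors("Plasma-1", 2) without a type corresponds to omitting the grandparent sample type name, so a grandparent of any type is returned.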
Naming Pattern Elements/Tokens
The following elements, or "tokens" are available for building naming patterns.
Name Element | Description | Scope |
---|---|---|
genId | An incrementing number starting from 1. This counter is specific to the individual Sample Type in a given container. | Current Sample Type |
dailySampleCount | An incrementing counter, starting with the integer '1', that resets each day. Can be used standalone or as a modifier. | All Sample Types and Source Types on the site |
weeklySampleCount | An incrementing counter, starting with the integer '1', that resets each week. Can be used standalone or as a modifier. | All Sample Types and Source Types on the site |
monthlySampleCount | An incrementing counter, starting with the integer '1', that resets each month. Can be used standalone or as a modifier. | All Sample Types and Source Types on the site |
yearlySampleCount | An incrementing counter, starting with the integer '1', that resets each year. Can be used standalone or as a modifier. | All Sample Types and Source Types on the site |
randomId | A four digit random number for each sample row. Note that these random numbers are not guaranteed to be unique. | Current Sample Type |
batchRandomId | A four digit random number applied to the entire set of incoming sample records. On each import event, this random batch number will be regenerated. | Current Sample Type |
now | The current date, which you can format using string formatters. | Current Sample Type |
Inputs | A collection of all DataInputs and MaterialInputs for the current sample. You can concatenate using one or more values from the collection. | Current Sample Type |
DataInputs | A collection of all DataInputs for the current sample. You can concatenate using one or more values from the collection. | Current Sample Type |
MaterialInputs | A collection of all MaterialInputs for the current sample. You can concatenate using one or more values from the collection. | Current Sample Type |
<SomeDataColumn> | Loads data from some field in the data being imported. For example, if the data being imported has a column named "ParticipantID", use the element/token "${ParticipantID}" | Current Sample Type |
Formatting Values
You can use formatting syntax to control how the tokens are added. For example, "${genId}" generates an incrementing counter 1, 2, 3. If you use a format like the following, the incrementing counter will have three digits: 001, 002, 003.
${genId:number('000')}
Learn more about formatting numbers and date/time values in this topic:
Date and Number Formats Reference
Additional string modifiers are available. Find a list in this topic:
String Expression Format Functions
Default Values
When you are using a data column in your string expression, you can specify a default to use if no value is provided. Use the defaultValue modifier with the following syntax. The 'value' argument provided must be a string in single quotes.
${ColumnName:defaultValue('value')}
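As a rough illustration of the fallback behavior, the following Python sketch mimics a pattern like ${Lab:defaultValue('Unknown')}_${genId}. The with_default helper is hypothetical, and treating an empty string the same as a missing value is an assumption of this sketch rather than documented behavior.

```python
def with_default(row: dict, column: str, default: str) -> str:
    """Use the column value, or the default when it is missing or empty."""
    value = row.get(column)
    # Assumption: an empty string counts as "no value provided".
    return default if value is None or value == "" else str(value)


rows = [{"Lab": "Hanson"}, {"Lab": None}, {}]
names = [
    f"{with_default(row, 'Lab', 'Unknown')}_{gen_id}"
    for gen_id, row in enumerate(rows, start=1)
]
print(names)  # ['Hanson_1', 'Unknown_2', 'Unknown_3']
```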
Incrementing Sample Counters
Some auto-incrementing counters calculate the next value based on all samples and sources across the entire site, while others calculate based on only the current Sample Type in the current container. See the Scope column for the specific incrementing behavior. When the scope is site-based, within a given container values will be sequential but not necessarily contiguous.
Date-based sample counters are available that will be incremented based on the date when the sample is inserted. These counters are incrementing, but since they apply to all sample types, within a given Sample Type, values will be sequential but not necessarily contiguous.
- dailySampleCount
- weeklySampleCount
- monthlySampleCount
- yearlySampleCount
All of these counters can be used in either of the following ways:
- As standalone elements of a name expression, i.e. ${dailySampleCount}, in which case they will provide a counter across all sample types and source types based on the date of creation.
- As modifiers of another date column using a colon, i.e. ${SampleDate:dailySampleCount}, in which case the counter applies to the value in the named column ("SampleDate") and not the date of creation.
Do not use both "styles" of date based counter in a single naming expression. While doing so may pass the name validation step, such patterns will not successfully generate sample names.
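The difference between the two styles, and the reason values within one Sample Type may skip numbers, can be sketched in Python. The SharedDailyCounter class and the "Blood" and "Plasma" prefixes are hypothetical, and using one shared store for both styles is a simplification made for this example.

```python
from collections import defaultdict
from datetime import date


class SharedDailyCounter:
    """One counter per date, shared by every sample type (site-wide scope)."""

    def __init__(self):
        self._by_day = defaultdict(int)

    def next(self, day: date) -> int:
        self._by_day[day] += 1
        return self._by_day[day]


def name(prefix: str, day: date, counter: SharedDailyCounter) -> str:
    return f"{prefix}-{day:%Y%m%d}-{counter.next(day)}"


counter = SharedDailyCounter()
created = date(2019, 11, 25)

# Standalone style, ${dailySampleCount}: keyed on the date of creation.
blood_1 = name("Blood", created, counter)
plasma_1 = name("Plasma", created, counter)
blood_2 = name("Blood", created, counter)

# Modifier style, ${SampleDate:dailySampleCount}: keyed on the column's date.
sample_date = date(2019, 11, 20)
backdated = name("Blood", sample_date, counter)

print(blood_1, plasma_1, blood_2, backdated)
# Blood-20191125-1 Plasma-20191125-2 Blood-20191125-3 Blood-20191120-1
```

The Blood names jump from 1 to 3 because the Plasma sample consumed the value 2, which is what "sequential but not necessarily contiguous" means in practice.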
:withCounter Modifier
Another alternative for adding a counter to a field is to use :withCounter, a nested substitution syntax allowing you to add a counter specific to any column value when you add new samples. This modifier is particularly useful when naming aliquots that incorporate the name of the parent sample, where you want a counter for only the aliquots of that particular sample. Using :withCounter will guarantee unique values, meaning that if a name with counter would match an existing sample, that counter will be skipped.
The syntax for using :withCounter is to prefix it with an expression that will be evaluated first, then surround the outer modified expression in ${ } brackets so that it too will be evaluated at creation time. The default naming pattern for creating aliquots combines the value in the AliquotedFrom column (the originating Sample ID), a dash, and a counter specific to that Sample ID:
${${AliquotedFrom}-:withCounter}
You could also use this modifier with another column name as well as strings in the inner expression. For example, if the Lot column contains the value "A", names like Blood-A-1, Blood-A-2, etc. would be generated with this expression:
${Blood-${Lot}-:withCounter}
Learn more about using this syntax in naming patterns for aliquots in the LabKey documentation.
You can use a starting value and number formats with this modifier. For example, to have a three digit counter starting at 42 (i.e. S-1-042, S-1-043, etc.), use:
${${AliquotedFrom}-:withCounter(42,'000')}
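The behavior described for :withCounter (a counter per evaluated prefix that skips names already in use, with an optional start value and zero padding) can be approximated by the Python sketch below. The with_counter function and the existing set of names are hypothetical, and scanning upward from the start value each time is an assumption of this sketch, not a statement about how LabKey tracks the counter.

```python
from itertools import count


def with_counter(prefix: str, existing: set, start: int = 1, width: int = 0) -> str:
    """Return prefix plus the first counter value >= start whose name is unused."""
    for n in count(start):
        candidate = f"{prefix}{str(n).zfill(width)}"
        if candidate not in existing:
            existing.add(candidate)  # reserve the name so it is not reused
            return candidate


# The inner expression ${AliquotedFrom}- has already evaluated to "S-1-".
existing = {"S-1-2"}  # a sample with this name already exists
first = with_counter("S-1-", existing)
second = with_counter("S-1-", existing)
print(first, second)  # S-1-1 S-1-3

# Equivalent in spirit to :withCounter(42,'000')
padded = with_counter("S-1-", set(), start=42, width=3)
print(padded)  # S-1-042
```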
Names Containing Commas
It is possible to include commas in Sample and Source names, though doing so is not a best practice. Commas are used as sample name separators for lists of parent fields, import aliases, etc., so names containing commas have the potential to create ambiguities.
If you do use commas in your names, whether user-provided or LabKey-generated via a naming pattern, consider the following:
- To add or update lineage via a file import, you will need to surround the name in quotes (e.g., "WC-1,3").
- To add two parents, one with a comma, you would only quote the comma-containing name, thus the string would be: "WC-1,3",WC-4.
- If you have commas in names, you cannot use a CSV or TSV file to update values. CSV files interpret the commas as separators and TSV files strip the quotes 'protecting' commas in names as well. Use an Excel file (.xlsx or .xls) when updating data for sample names that may include commas.
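The effect of quoting a comma-containing name in a list of parents can be seen with Python's standard csv module. This only illustrates general quoted-field parsing; the split_parent_names helper is hypothetical and is not the parser LabKey uses.

```python
import csv
import io


def split_parent_names(field: str) -> list:
    """Split a comma-separated parent list, honoring double-quoted names."""
    return next(csv.reader(io.StringIO(field)))


print(split_parent_names('"WC-1,3",WC-4'))  # ['WC-1,3', 'WC-4']
print(split_parent_names('WC-1,3,WC-4'))    # ['WC-1', '3', 'WC-4']
```

Without the quotes, the single name WC-1,3 is read as two separate parents, which is the ambiguity described above.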
Naming Pattern Validation
During creation of a Sample Type, both sample and aliquot naming patterns will be validated. While developing your naming pattern, the admin can hover over the icon for a tooltip containing either an example name or an indication of a problem.
When you click Finish Creating/Updating Sample Type, you will see a banner about any syntax errors and have the opportunity to correct them.
Errors reported include:
- Invalid substitution tokens (i.e. columns that do not exist or misspellings in syntax like ":withCounter").
- Keywords like genId, dailySampleCount, now, etc. included without being enclosed in braces.
- Mismatched or missing quotes, curly braces, and/or parentheses in patterns and formatting.
- Use of curly quotes, when straight quotes are required. This can happen when patterns are pasted from some other applications.
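If you want to catch some of these problems before pasting a pattern into the application, a few of the checks can be approximated with simple heuristics. The Python sketch below is an illustration only: check_pattern, its keyword list, and its messages are hypothetical, it covers just three keywords, and it cannot detect columns that do not exist.

```python
import re

KEYWORDS = ("genId", "dailySampleCount", "now")
CURLY_QUOTES = "\u2018\u2019\u201c\u201d"


def check_pattern(pattern: str) -> list:
    """Return a list of problems found by a few simple heuristics."""
    problems = []
    if any(ch in CURLY_QUOTES for ch in pattern):
        problems.append("curly quotes")
    if pattern.count("{") != pattern.count("}") or pattern.count("(") != pattern.count(")"):
        problems.append("mismatched braces or parentheses")
    if pattern.count("'") % 2:
        problems.append("mismatched quotes")
    # Drop every complete ${...} token, then look for keywords left outside.
    outside = re.sub(r"\$\{[^{}]*\}", "", pattern)
    for keyword in KEYWORDS:
        if re.search(rf"\b{keyword}\b", outside):
            problems.append(f"keyword outside braces: {keyword}")
    return problems


print(check_pattern("S-${genId}"))  # []
print(check_pattern("S-genId"))     # ['keyword outside braces: genId']
```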
Once a valid naming pattern is defined, users creating new samples or aliquots will be able to see an example name in a tooltip both when viewing the sample type details page and when creating new samples in a grid within the Sample Manager and Biologics applications.
Caution: Using Numbers-Only as Sample IDs
Note that while you could create or use sample names that are just strings of digits, you may run into issues if those "number-names" overlap with row numbers of other samples. In such a situation, when there is ambiguity between sample name and row ID, the system will presume that the user intends to use the value as the name.
Examples
Naming Pattern | Example Output | Description |
---|---|---|
S-${genId} | S-101 S-102 S-103 S-104 | S- + a simple sequence |
${Lab:defaultValue('Unknown')}_${genId} | Hanson_1 Hanson_2 Krouse_3 Unknown_4 | The originating Lab + a simple sequence. If the Lab value is null, then use the string 'Unknown'. |
S-${now:date}-${dailySampleCount} | S-20170202-1 S-20170202-2 S-20170202-3 S-20170202-4 | S- + the current date + "-" + daily resetting incrementing integer |
S-${Column1}-${Column2} | S-Plasma-P1 S-Plasma-P2 | Create an id from the letter 'S' and two values from the current row of data, separated by dashes. |
Example String Modifiers
The following naming patterns show usage of string modifiers.
Naming Pattern | Example Output | Description |
---|---|---|
S-${Column1}-${now:date}-${batchRandomId} | S-Blood-20170103-9001 | |
S-${Column1:suffix('-')}${Column2:suffix('-')}${batchRandomId} | S-Blood-PT101-5862 | |
${Column1:defaultValue('S')}-${now:date('yy-MM-dd')}-${randomId} | Blood-17-01-03-2370 S-17-01-03-1166 | ${Column1:defaultValue('S')} means 'Use the value of Column1, but if that is null, then use the default: the letter S' |
${DataInputs:first:defaultValue('S')}-${Column1} | Nucleotide1-5 S-6 | ${DataInputs:first:defaultValue('S')} means 'Use the first DataInput value, but if that is null, use the default: the letter S' |
${DataInputs:join('_'):defaultValue('S')}-${Column1} | Nucleotide1_Nucleotide2-1 | ${DataInputs:join('_'):defaultValue('S')} means 'Join together all of the DataInputs separated by underscores, but if that is null, then use the default: the letter S' |