Sample Naming Patterns

Documentation: Version 25.3

Each sample in a Sample Type must have a unique name/id. If your data does not already contain unique identifiers for each row, the system can generate them upon import using a Naming Pattern (previously referred to as a name expression) as part of the Sample Type definition.

Overview and Basic Examples

Administrators can customize how sample IDs are generated using templates that build a unique name from syntax elements including:

String constants
Incrementing numbers
Dates and partial dates
Values from the imported data file, such as tissue types, lab names, subject ids, etc.

For example, this simple naming pattern uses a string constant 'S', a dash, and an incrementing number:

S-${genId}

...to generate sample IDs:

S-1
S-2
S-3
and so on...

This example builds a sample ID from values in the imported data file, and trusts that the combination of these column values is always unique in order to keep sample IDs unique:

${ParticipantId}_${DrawDate}_${PortionNumber}

...which might generate IDs like:

123456_2021-2-28_1
123456_2021-2-28_2
123456_2021-2-28_3
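
As a rough illustration of how this substitution behaves, the following Python sketch expands ${ } tokens from a row of imported data and a counter. The function name expand_pattern and the regular expression are hypothetical and only model the behavior described above; this is not LabKey's implementation.

```python
import re

def expand_pattern(pattern, row, gen_id):
    """Replace each ${token} with a row value, or with the counter for genId."""
    def lookup(match):
        token = match.group(1)
        if token == "genId":
            return str(gen_id)
        return str(row[token])
    return re.sub(r"\$\{(\w+)\}", lookup, pattern)

# Three rows that differ only in PortionNumber, as in the example above.
rows = [
    {"ParticipantId": 123456, "DrawDate": "2021-2-28", "PortionNumber": n}
    for n in (1, 2, 3)
]
names = [
    expand_pattern("${ParticipantId}_${DrawDate}_${PortionNumber}", row, i)
    for i, row in enumerate(rows, start=1)
]
simple = [expand_pattern("S-${genId}", {}, i) for i in (1, 2, 3)]
print(names)
print(simple)
```

Note that the second pattern stays unique only because the counter changes on every row, while the first relies entirely on the data being unique.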

Naming Pattern Elements/Tokens

Naming patterns can incorporate the following elements. When surrounded by ${ } syntax, they will be substituted during sample creation; most can also accommodate various formatting options detailed below.

Name Element	Description	Scope
genId	An incrementing number starting from 1. This counter is specific to the individual Sample Type or Data Class. Not guaranteed to be continuous.	Current Sample Type / Data Class in a given container
sampleCount	A counter incrementing for all samples of all Sample Types, including aliquots. Not guaranteed to be continuous. Learn more below.	All Sample Types in the application (top level project plus any subfolders)
rootSampleCount	A counter incrementing for non-aliquot samples of all Sample Types. Not guaranteed to be continuous. Learn more below.	All Sample Types in the application (top level project plus any subfolders)
now	The current date, which you can format using string formatters.	Current Sample Type / Data Class
<SomeDataColumn>	Loads data from some field in the data being imported. For example, if the data being imported has a column named "ParticipantID", use the element/token "${ParticipantID}"	Current Sample Type / Data Class
dailySampleCount	An incrementing counter, starting with the integer '1', that resets each day. Can be used standalone or as a modifier.	All Sample Types / Data Classes on the site
weeklySampleCount	An incrementing counter, starting with the integer '1', that resets each week. Can be used standalone or as a modifier.	All Sample Types / Data Classes on the site
monthlySampleCount	An incrementing counter, starting with the integer '1', that resets each month. Can be used standalone or as a modifier.	All Sample Types / Data Classes on the site
yearlySampleCount	An incrementing counter, starting with the integer '1', that resets each year. Can be used standalone or as a modifier.	All Sample Types / Data Classes on the site
randomId	A four digit random number for each sample row. Note that this random number is not guaranteed to be unique.	Current Sample Type / Data Class
batchRandomId	A four digit random number applied to the entire set of incoming sample records. On each import event, this random batch number will be regenerated.	Current Sample Type / Data Class
Inputs	A collection of all DataInputs and MaterialInputs for the current sample. You can concatenate using one or more values from the collection.	Current Sample Type / Data Class
DataInputs	A collection of all DataInputs for the current sample. You can concatenate using one or more values from the collection.	Current Sample Type / Data Class
MaterialInputs	A collection of all MaterialInputs for the current sample. You can concatenate using one or more values from the collection.	Current Sample Type / Data Class

Format Number Values

You can use formatting syntax to control how the tokens are added. For example, "${genId}" generates an incrementing counter 1, 2, 3. If you use a format like the following, the incrementing counter will have three digits: 001, 002, 003.

${genId:number('000')}

Learn more about formatting numbers and date/time values in this topic: Date & Number Display Formats
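
For readers more familiar with Python, the effect of the '000' format on the counter can be approximated with zero padding. This is only a sketch of the output described above, assuming the format pads with leading zeros and never truncates; the helper name pad_counter is hypothetical.

```python
def pad_counter(value, width=3):
    """Zero-pad an integer counter to at least `width` digits."""
    return f"{value:0{width}d}"

padded = [pad_counter(n) for n in (1, 2, 3)]
print(padded)
```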

Include Default Values

When you use a data column in your naming pattern, you can specify a default to use if no value is provided. Use the defaultValue string modifier with the following syntax. The 'value' argument must be a string enclosed in single straight quotes ('). Double quotes or curly quotes will cause an error during sample creation.

${ColumnName:defaultValue('value')}

Use String Modifiers

Most naming pattern elements can be modified with string modifiers, including but not limited to:

:date - Use only the date portion of a datetime value.
:date('yy-MM-dd') - Use only the date portion of a datetime value and format it with the given format.
:suffix('-') - Apply the suffix shown in the argument if the value is not null.
:join('_') - Combine a collection of values with the given argument as separator.
:first - Use the first of a series of values.

The supported options are described in this topic:

String Expression Format Functions

Some string modifier examples:

Naming Pattern	Example Output	Description
S-${Column1}-${now:date}-${batchRandomId}	S-Blood-20170103-9001
S-${Column1:suffix('-')}${Column2:suffix('-')}${batchRandomId}	S-Blood-PT101-5862
${Column1:defaultValue('S')}-${now:date('yy-MM-dd')}-${randomId}	Blood-17-01-03-2370 S-17-01-03-1166	${Column1:defaultValue('S')} means 'Use the value of Column1, but if that is null, then use the default: the letter S'
${DataInputs:first:defaultValue('S')}-${Column1}	Nucleotide1-5 S-6	${DataInputs:first:defaultValue('S')} means 'Use the first DataInput value, but if that is null, use the default: the letter S'
${DataInputs:join('_'):defaultValue('S')}-${Column1}	Nucleotide1_Nucleotide2-1	${DataInputs:join('_'):defaultValue('S')} means 'Join together all of the DataInputs separated by underscores, but if that is null, then use the default: the letter S'
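
The way these modifiers chain can be modeled in a few lines of Python. This is a sketch under assumptions, not LabKey's code: the function names are hypothetical, a missing value is represented by None, and :suffix is assumed to contribute nothing when the value is null.

```python
def default_value(value, default):
    """Model of :defaultValue, falling back when the value is missing."""
    return default if value is None else value

def suffix(value, text):
    """Model of :suffix, appending text only when the value is not null."""
    return "" if value is None else f"{value}{text}"

def join(values, sep):
    """Model of :join, giving None when there is nothing to join."""
    return sep.join(values) if values else None

def first(values):
    """Model of :first, giving None for an empty collection."""
    return values[0] if values else None

# ${DataInputs:join('_'):defaultValue('S')}-${Column1}
with_inputs = f"{default_value(join(['Nucleotide1', 'Nucleotide2'], '_'), 'S')}-1"
# ${DataInputs:first:defaultValue('S')}-${Column1} with no DataInputs
no_inputs = f"{default_value(first([]), 'S')}-6"
# S-${Column1:suffix('-')}${Column2:suffix('-')}${batchRandomId}
both = f"S-{suffix('Blood', '-')}{suffix('PT101', '-')}5862"
one_missing = f"S-{suffix('Blood', '-')}{suffix(None, '-')}5862"
print(with_inputs, no_inputs, both, one_missing)
```

The last case shows why :suffix is useful: when Column2 is empty, the separator is dropped along with it rather than leaving a doubled dash.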

Incorporate Lineage Elements

To include properties of sample sources or parents in sample names, use lookups into the lineage of the sample.

Specific data type inputs: MaterialInputs/SampleType1/propertyA, DataInputs/DataClass2/propertyA, etc.
Import alias references: parentSampleA/propertyA, parentSourceA/propertyA, etc.
In some scenarios, you may be able to use a shortened syntax referring to an unambiguous parent property: Inputs/propertyA, MaterialInputs/propertyA, DataInputs/propertyA

This shortened syntax is not recommended. It works only when 'propertyA' exists for a single type of parent. Using a property common to many parents, such as 'Name', will produce unexpected results.

For example, suppose a blood sample was derived from a mouse and you want the mouse strain to appear in the sample name. To include this source metadata, the derived sample's naming pattern might look like:

Blood-${DataInputs/Mouse/Strain}-${genId}

You can use the qualifier :first to select the first of a given set of inputs when there might be several.

If there might be multiple parent samples, you could choose the first one in a naming pattern like this:

Blood-${parentBlood/SampleID:first}-${genId}

Include Ancestor Names/Properties Using "~"

To reference an ancestor name (or other property) at any depth of the ancestry, use syntax that includes a tilde. This will 'walk the lineage tree' to the named Source or Sample Type, up to a maximum depth of 20. Note that this type of syntax may be more resource intensive, so if you know that the ancestor will always be the direct parent or at another specific/consistent level, you should use another lineage lookup for efficiency.

For example, consider a "Participant" Source Type and also a Sample Type like "Blood" that could be either a direct 'child' of the source, or a grandchild (of an intermediate sample like "Tissue"), or any further descendant. You can include properties of the Participant source of a "Blood" sample with a naming pattern like this:

${~DataInputs/Participant/Name}
${~DataInputs/Participant/OtherProperty}

Similarly, for ancestor Sample Types at any depth, use syntax like this:

${~MaterialInputs/SampleTypeName/Name}
${~MaterialInputs/SampleTypeName/OtherProperty}

This syntax can be combined with other naming pattern elements, including counters as shown in this example. This will maintain a counter per Participant, regardless of the depth of tree where the sample is created:

${${~DataInputs/Participant/Name}-:withCounter}

Note that if the ancestor type appears multiple times in the lineage for a given sample, the "furthest" ancestor will be used.
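
The lookup rules above (walk upward, stop at depth 20, prefer the furthest match) can be sketched as a level-by-level walk. This is an illustrative model only: the dictionaries, the function name furthest_ancestor, and the choice of the first match within a level are assumptions, not LabKey internals.

```python
MAX_DEPTH = 20  # maximum lineage depth stated above

def furthest_ancestor(sample, parents, types, wanted_type):
    """Walk up the lineage one generation at a time.

    Returns the furthest ancestor of wanted_type, or None if there is none.
    """
    found = None
    level = [sample]
    for _ in range(MAX_DEPTH):
        level = [p for child in level for p in parents.get(child, [])]
        if not level:
            break
        matches = [a for a in level if types[a] == wanted_type]
        if matches:
            found = matches[0]  # a deeper match overwrites a nearer one
    return found

# Hypothetical lineage: Blood-7 is a grandchild, Blood-8 a direct child.
parents = {
    "Blood-7": ["Tissue-3"],
    "Tissue-3": ["PT-101"],
    "Blood-8": ["PT-102"],
    "Blood-9": ["Blood-5"],
    "Blood-5": ["Blood-1"],
}
types = {
    "Blood-7": "Blood", "Tissue-3": "Tissue", "PT-101": "Participant",
    "Blood-8": "Blood", "PT-102": "Participant",
    "Blood-9": "Blood", "Blood-5": "Blood", "Blood-1": "Blood",
}
print(furthest_ancestor("Blood-7", parents, types, "Participant"))
print(furthest_ancestor("Blood-9", parents, types, "Blood"))
```

The second call illustrates the "furthest" rule: Blood-9 has two Blood ancestors, and the one highest in the tree is selected.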

Include Grandparent Names/Properties Using ".."

If you know that the ancestor of interest is a fixed number of generations above the direct parent, i.e. the grandparent or great-grandparent generation, you can use ".." syntax to walk the tree to that specific level. Here are a few examples of syntax for retrieving names from the ancestry of a sample. For simplicity, each of these examples is shown followed by a basic ${genId} counter, but you can incorporate this syntax with other elements. Note that the shortened syntax available for first generation "parent" lineage lookup is not supported here. You must specify both the sample type and the "/name" field to use.

To use the name from a specific grandparent sample type, use two levels:

${MaterialInputs/CurrentParent/..[MaterialInputs/GrandParentSampleType]/name}-${genId}

To use another propertyColumn from a specific grandparent sample type:

${MaterialInputs/CurrentParent/..[MaterialInputs/GrandParentSampleType]/propertyColumn}-${genId}

You can use a parent alias for the immediate parent level, but not for any grandparent sample types or fields:

${parentAlias/..[MaterialInputs/GrandParentSampleType]/name}-${genId}

Compound this syntax to further "walk" the lineage tree to use a great-grandparent sample type and field:

${MaterialInputs/CurrentParent/..[MaterialInputs/GrandParentSampleType]/..[MaterialInputs/GreatGrandSampleType]/greatGrandFieldName}-${genId}

To define a naming pattern that uses the name of the grandparent of any type, you can omit the grandparent sample type name entirely. For example, if you had Plasma samples that might have any number of grandparent types, you could use the grandparent name using syntax like any of the following:

${plasmaParent/..[MaterialInputs]/name}-${genId}
${MaterialInputs/Plasma/..[MaterialInputs]/name}-${genId}
${MaterialInputs/..[MaterialInputs]/name}-${genId}

Names Containing Commas

It is possible to include commas in Sample and Bioregistry (Data Class) entity names, though doing so is not a best practice. Commas are used as sample name separators for lists of parent fields, import aliases, etc., so names containing commas have the potential to create ambiguities.

If you do use commas in your names, whether user-provided or LabKey-generated via a naming pattern, consider the following:

To add or update lineage via a file import, you will need to surround the name in quotes (e.g., "WC-1,3").
To add two parents, one with a comma, you would only quote the comma-containing name, thus the string would be: "WC-1,3",WC-4.
If you have commas in names, you cannot use a CSV or TSV file to update values. CSV files interpret the commas as separators and TSV files strip the quotes 'protecting' commas in names as well. Use an Excel file (.xlsx or .xls) when updating data for sample names that may include commas.
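
The quoting rule for a list of parents follows the common CSV convention, which can be demonstrated with Python's standard csv module. This only illustrates why the quotes matter; it is not the parser LabKey uses.

```python
import csv
import io

# Two parents, one of which contains a comma and is therefore quoted.
quoted = next(csv.reader(io.StringIO('"WC-1,3",WC-4')))
# Without quotes, the comma inside the name is read as a separator.
unquoted = next(csv.reader(io.StringIO("WC-1,3,WC-4")))
print(quoted)
print(unquoted)
```

The quoted string yields two parent names, while the unquoted string is split into three fragments, none of which is the intended name "WC-1,3".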

Naming Pattern Examples

Naming Pattern	Example Output	Description
${genId}	1 2 3 4	a simple sequence
S-${genId}	S-1 S-2 S-3 S-4	S- + a simple sequence
${Lab:defaultValue('Unknown')}_${genId}	Hanson_1 Hanson_2 Krouse_3 Unknown_4	The originating Lab + a simple sequence. If the Lab value is null, then use the string 'Unknown'.
S_${randomId}	S_3294 S_1649 S_9573 S_8843	S_ + random numbers
S_${now:date}_${dailySampleCount}	S_20170202_1 S_20170202_2 S_20170202_3 S_20170202_4	S_ + the current date + daily resetting incrementing integer
S_${Column1}_${Column2}	S_Blood_PT101	Create an id from the letter 'S' and two values from the current row of data, separated by underscore characters.
${OriginalCellLine/Name}-${genId}	CHO-1, CHO-2	Here OriginalCellLine is a lookup field to a table of originating cell lines. Use a slash to 'walk the lookup'. Useful when you are looking up an integer-keyed table but want to use a different display field.

Incorporate Sample Counters

Sample counters can provide an intuitive way to generate unique sample names. Options include using a site-wide, container-wide, date-specific, or other field value-specific counter for your sample names.

See and Set genId

genId is an incrementing counter which starts from 1 by default, and is maintained internally for each Sample Type or Data Class in each container. Note that while this value will increment, it is not guaranteed to be continuous. For example, any creations by file import will 'bump' the value of genId by 100, as the system is "holding" a batch of values to use.
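
One way to picture why gaps appear is a counter that reserves values in blocks. The sketch below is an assumed model, not LabKey's implementation: the class name BatchedCounter is hypothetical, and the block size of 100 is taken from the description above.

```python
class BatchedCounter:
    """Toy counter that reserves values in blocks for each file import."""

    def __init__(self, batch_size=100):
        self.stored = 0              # value persisted for the Sample Type
        self.batch_size = batch_size

    def import_file(self, row_count):
        """Reserve whole blocks and return the ids actually used."""
        start = self.stored + 1
        blocks = -(-row_count // self.batch_size)  # ceiling division
        self.stored += blocks * self.batch_size
        return list(range(start, start + row_count))

counter = BatchedCounter()
first_import = counter.import_file(3)
second_import = counter.import_file(2)
print(first_import, second_import)
```

Under this model, a three-row import uses only three values but advances the stored counter by a full block, so the next import starts well above the last name generated.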

When you include ${genId} in a naming pattern, you will see a blue banner indicating the current value of genId.

If desired, click Edit genId to set it to a higher value than it currently is. This action will reset the counter and cannot be undone.

As a special case, before any samples have been created, you'll see a Reset genId button as well as the Edit genId button. This can be used to revert the change to genId, but note that it is not available once any samples of this type have been created.

sampleCount Token

When you include ${sampleCount} as a token in your naming pattern, it will be incremented for every sample created in the application (a top-level project and any sub-projects it contains), including aliquots. This counter value is stored internally and continuously increments, regardless of whether it is used in naming patterns for the created samples.

For example, consider a system of naming patterns where the Sample Type (Blood, DNA, etc) is followed by the sampleCount token for all samples, and the default aliquot naming pattern is used. For "Blood" this would be:

Blood-${sampleCount}
${${AliquotedFrom}-:withCounter}

A series of new samples and aliquots using such a scheme might be named as follows:

Sample/Aliquot	Name	value of the sampleCount token
sample	Blood-1000	1000
aliquot	Blood-1000-1	1001
aliquot	Blood-1000-2	1002
sample	DNA-1003	1003
aliquot	DNA-1003-1	1004
sample	Blood-1005	1005

If desired, you could also use the sampleCount in the name of aliquots directly rather than incorporating the "AliquotedFrom" sample name in the aliquot name. For Blood, for example, the two naming patterns could be the same:

Blood-${sampleCount}  <- for samples
Blood-${sampleCount}  <- for aliquots

In this case, the same series of new samples and aliquots using this naming pattern convention would be named as follows:

Sample/Aliquot	Name
sample	Blood-1000
aliquot	Blood-1001
aliquot	Blood-1002
sample	DNA-1003
aliquot	DNA-1004
sample	Blood-1005

The count(s) stored in the sampleCount and rootSampleCount tokens are not guaranteed to be continuous or represent the total count of samples, much less the count for a given Sample Type, for a number of reasons including:

Addition of samples (and/or aliquots) of any type anywhere in the application will increment the token(s).
Any failed sample import would increment the token(s).
Import using merge would increment the token(s) by more than the number of new samples. Since we cannot tell if an incoming row is a new or existing sample for merge, the counter is incremented for all rows.

Administrators can see the current value of the sampleCount token on the Administration > Settings tab. A higher value can also be assigned to the token if desired. You could also use the :minValue modifier in a naming pattern to reset the count to a higher value.

rootSampleCount Token

When you include ${rootSampleCount} as a token in your naming pattern, it will be incremented for every non-aliquot (i.e. root) sample created in the application (a top-level project and any subprojects it contains). Creation of aliquots will not increment this counter, but creation of any Sample of any Sample Type will increment it.

For example, if you use the convention of using the Sample Type (Blood, DNA, etc.) followed by the rootSampleCount token for all samples, and the default aliquot naming pattern, for "Blood" this would be:

Blood-${rootSampleCount}
${${AliquotedFrom}-:withCounter}

A series of new samples and aliquots using this convention for all types might be named as follows:

Sample/Aliquot	Name	value of rootSampleCount
sample	Blood-100	100
aliquot	Blood-100-1	100
aliquot	Blood-100-2	100
sample	DNA-101	101
aliquot	DNA-101-1	101
sample	Blood-102	102
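
The difference between the sampleCount and rootSampleCount tables in this topic comes down to whether aliquots consume a counter value. The following Python sketch reproduces both tables under that single assumption. The function simulate and its arguments are hypothetical names, and each aliquot is assumed to derive from the most recent sample of its type.

```python
from collections import defaultdict

def simulate(events, start, count_aliquots):
    """Name a series of (kind, sample_type) events.

    count_aliquots=True models sampleCount; False models rootSampleCount.
    """
    counter = start
    per_parent = defaultdict(int)  # models ${${AliquotedFrom}-:withCounter}
    last_sample = {}               # most recent root sample name per type
    names = []
    for kind, sample_type in events:
        if kind == "sample":
            name = f"{sample_type}-{counter}"
            last_sample[sample_type] = name
            counter += 1
        else:
            parent = last_sample[sample_type]
            per_parent[parent] += 1
            name = f"{parent}-{per_parent[parent]}"
            if count_aliquots:
                counter += 1  # aliquots use up a value even if it is not shown
        names.append(name)
    return names

events = [("sample", "Blood"), ("aliquot", "Blood"), ("aliquot", "Blood"),
          ("sample", "DNA"), ("aliquot", "DNA"), ("sample", "Blood")]
print(simulate(events, 1000, count_aliquots=True))
print(simulate(events, 100, count_aliquots=False))
```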

The count stored in the rootSampleCount token is not guaranteed to be continuous or represent the total count of root samples for the same reasons as enumerated above for the sampleCount token.

Administrators can see the current value of the rootSampleCount token on the Administration > Settings tab. A higher value can also be assigned to the token if desired. You could also use the :minValue modifier in a naming pattern to reset the count to a higher value.

Date Based Counters

Several sample counters are available that will be incremented based on the date when the sample is inserted. These counters are incrementing, but since they apply to all sample types and data classes across the site, within a given sample type or data class, values will be sequential but not necessarily contiguous.

dailySampleCount
weeklySampleCount
monthlySampleCount
yearlySampleCount

All of these counters can be used in either of the following ways:

As standalone elements of a naming pattern, i.e. ${dailySampleCount}, in which case they will provide a counter across all sample types and data classes based on the date of creation.
As modifiers of another date column using a colon, i.e. ${SampleDate:dailySampleCount}, in which case the counter applies to the value in the named column ("SampleDate") and not the date of creation.

Do not use both "styles" of date based counter in a single naming pattern. While doing so may pass the name validation step, such patterns will not successfully generate sample names.
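
Both styles can be thought of as a counter keyed by a date: the creation date for the standalone form, or the value of the named column for the modifier form. The sketch below models only that idea for a daily counter; the function name name_for and the S_ prefix are illustrative, and it makes no claim about how LabKey stores these counters.

```python
from collections import defaultdict
from datetime import date

def name_for(day, counts):
    """Build S_<yyyyMMdd>_<n>, where n restarts at 1 for each new day."""
    counts[day] += 1
    return f"S_{day:%Y%m%d}_{counts[day]}"

counts = defaultdict(int)  # one counter per calendar day
daily_names = [
    name_for(date(2017, 2, 2), counts),
    name_for(date(2017, 2, 2), counts),
    name_for(date(2017, 2, 3), counts),
]
print(daily_names)
```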

:withCounter Modifier

Another alternative for adding a counter to a field is to use :withCounter, a nested substitution syntax allowing you to add a counter specific to another column value or combination of values. Using :withCounter will always guarantee unique values, meaning that if a name with the counter would match an existing sample (perhaps named in another way), that counter will be skipped until a unique name can be generated.

The nested substitution syntax for using :withCounter is to attach it to an expression (such as a column name) that will be evaluated/substituted first, then surround the outer modified expression in ${ } brackets so that it too will be evaluated at creation time.

This modifier is particularly useful when naming aliquots which incorporate the name of the parent sample, and the desire is to provide a counter for only the aliquots of that particular sample. The default naming pattern for creating aliquots combines the value in the AliquotedFrom column (the originating Sample ID), a dash, and a counter specific to that Sample ID:

${${AliquotedFrom}-:withCounter}

You could also use this modifier with another column name as well as strings in the inner expression. For example, if a set of Blood samples includes a Lot letter in their name, and you want to add a counter by lot to name these samples, names like Blood-A-1, Blood-A-2, Blood-B-1, Blood-B-2, etc. would be generated with this expression. The string "Blood" is followed by the value in the Lot column. This combined expression is evaluated, and then a counter is added:

${Blood-${Lot}-:withCounter}

Use caution to apply the nested ${ } syntax correctly. The :withCounter modifier applies only to the expression inside the same brackets. The following naming pattern looks similar to the one above, but it keeps a single counter for the constant string "Blood" and ignores the Lot letter, producing names like "A-Blood-1, A-Blood-2, B-Blood-3, B-Blood-4":

${Lot}-${Blood-:withCounter}

This modifier can be applied to a combination of column names. For example, if you wanted a counter of the samples taken from a specific Lot on a specific Date (using only the date portion of a "Date" value), you could obtain names like 20230522-A-1, 20230522-A-2, 20230523-A-1, etc. with a pattern like:

${${Date:date}-${Lot}-:withCounter}

You can further provide a starting value and number format with this modifier. For example, to have a three digit counter starting at 42, (i.e. S-1-042, S-1-043, etc.) use:

${${AliquotedFrom}-:withCounter(42,'000')}
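
The behavior described in this section (one counter per evaluated inner expression, skipping values that would collide with existing names, plus an optional start value and width) can be modeled as follows. The class WithCounter is a hypothetical sketch, not LabKey's implementation.

```python
from collections import defaultdict

class WithCounter:
    """Toy model of :withCounter, keeping one counter per evaluated prefix."""

    def __init__(self, existing_names=(), start=1, width=1):
        self.existing = set(existing_names)
        self.next_value = defaultdict(lambda: start)
        self.width = width

    def name(self, prefix):
        """Append the next counter value that yields an unused name."""
        while True:
            value = self.next_value[prefix]
            self.next_value[prefix] += 1
            candidate = f"{prefix}{value:0{self.width}d}"
            if candidate not in self.existing:
                self.existing.add(candidate)
                return candidate

lots = ["A", "A", "B", "B"]
nested = WithCounter()
by_lot = [nested.name(f"Blood-{lot}-") for lot in lots]       # ${Blood-${Lot}-:withCounter}
misplaced = WithCounter()
by_string = [f"{lot}-" + misplaced.name("Blood-") for lot in lots]  # ${Lot}-${Blood-:withCounter}
print(by_lot)
print(by_string)
```

Comparing the two lists shows the effect of bracket placement: the counter follows whatever text sits inside the brackets that carry the modifier.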

:minValue Modifier

Tokens including genId, sampleCount, and rootSampleCount can be reset to a new higher 'base' value by including the :minValue modifier. For example, to reset sampleCount to start counting at a base of 100, use a naming pattern with the following syntax:

S-${sampleCount:minValue(100)}

If you wanted to also format the count value, you could combine the minValue modifier with a number formatter like this, to make the count start from 100 and be four digits:

S-${sampleCount:minValue(100):number('0000')}

Note that once you've used this modifier to set a higher 'base' value for genId, sampleCount, or rootSampleCount, that value will be 'sticky' in that the internally stored counter will be set at that new base. If you later remove the minValue modifier from the naming pattern, the count will not 'revert' to any lower value. This behavior does not apply to using the :minValue modifier on other naming pattern tokens, where the other token will not retain or apply the previous higher value if it is removed from the naming pattern.
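
The 'sticky' behavior can be pictured with a stored counter that only ever moves upward. This is a minimal sketch under the assumption that the first value issued equals the minValue argument; the class StickyCounter is a hypothetical name.

```python
class StickyCounter:
    """Toy model of a stored counter raised by :minValue."""

    def __init__(self):
        self.stored = 0  # last value handed out

    def next(self, min_value=None):
        """Return the next value, jumping up to min_value if it is higher."""
        value = self.stored + 1
        if min_value is not None:
            value = max(value, min_value)
        self.stored = value
        return value

counter = StickyCounter()
sequence = [
    counter.next(),               # pattern without :minValue
    counter.next(min_value=100),  # :minValue(100) added to the pattern
    counter.next(min_value=100),
    counter.next(),               # :minValue removed again
]
print(sequence)
```

After the modifier is removed, the sequence continues from the raised value rather than returning to the original range.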

Advanced Options: XML Counter Columns

Another way to create a sequential counter based on values in other columns, such as an incrementing series for each lot of material on a plate, is to use XML metadata to define a new "counter" column paired with one or more existing columns to provide and store this incrementing value. You then use the new column in your naming pattern.

For example, consider that you would like to generate the following samples in a sample type for a given plate:

Name (from expression)	Lot	SampleInLot	Plate Id
Blood-A-1	A	1	P1234
Blood-A-2	A	2	P1234
Blood-B-1	B	1	P1234
Blood-B-2	B	2	P1234

The "SampleInLot" column contains a value that increments from 1 independently for each value in the "Lot" column. In this example, the naming pattern would be Blood-${Lot}-${SampleInLot}. To define the "SampleInLot" column, you would create it with data type integer, then use XML metadata similar to this:

<tables xmlns="http://labkey.org/data/xml">
  <table tableName="MySampType" tableDbType="NOT_IN_DB">
    <javaCustomizer class="org.labkey.experiment.api.CountOfUniqueValueTableCustomizer">
        <properties>
            <property name="counterName">SampleCounter</property>
            <property name="counterType">org.labkey.api.data.UniqueValueCounterDefinition</property>
            <!-- one or more pairedColumns used to derive the unique value -->
            <property name="pairedColumn">Lot</property>
            <!-- one or more attachedColumns where the incrementing counter value is placed -->
            <property name="attachedColumn">SampleInLot</property>
        </properties>
    </javaCustomizer>
  </table>
</tables>

The following rules apply to these XML counters:

The counter you create is scoped to a single Sample Type.
The definition of the counter column uses one or more columns to determine the unique value. When you use the counter in a naming pattern, be sure to also include all the paired columns in addition to your counter column to ensure the sample name generated will be unique.
The incrementing counter is maintained within the LabKey database.
Gaps in the counter's sequence are possible because it is incremented outside the transaction. It will be sequential, but not necessarily contiguous.
When you insert into a sample type with a unique counter column defined, you can supply a value for this unique counter column, but the value must be equal to or less than the current counter value for the given paired column.
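
The rules above can be pictured with a small model of a counter paired with the Lot column. This sketch is an assumption-laden illustration, not the behavior of the Java customizer itself: the class PairedCounter is hypothetical, and it treats a supplied value above the current counter as an error.

```python
from collections import defaultdict

class PairedCounter:
    """Toy model of a counter column paired with one other column."""

    def __init__(self):
        self.current = defaultdict(int)  # current value per paired-column value

    def assign(self, lot, supplied=None):
        """Return the counter value to store for a new row in this lot."""
        if supplied is None:
            self.current[lot] += 1
            return self.current[lot]
        if supplied > self.current[lot]:
            raise ValueError("supplied value exceeds the current counter value")
        return supplied

counter = PairedCounter()
sample_in_lot = [counter.assign(lot) for lot in ["A", "A", "B", "B"]]
plate_names = [f"Blood-{lot}-{n}" for lot, n in zip(["A", "A", "B", "B"], sample_in_lot)]
print(plate_names)
```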

Naming Pattern Validation

During creation of a Sample Type, both sample and aliquot naming patterns will be validated. While developing your naming pattern, the admin can hover over the icon for a tooltip containing either an example name or an indication of a problem.

When you click Finish Creating/Updating Sample Type, you will see a banner about any syntax errors and have the opportunity to correct them.

Errors reported include:

Invalid substitution tokens (i.e. columns that do not exist or misspellings in syntax like ":withCounter").
Keywords like genId, dailySampleCount, now, etc. included without being enclosed in braces.
Mismatched or missing quotes, curly braces, and/or parentheses in patterns and formatting.
Use of curly quotes, when straight quotes are required. This can happen when patterns are pasted from some other applications.
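
If you want to pre-screen patterns before pasting them in, a few of the error classes listed above are easy to approximate. The checker below is a rough, hypothetical sketch: it covers only a subset of keywords, uses simple counting rather than real parsing, and does not know which columns exist, so it is not a substitute for the built-in validation.

```python
import re

KEYWORDS = ("genId", "dailySampleCount", "now")  # subset of the tokens in this topic

def check_pattern(pattern):
    """Return a list of likely problems found in a naming pattern."""
    problems = []
    if any(ch in pattern for ch in "\u2018\u2019\u201c\u201d"):
        problems.append("curly quotes")
    if pattern.count("${") != pattern.count("}"):
        problems.append("mismatched braces")
    if pattern.count("(") != pattern.count(")"):
        problems.append("mismatched parentheses")
    if pattern.count("'") % 2:
        problems.append("mismatched quotes")
    # Strip ${...} groups from the inside out, then look for bare keywords.
    remainder = pattern
    while True:
        stripped = re.sub(r"\$\{[^{}]*\}", "", remainder)
        if stripped == remainder:
            break
        remainder = stripped
    for word in KEYWORDS:
        if re.search(rf"\b{word}\b", remainder):
            problems.append(f"{word} not enclosed in braces")
    return problems

print(check_pattern("S-${genId:number('000')}"))
print(check_pattern("S-genId"))
```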

Once a valid naming pattern is defined, users creating new samples or aliquots will be able to see an Example name in a tooltip both when viewing the sample type details page and when creating new samples in a grid within the Sample Manager and Biologics applications.

Caution: When Names Consist Only of Digits

Note that while you could create or use names/naming patterns that result in only digits, you may run into issues if those "number-names" overlap with row numbers of other entities. In such a situation, when there is ambiguity between name and row ID, the system will presume that the user intends to use the value as the name.