The LabKey Server platform uses the emerging LSID standard (http://www.omg.org/cgi-bin/doc?dtc/04-05-01.pdf) for identifying entities in the database, such as experiment and protocol definitions. LSIDs are a specific form of URN (Universal Resource Name). Entities in the database will have an associated LSID field that contains a unique name to identify the entity.

Constructing LSIDs

LSIDs are multi-part strings with the parts separated by colons. They are of the form:

urn:lsid:<AuthorityID>:<NamespaceID>:<ObjectID>:<RevisionID>

The variable portions of the LSID are set as follows:

  • <AuthorityID>: An Internet domain name
  • <NamespaceID>: A namespace identifier, unique within the authority
  • <ObjectID>: An object identifier, unique within the namespace
  • <RevisionID>: An optional version string
An example LSID might look like the following:

urn:lsid:genologics.com:Experiment.pub1:Project.77.3

LSIDs are a solution to a difficult problem: how to identify entities unambiguously across multiple systems. While LSIDs tend to be long strings, they are generally easier to use than other approaches to the identifier problem, such as large random numbers or Globally Unique IDs (GUIDs). LSIDs are easier to use because they are readable by humans, and because the LSID parts can be used to encode information about the object being identified.

Note: Since LSIDs are a form of URN, they should adhere to the character set restrictions for URNs (see http://www.zvon.org/tmRFC/RFC2141/Output/index.html). LabKey Server complies with these restrictions by URL encoding the parts of an LSID prior to storing it in the database. This means that most characters other than letters, numbers and the underscore character are converted to their hex code format. For example, a forward slash "/" becomes "%2F" in an LSID. For this reason it is best to avoid these characters in LSIDs.

The LabKey Server system both generates LSIDs and accepts LSID-identified data from other systems. When LSIDs are generated by other systems, LabKey Server makes no assumptions about the format of the LSID parts. External LSIDs are treated as an opaque identifier to store and retrieve information about a specific object. LabKey Server does, however, have specific uses for the sub-parts of LSIDs that are created on the LabKey Server system during experiment load.

Once issued, LSIDs are intended to be permanent. The LabKey Server system adheres to this rule by creating LSIDs only on insert of new object records. There is no function in LabKey Server for updating LSIDs once created. LabKey Server does, however, allow deletion of objects and their LSIDs.

AuthorityID

The Authority portion of an LSID is akin to the "issuer" of the LSID. In LabKey Server, the default authority for LSIDs created by the LabKey Server system is set via the Customize Site page on the Admin Console page. Normally this should be set to the host portion of the address by which users connect to the LabKey Server instance, such as proteomics.fhcrc.org.

Note: According to the LSID specification, an Authority is responsible for responding to metadata queries about an LSID. To do this, an Authority would implement an LSID resolution service, of which there are three variations. The LabKey Server system does not currently implement a resolution service, though the design of LabKey Server is intended to make it straightforward to build such a service in the future.

NamespaceID

The Namespace portion of an LSID specifies the context in which a particular ObjectID is unique. Its uses are specific to the authority. LSIDs generated by the LabKey Server system use this portion of the LSID to designate the base object type referred to by the LSID (for example, Material or Protocol.) LabKey LSIDs also usually append a second namespace term (a suffix) that is used to ensure uniqueness when the same object might be loaded multiple times on the same LabKey Server system. Protocol descriptions, for example, often have a folder scope LSID that includes a namespace suffix with a number that is unique to the folder in which the protocol is loaded.

ObjectID

The ObjectID part of an LSID is the portion that most closely corresponds to the "name" of the object. This portion of the LSID is entirely up to the user of the system. ObjectIDs often include usernames, dates, or file names so that it is easier for users to remember what the LSID refers to. All objects that have LSIDs also have a Name property that commonly translates into the ObjectID portion of the LSID. The Name property of an object serves as the label for the object on most LabKey Server pages. It's a good idea to replace special characters such as spaces and punctuation characters with underscores or periods in the ObjectID.

RevisionID

LabKey Server does not currently generate RevisionIDs in LSIDs, but can accept LSIDs that contain them.

LSID Example

Here is an example of a valid LabKey LSID:

urn:lsid:labkey.org:Protocol.Folder-2994:SamplePrep.Biotinylation

This LSID identifies a specific protocol for a procedure called biotinylation. This LSID was created on a system with the LSID authority set to labkey.org. The namespace portion indicates that Protocol is the base type of the object, and the suffix value of Folder-2994 is added so that the same protocol can be loaded in multiple folders without a key conflict (see the discussion on substitution templates below). The ObjectId portion of the LSID can be named in whatever way the creator of the protocol chooses. In this example, the two-part ObjectId is based on a sample preparation stage (SamplePrep), of which one specific step is biotinylation (Biotinylation).

Discussion

previousnext
 
expand all collapse all