Compounds and SMILES Lookups

2024-03-28

Premium Feature — Available with LabKey Biologics LIMS. Learn more or contact LabKey.

The Biologics LIMS Registry includes a Compound registry source type to represent data in the form of Simplified Molecular Input Line Entry System (SMILES) strings, their associated 2D structures, and calculated physical properties. This data is stored in LabKey Biologics not as a system of record, but to support analysis needs of scientists receiving unfamiliar material, analytical chemists, structural biologists, and project teams.

When a user is viewing a Compound in the Bioregistry, they can access 2D chemical structure images and basic calculations like molecular weight. A field of the custom type "SMILES" takes string input and returns the an associated 2D image file and calculations to be stored as part of the molecule and displayed in registry grids.

The SMILES information succinctly conveys useful information about the structure(s) received when shared with others, helping structural biologists quickly view/reference the Compound structure and properties while trying to model a ligand. Analytical chemists can use the Compound calculated physical properties for accurate measurements and calculations. For many project team members, the SMILES structure is often used in reports and presentations as well as to plan future work.

SMILES Lookup Field

The Compound registry source type uses a custom SMILES field type only available in the Biologics module for this specific registry source. This datatype enables users to provide a SMILES string, e.g. "C1=CC=C(C=C1)C=O", that will return a 2D structure image, molecular weight and other computed properties.

The SMILES string is used to search the Java library CDK ("Chemistry Development Kit") , a set of open source modular Java libraries for Cheminformatics. This library is used to generate the Structure2D image and calculate masses.

Create/Import Compounds

New Compounds can be created manually or via file import. It's easier to get started understanding the lookup process by creating a single compound in the user interface.

Create Compound: Carbon Dioxide

As an example, you could create a new compound, supplying the SMILES string "O=C=O" and the Common Name "carbon dioxide".

The SMILES string will be used to populate the Structure2D, Molecular Formula, Average Mass, and Monoisotopic Mass columns.

Click the thumbnail for a larger version of the Structure2D image. You can download the image from the three-dot menu.

Import File of SMILES Strings

When a file of SMILES strings is Created/imported, each is used to query for the respective 2D structure, molecular weight and set of computed properties. During the Create/Import operation if Name/ID isn’t specified, the SMILES string is used for Name.

Related Topics