The Biologics LIMS Registry includes a Compound registry source type to represent data in the form of Simplified Molecular Input Line Entry System (SMILES) strings, their associated 2D structures, and calculated physical properties. This data is stored in LabKey Biologics not as a system of record, but to support the analysis needs of scientists receiving unfamiliar material, analytical chemists, structural biologists, and project teams.
When viewing a Compound in the Bioregistry, users can access 2D chemical structure images and basic calculations such as molecular weight. A field of the custom type "SMILES" takes a string as input and returns an associated 2D image file and calculations, which are stored as part of the molecule and displayed in registry grids.
A SMILES string succinctly conveys the structure(s) received when material is shared with others, helping structural biologists quickly view and reference the Compound structure and properties while modeling a ligand. Analytical chemists can use the Compound's calculated physical properties for accurate measurements and calculations. Many project team members also use the structure in reports and presentations, as well as to plan future work.
SMILES Lookup Field
The Compound registry source type uses a custom SMILES field type that is only available in the Biologics module for this specific registry source. This data type lets users provide a SMILES string, e.g. "C1=CC=C(C=C1)C=O", which returns a 2D structure image, molecular weight, and other computed properties.
The SMILES string is used to query the Java library CDK ("Chemistry Development Kit"), a set of open source modular Java libraries for cheminformatics. This library is used to generate the Structure2D image and calculate masses.
New Compounds can be created manually or via file import. The lookup process is easiest to understand by first creating a single compound in the user interface.
Create Compound: Carbon Dioxide
As an example, you could create a new compound, supplying the SMILES string "O=C=O" and the Common Name "carbon dioxide".
The SMILES string will be used to populate the Structure2D, Molecular Formula, Average Mass, and Monoisotopic Mass fields.
Click the thumbnail for a larger version of the Structure2D image. You can download the image from the three-dot menu.
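The Average Mass and Monoisotopic Mass fields can differ slightly for the same compound. The Python sketch below illustrates why, using the carbon dioxide composition from the example. It is not how LabKey Biologics or CDK computes these values: the `mass` helper and the two lookup tables are hypothetical names introduced only for this illustration, and the composition is entered by hand rather than parsed from the SMILES string.

```python
# Illustrative only: LabKey Biologics derives these values via CDK, not this code.

# Standard (abundance-weighted) atomic weights, in daltons.
AVERAGE_MASS = {"H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999}

# Mass of the most abundant isotope of each element, in daltons.
MONOISOTOPIC_MASS = {
    "H": 1.00782503,   # 1H
    "C": 12.0,         # 12C (exact by definition)
    "N": 14.00307401,  # 14N
    "O": 15.99491462,  # 16O
}


def mass(atom_counts, table):
    """Sum per-element masses for a composition such as {"C": 1, "O": 2}."""
    return sum(table[element] * count for element, count in atom_counts.items())


# Composition of carbon dioxide, the molecule described by "O=C=O".
carbon_dioxide = {"C": 1, "O": 2}

print(f"Average mass:      {mass(carbon_dioxide, AVERAGE_MASS):.3f}")
print(f"Monoisotopic mass: {mass(carbon_dioxide, MONOISOTOPIC_MASS):.5f}")
```

The average mass weights every naturally occurring isotope by abundance, while the monoisotopic mass uses only the most abundant isotope of each element, so the two values sit close together but are not identical.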
Import File of SMILES Strings
When a file of SMILES strings is imported, each string is used to query for its 2D structure, molecular weight, and set of computed properties. If a Name/ID isn't specified during the create or import operation, the SMILES string is used as the Name.
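To preview how an import file will be named before uploading it, the naming fallback described above can be mirrored in a short Python sketch. The tab-separated layout, the `Name` and `SMILES` column headers, and the `prepare_rows` helper are assumptions made for illustration. They are not taken from the product's import template, so check the actual template for the expected columns.

```python
import csv
import io


def prepare_rows(file_obj):
    """Read rows and fall back to the SMILES string when Name is blank."""
    rows = []
    for row in csv.DictReader(file_obj, delimiter="\t"):
        smiles = (row.get("SMILES") or "").strip()
        if not smiles:
            continue  # nothing to look up for this row
        # Mirror the documented behavior: a missing Name defaults to the SMILES string.
        name = (row.get("Name") or "").strip() or smiles
        rows.append({"Name": name, "SMILES": smiles})
    return rows


# Hypothetical two-row file; the second row leaves Name empty.
sample = io.StringIO(
    "Name\tSMILES\n"
    "carbon dioxide\tO=C=O\n"
    "\tC1=CC=C(C=C1)C=O\n"
)
for row in prepare_rows(sample):
    print(row["Name"], row["SMILES"], sep="\t")
```

Reviewing the output first makes it easy to spot rows that would end up with a SMILES string as their Name.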