The Biologics LIMS Registry includes a Compound registry source type to represent data in the form of Simplified Molecular Input Line Entry System (SMILES) strings, their associated 2D structures, and calculated physical properties. This data is stored in LabKey Biologics not as a system of record, but to support analysis needs of scientists receiving unfamiliar material, analytical chemists, structural biologists, and project teams.
When a user is viewing a Compound in the Bioregistry, they can access 2D chemical structure images and basic calculations like molecular weight. A field of the custom type "SMILES" takes string input and returns an associated 2D image file and calculations, which are stored as part of the molecule and displayed in registry grids.
A SMILES string succinctly conveys the structure(s) received when shared with others, helping structural biologists quickly view and reference the Compound structure and properties while modeling a ligand. Analytical chemists can use the Compound's calculated physical properties for accurate measurements and calculations. Project team members often use the SMILES structure in reports and presentations, as well as to plan future work.
SMILES Lookup Field
The Compound registry source type uses a custom SMILES field type only available in the Biologics module for this specific registry source. This datatype enables users to provide a SMILES string, e.g. "C1=CC=C(C=C1)C=O", that will return a 2D structure image, molecular weight and other computed properties.
The SMILES string is used to search the Java library CDK ("Chemistry Development Kit"), a set of open source modular Java libraries for Cheminformatics. This library is used to generate the Structure2D image and calculate masses.
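To make the lookup less abstract, here is a minimal Python sketch of how a molecular formula can be derived from a SMILES string. It is not CDK and does not reflect how LabKey calls CDK; it is a stand-in that handles only a tiny subset of SMILES (uppercase C, N, and O atoms, explicit bonds, branches, and single-digit ring closures). The function name molecular_formula and the lookup tables are hypothetical.

```python
from collections import Counter

# Illustrative subset only (assumption): default valences for three atoms.
VALENCE = {"C": 4, "N": 3, "O": 2}
BOND_ORDER = {"-": 1, "=": 2, "#": 3}


def molecular_formula(smiles: str) -> str:
    """Return a Hill-order formula for a very small subset of SMILES."""
    elements = []   # element symbol of each atom, in order of appearance
    bonded = []     # sum of explicit bond orders on each atom
    branches = []   # atom to return to when ")" closes a branch
    rings = {}      # open ring-closure digit -> (atom index, bond order)
    prev = None     # index of the atom the next atom bonds to
    order = 1       # order of the pending bond (single unless stated)

    for ch in smiles:
        if ch in VALENCE:
            elements.append(ch)
            bonded.append(0)
            cur = len(elements) - 1
            if prev is not None:
                bonded[prev] += order
                bonded[cur] += order
            prev, order = cur, 1
        elif ch in BOND_ORDER:
            order = BOND_ORDER[ch]
        elif ch == "(":
            branches.append(prev)
        elif ch == ")":
            prev = branches.pop()
        elif ch.isdigit():
            if ch in rings:
                # Second occurrence of the digit closes the ring.
                other, opened = rings.pop(ch)
                ring_order = max(opened, order)
                bonded[prev] += ring_order
                bonded[other] += ring_order
            else:
                rings[ch] = (prev, order)
            order = 1
        else:
            raise ValueError(f"unsupported SMILES character: {ch!r}")

    counts = Counter(elements)
    # Fill the remaining valence of each atom with implicit hydrogens.
    hydrogens = sum(max(VALENCE[e] - b, 0) for e, b in zip(elements, bonded))
    if hydrogens:
        counts["H"] = hydrogens

    # Hill order: C, then H, then the rest alphabetically
    # (fully alphabetical when there is no carbon).
    first = ["C", "H"] if "C" in counts else []
    symbols = [s for s in first if s in counts]
    symbols += sorted(s for s in counts if s not in first)
    return "".join(s + (str(counts[s]) if counts[s] > 1 else "") for s in symbols)


print(molecular_formula("O=C=O"))             # CO2
print(molecular_formula("C1=CC=C(C=C1)C=O"))  # C7H6O
```

A real toolkit such as CDK also handles aromatic (lowercase) atoms, bracket atoms, charges, isotopes, and stereochemistry, none of which this sketch attempts.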
Create/Import Compounds
New Compounds can be created manually or via file import. The easiest way to understand the lookup process is to start by creating a single compound in the user interface.
Create Compound: Carbon Dioxide
As an example, you could create a new compound, supplying the SMILES string "O=C=O" and the Common Name "carbon dioxide".
The SMILES string will be used to populate the Structure2D, Molecular Formula, Average Mass, and Monoisotopic Mass columns.
Click the thumbnail for a larger version of the Structure2D image. You can download the image from the three-dot menu.
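The difference between the Average Mass and Monoisotopic Mass columns can be checked by hand for carbon dioxide. The Python sketch below is illustrative only: the atomic masses are rounded reference values entered here as assumptions, and the values CDK reports may differ in the last digits. The helper name mass is hypothetical.

```python
# Abridged standard atomic weights (weighted over natural isotopes).
AVERAGE_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}

# Mass of the most abundant isotope of each element (12C, 1H, 14N, 16O).
MONOISOTOPIC_MASS = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.00307400,
    "O": 15.99491462,
}


def mass(counts: dict, table: dict) -> float:
    """Sum atomic masses over the element counts of a molecular formula."""
    return sum(table[element] * n for element, n in counts.items())


co2 = {"C": 1, "O": 2}  # molecular formula CO2, from SMILES "O=C=O"
average = mass(co2, AVERAGE_MASS)
monoisotopic = mass(co2, MONOISOTOPIC_MASS)

print(f"Average mass:      {average:.3f}")       # 44.009
print(f"Monoisotopic mass: {monoisotopic:.5f}")  # 43.98983
```

The average mass is the value to use for weighing out bulk material, while the monoisotopic mass corresponds to the peak expected for the lightest isotopologue in mass spectrometry.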
Import File of SMILES Strings
When a file of SMILES strings is imported, each string is used to look up its 2D structure, molecular weight, and set of computed properties. If Name/ID isn't specified during the Create/Import operation, the SMILES string is used as the Name.
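The naming fallback described above can be sketched in Python as follows. This is not LabKey's import code; it only mimics the documented behavior on a tab-separated file. The column headers "Name" and "SMILES" and the function name prepare_rows are assumptions for illustration, and the actual import template may use different headers.

```python
import csv
import io


def prepare_rows(tsv_text: str) -> list:
    """Parse tab-separated rows and default a blank Name to the SMILES string."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    rows = []
    for row in reader:
        smiles = (row.get("SMILES") or "").strip()
        if not smiles:
            continue  # nothing to look up for this row
        # Fall back to the SMILES string when no name is supplied.
        name = (row.get("Name") or "").strip() or smiles
        rows.append({"Name": name, "SMILES": smiles})
    return rows


sample = (
    "Name\tSMILES\n"
    "carbon dioxide\tO=C=O\n"
    "\tC1=CC=C(C=C1)C=O\n"  # no name given
)
rows = prepare_rows(sample)
for row in rows:
    print(row["Name"], "|", row["SMILES"])
```

In this sample the second compound has no name, so it is listed under its SMILES string, C1=CC=C(C=C1)C=O.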