This topic shows how to register a new protein sequence using the graphical user interface. To register using the API, or to bulk import sequences from an Excel spreadsheet, see Use the Registry API.
You can enter the Protein Sequence wizard in a number of ways:
- Via the nucleotide sequence wizard. When registering a nucleotide sequence, you have the option of continuing on to register the corresponding protein sequence.
- Via the header bar. Select Registry > Protein Sequences.
Protein Sequence Wizard
The wizard for registering a new protein sequence proceeds through four tabs:
- Description: (Optional) A text description of the sequence
- Alias: (Optional) List one or more aliases. Type a name and press Enter when complete. Continue to add more as needed.
- Protein Sequence Parents: (Optional) List parent component(s) for this sequence. Start typing to narrow the pulldown menu of options.
- Organisms: (Optional) Start typing the organism name to narrow the pulldown menu of options. Multiple values are accepted.
- Seq Part: (Optional) Indicates this sequence can be used as part of a larger sequence. Accepted values are 'Leader', 'Linker', and 'Tag'. When set, chain format must be set to 'SeqPart'.
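The Seq Part rule above can be read as a small consistency check. The following is a minimal sketch of that rule only; the function name `check_seq_part` and its return shape are hypothetical and are not part of the product or its API.

```python
from typing import List, Optional

# Values taken from the field description above.
ACCEPTED_SEQ_PARTS = {"Leader", "Linker", "Tag"}
REQUIRED_CHAIN_FORMAT = "SeqPart"


def check_seq_part(seq_part: Optional[str], chain_format: Optional[str]) -> List[str]:
    """Return a list of problems; an empty list means the values are consistent.

    Hypothetical helper for illustration only.
    """
    problems: List[str] = []
    if seq_part is None:
        # Seq Part is optional, so nothing to check when it is not set.
        return problems
    if seq_part not in ACCEPTED_SEQ_PARTS:
        problems.append(f"Seq Part must be one of {sorted(ACCEPTED_SEQ_PARTS)}")
    if chain_format != REQUIRED_CHAIN_FORMAT:
        problems.append(f"Chain format must be '{REQUIRED_CHAIN_FORMAT}' when Seq Part is set")
    return problems


print(check_seq_part("Tag", "SeqPart"))
print(check_seq_part("Tag", "Heavy"))
```

The first call reports no problems; the second reports that the chain format does not match.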
On the sequence tab, you can translate a protein sequence from a nucleotide sequence as outlined below. If you prefer to manually enter a protein sequence from scratch, click Manually add a sequence at the bottom.
- Nucleotide Sequence: (Optional) The selection made here will populate the left-hand text box with the nucleotide sequence.
- Translation Frame: (Required) The nucleotide sequence is translated into the protein sequence (shown in the right-hand text box) by parsing it into groups of three. The translation frame determines whether the first, second, or third nucleotide in the series 'heads' the first group of three. Options: 1, 2, 3.
- Sequence Length: This value is based on the selected nucleotide sequence.
- Nucleotide Start: This value is based on the nucleotide sequence and the translation frame.
- Nucleotide End: This value is based on the nucleotide sequence and the translation frame.
- Translated Sequence Length: This value is based on the nucleotide sequence and the translation frame.
- Protein Start: Specify the start location of the protein to be added to the registry.
- Protein End: Specify the end location of the protein to be added to the registry.
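To make the effect of the translation frame concrete, here is a minimal sketch of frame-based translation. It assumes the standard genetic code, drops any trailing nucleotides that do not fill a complete codon, and reports the translated span as 1-based inclusive positions. These are illustrative assumptions, not a statement of how the wizard computes its values; the function names `translate` and `translated_span` are hypothetical.

```python
from itertools import product

# Standard genetic code, listed in TCAG order for each codon position.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translated_span(length: int, frame: int):
    """Return (start, end) of the translated nucleotides, 1-based inclusive (assumed convention)."""
    if frame not in (1, 2, 3):
        raise ValueError("frame must be 1, 2, or 3")
    offset = frame - 1
    usable = max(0, (length - offset) // 3 * 3)  # whole codons only
    return frame, offset + usable


def translate(nucleotides: str, frame: int = 1) -> str:
    """Translate a DNA string starting at the first, second, or third nucleotide."""
    seq = nucleotides.upper()
    start, end = translated_span(len(seq), frame)
    # Unknown codons (for example, those containing N) become 'X'; '*' marks a stop codon.
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(start - 1, end, 3)
    )


for frame in (1, 2, 3):
    print(frame, translate("ATGGCCTAA", frame), translated_span(9, frame))
```

Each frame shifts the codon boundaries by one nucleotide, so the same nucleotide sequence yields three different candidate protein sequences.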
The annotations tab displays any matching annotations found in the annotation library. You can also add annotations manually at this point in the registration wizard.
- Name: a freeform name
- Type: for example, Leader, Variable, Tag, etc. Start typing to narrow the menu options.
- Category: 'Feature' or 'Region'
- Description: (Optional)
- Start and End Positions: 1-based offsets within the sequence
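Because annotation positions are 1-based, they differ from the 0-based indexing used by most programming languages. The sketch below shows the conversion, assuming the end position is inclusive (the text above states only that offsets are 1-based); `annotated_region` is a hypothetical helper name.

```python
def annotated_region(sequence: str, start: int, end: int) -> str:
    """Return the residues covered by a 1-based start and end position.

    Assumes the end position is inclusive.
    """
    if not 1 <= start <= end <= len(sequence):
        raise ValueError("positions must satisfy 1 <= start <= end <= sequence length")
    # Convert the 1-based inclusive range to a 0-based, end-exclusive slice.
    return sequence[start - 1:end]


print(annotated_region("MKTAYIAK", 1, 3))
```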
Editing is not allowed at this point, but you can edit annotations after the registration wizard is complete.
Suggested annotations can be "removed" by clicking the red icons in the grid panel. They can be added back using the green icon if you change your mind.
For complete details on using the annotation panel, see Protein Sequence Annotations.
to continue the wizard.
- Chain Format: select a chain format from the dropdown (start typing to filter the list of options). An administrator defines the set of options on the ChainFormats list. LabKey Biologics will attempt to classify the protein's chain format if possible.
- ε: The extinction coefficient
- Avg. Mass: The average mass
- Num. S-S: The number of disulfide bonds
- pI: The isoelectric point
- Num Cys.: The number of cysteine residues
Default or best guess values may prepopulate the wizard, but can be edited as needed.
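As a rough sanity check on prepopulated values, some of these properties can be estimated directly from the sequence. The sketch below uses a widely used 280 nm estimate (5500 per tryptophan, 1490 per tyrosine, and 125 per disulfide bond, in M^-1 cm^-1). This is an assumption for illustration; the source does not state which method the product uses to derive its defaults, and all function names here are hypothetical.

```python
def count_cysteines(sequence: str) -> int:
    """Number of cysteine residues (one-letter code C)."""
    return sequence.upper().count("C")


def max_disulfide_bonds(sequence: str) -> int:
    """Upper bound on disulfide bonds: each bond pairs two cysteines."""
    return count_cysteines(sequence) // 2


def extinction_coefficient_280(sequence: str, disulfide_bonds: int = 0) -> int:
    """Estimated molar extinction coefficient at 280 nm, in M^-1 cm^-1.

    Illustrative estimate only; not confirmed as the product's calculation.
    """
    seq = sequence.upper()
    return 5500 * seq.count("W") + 1490 * seq.count("Y") + 125 * disulfide_bonds


seq = "MWYCC"
print(count_cysteines(seq), max_disulfide_bonds(seq))
print(extinction_coefficient_280(seq, disulfide_bonds=1))
```

If an estimate differs substantially from the value shown in the wizard, that may indicate a different calculation method rather than an error.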
The Confirm panel provides a summary of the protein about to be added to the registry.
to add the protein to the registry.
Editing Protein Sequence Fields
Once you have defined a protein sequence, you can locate it in the list and reopen it to see the details. Some fields are eligible for editing. Those that are "in use" by the system or other entities cannot be changed. All edits are logged.