This topic shows how to register a new nucleotide sequence using the graphical user interface. To register using the API, or to bulk import sequences from an Excel spreadsheet, see Use the Registry API.
Nucleotide Sequence Validation
For nucleotide sequences, we allow DNA and RNA bases (ACTGU) as well as the IUPAC degenerate bases (WSMKRYBDHVNZ). On import, whitespace is removed from a nucleotide sequence. If the sequence contains any other letters or symbols, an error is raised.
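The nucleotide rule above can be pictured with a short sketch. It is illustrative only and not the registry's own code: the function name `clean_nucleotide_sequence` is hypothetical, and uppercasing the input before checking is an assumption, since case handling is not described here.

```python
import re

# Allowed letters as listed above: DNA/RNA bases plus the degenerate codes.
ALLOWED_NUCLEOTIDES = set("ACTGU") | set("WSMKRYBDHVNZ")


def clean_nucleotide_sequence(raw: str) -> str:
    """Strip whitespace, then reject any character outside the allowed set.

    Hypothetical helper; uppercasing is an assumption, not documented behavior.
    """
    seq = re.sub(r"\s+", "", raw).upper()
    bad = sorted({ch for ch in seq if ch not in ALLOWED_NUCLEOTIDES})
    if bad:
        raise ValueError(f"Invalid nucleotide characters: {''.join(bad)}")
    return seq


if __name__ == "__main__":
    print(clean_nucleotide_sequence("ACG TTN\nAAR"))  # ACGTTNAAR
```

Calling it with a sequence such as "ACG*" would raise a `ValueError`, mirroring the import error described above.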
For protein sequences, we only allow standard amino acid letters and zero or more trailing stop codons ('*'). On import, whitespace is removed from a protein sequence. If the sequence contains stop codons in the middle of the sequence, or any other letters or symbols, an error is raised.
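A comparable sketch for the protein rule follows. It is a rough illustration rather than the system's implementation: the name `clean_protein_sequence` is hypothetical, and the 20-letter alphabet is an assumption, because the accepted letters are not enumerated here and the real system may accept more (translation can emit 'X', for instance).

```python
import re

# Assumption: "standard amino acid letters" means the 20 canonical one-letter codes.
STANDARD_AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Any number of amino acid letters, followed by zero or more trailing '*'.
PROTEIN_PATTERN = re.compile(rf"[{STANDARD_AMINO_ACIDS}]*\**")


def clean_protein_sequence(raw: str) -> str:
    """Strip whitespace, then require amino acid letters with only trailing stops."""
    seq = re.sub(r"\s+", "", raw)
    if not PROTEIN_PATTERN.fullmatch(seq):
        raise ValueError("Invalid protein sequence: unexpected symbol or internal stop codon")
    return seq


if __name__ == "__main__":
    print(clean_protein_sequence("MKT LLV*\n"))  # MKTLLV*
```

A sequence such as "MK*T" fails the pattern because the stop codon is not at the end.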
When translating a nucleotide codon to a protein sequence, where the codon contains one or more degenerate bases, the system attempts to find a single amino acid that every possible nucleotide combination for that codon maps to. If a single amino acid is found, it will be used in the translated protein. If not, the codon will be translated as an 'X'.
For example, the nucleotide sequence 'AAW' is ambiguous since it could map to either 'AAA' or 'AAT' (representing Lysine and Asparagine respectively), so 'AAW' will be translated as an 'X'. However, 'AAR' maps to either 'AAA' or 'AAG', which are both translated to Lysine, so it will be translated as a 'K'.
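The degenerate-codon rule can be sketched as follows. This is a minimal illustration under stated assumptions, not the registry's actual translation code: `translate_codon` is a hypothetical name, the standard genetic code is assumed, 'U' is treated as 'T', and 'Z' (whose expansion is not described here) is treated as unresolvable and yields 'X'.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# IUPAC expansions; 'U' maps to 'T'. 'Z' is deliberately absent (assumption).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T", "U": "T",
    "W": "AT", "S": "CG", "M": "AC", "K": "GT", "R": "AG", "Y": "CT",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def translate_codon(codon: str) -> str:
    """Return the single residue shared by every expansion of the codon, else 'X'."""
    try:
        options = [IUPAC[base] for base in codon.upper()]
    except KeyError:
        return "X"  # unknown letter such as 'Z' in this sketch
    if len(options) != 3:
        raise ValueError("A codon must have exactly three bases")
    residues = {CODON_TABLE["".join(c)] for c in product(*options)}
    return residues.pop() if len(residues) == 1 else "X"


if __name__ == "__main__":
    print(translate_codon("AAW"))  # X
    print(translate_codon("AAR"))  # K
```

Expanding every combination and collecting the results in a set keeps the check simple: a set of size one means all expansions agree on the same residue.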
Register a Nucleotide Sequence
To add a new nucleotide sequence to the registry:
From the header bar, select Registry > Nucleotide Sequences.
On the Nucleotide Sequences page, select Create > Nucleotide Sequence.
Nucleotide Sequence Wizard
The wizard has two tabs:
- On the Register a new Nucleotide Sequence page, in the Details panel, enter the following:
- Description: (Optional) A text description of the sequence.
- Alias: (Optional) Alternative names for the sequence. Type a name and press Enter when complete. Continue to add more as needed.
- Nucleotide Sequence Parents: (Optional) Parent components: related sequences that the new sequence is derived from, for example, by mutation. You can select more than one parent. Start typing to narrow the pulldown menu of options.
- Sequence: (Required) The nucleotide sequence.
- Annotations: (Optional) A comma-separated list of annotation information:
- Name - a freeform name
- Category - region or feature
- Type - for example, Leader, Variable, or Tag.
- Start and End Positions are 1-based offsets within the sequence.
- Click Next.
- To register the nucleotide and register the corresponding protein, click Finish and translate protein. This option will take you to the registry wizard for a new protein, prepopulating it with the protein sequence based on the nucleotide.
- To register the nucleotide and finish, click Finish.