Accurate and consistent user entry is important to assay data, especially when it includes manual input of key metadata. For example, if an operator fails to enter a needed instrument setting, and later someone else wants to recreate an interesting result, it can be impossible to determine how that result was actually obtained. If an instrument's brand name is entered where a serial number is expected, results from different machines can be erroneously grouped as if they came from a single machine. If one machine is found to be faulty, you may be forced to throw out all data if you haven't accurately tracked where each run was done.
This topic demonstrates a few of the options available for data validation during upload: required fields, regular expression checks, and range checks.
Set Up Validation
Here we add some validation to our GenericAssay design by modifying it. Remember that the assay design is like a map describing how to import and store data. When we change the map, any run data imported using the old design may no longer pass validation.
Open the design for editing:
- Navigate to the Assay Tutorial folder.
- In the Assay List web part, click GenericAssay.
- Select Manage Assay Design > Edit assay design.
Note that if you didn't specify the current (tutorial) subfolder when you defined this assay, you will see a pop-up dialog: "This assay is defined in the <PROJECT_NAME> folder. Would you still like to edit it?" If you are the only user of this assay in this project, click OK to continue to the Assay Designer. Otherwise, copy the assay design to the current tutorial folder and edit your copy.
By default, any new field you add to an assay design is optional. If you wish, you can make one or more fields required, so that if an operator skips an entry, the upload fails.
- Click the header for the Run Fields section.
- For the InstrumentSetting field, check the Required checkbox.
- Click Finish.
- If you get the message 'The property "instrumentSetting" cannot be required when it contains rows with blank values', assay data has already been imported using this design without an instrument setting. You will need to delete the offending assay runs before you can set the field as required.
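The blank-value condition described above can be pictured with a short Python sketch. This is only an illustration of the idea, not LabKey's implementation or API: the `runs_with_blank` helper, the dictionary layout, and the run names are all hypothetical.

```python
from typing import Any, Dict, List


def runs_with_blank(runs: List[Dict[str, Any]], field: str) -> List[str]:
    """Return the names of runs whose value for `field` is missing or blank."""
    blank = []
    for run in runs:
        value = run.get(field)
        # Treat None and empty/whitespace-only strings as blank.
        if value is None or str(value).strip() == "":
            blank.append(run["Name"])
    return blank


# Hypothetical, already-imported runs (names and values are made up).
runs = [
    {"Name": "Run1", "InstrumentSetting": 23},
    {"Name": "Run2", "InstrumentSetting": None},
    {"Name": "Run3", "InstrumentSetting": ""},
]

# Any run listed here would block marking the field as required.
print(runs_with_blank(runs, "InstrumentSetting"))
```

In this sketch, Run2 and Run3 are the "offending" runs that would have to be deleted before the field could be made required.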
Using a regular expression (RegEx) to check entered text is a flexible form of validation. You could compare text to an expected pattern or, as in this example, check that special characters like angle brackets are not included in an email address (as could happen when an address is cut and pasted from a contact list).
- Reopen Manage Assay Design > Edit assay design.
- Select the OperatorEmail field in the "Batch Fields" section. Expand the section by clicking within the field name or using the (plus) icon on the right.
- Under Conditional Formatting and Validation Options, click Add Regex.
- Enter the following parameters:
- Regular Expression: .*[<>].*
- Note that regex patterns are matched against the entire string submitted, so enclosing the characters to look for in [ ] (square brackets) and adding ".*" on either side will find the pattern within any longer string.
- Description: Ensure no angle brackets.
- Error Message: An email address cannot contain the "<" or ">" characters.
- Check the box for Fail validation when pattern matches field value. Otherwise, you would be requiring that emails contain the offending characters.
- Name: BracketCheck
- Click Apply.
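To see why the ".*" wrappers matter, here is a minimal Python sketch of whole-string matching combined with the "fail when pattern matches" option. It uses Python's standard `re` module as a stand-in and is not how the server itself evaluates validators; the function name `check_operator_email` is hypothetical.

```python
import re
from typing import Optional

# Pattern from the steps above: any string containing "<" or ">".
BRACKET_PATTERN = re.compile(r".*[<>].*")


def check_operator_email(value: str) -> Optional[str]:
    """Return an error message if validation fails, otherwise None."""
    # fullmatch() requires the pattern to cover the entire string,
    # mirroring whole-string matching. A match means validation FAILS.
    if BRACKET_PATTERN.fullmatch(value):
        return 'An email address cannot contain the "<" or ">" characters.'
    return None


pasted = "John Doe <email@example.com>"
print(check_operator_email(pasted))               # error message
print(check_operator_email("email@example.com"))  # None

# Without the ".*" wrappers, the bare character set cannot match the
# whole string, so the bad value would slip through.
print(re.fullmatch(r"[<>]", pasted))              # None
```

The last line shows the failure mode the note warns about: a bare `[<>]` only matches a string that is exactly one angle bracket.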
For more information on regular expressions, see:
By checking that a given numeric value falls within a given range, you can catch some bad runs at the very beginning of the import process.
- Click Results Fields to open the section.
- Select the M3 field. Click the (expansion icon) on the right to open the settings panel.
- Under Conditional Formatting and Validation Options, click Add Range.
- Enter the following parameters:
- First Condition: Select Is Greater Than or Equal To: 5
- Second Condition: Select Is Less Than or Equal To: 100
- Error Message: Valid M3 values are between 5 and 100.
- Name: M3ValidRange
- Click Apply.
- Click Save when finished editing the assay design.
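The range validator configured above amounts to an inclusive bounds check. The following Python sketch illustrates that logic under the assumption that both conditions must hold; the `check_m3` function and constant names are hypothetical, and the message format is modeled on the error text shown later in this topic.

```python
from typing import Optional

# Bounds taken from the M3ValidRange settings above (both inclusive).
M3_MIN = 5
M3_MAX = 100


def check_m3(value: float) -> Optional[str]:
    """Return an error message if value is outside [5, 100], else None."""
    if M3_MIN <= value <= M3_MAX:
        return None
    return (
        f"Value '{value}' for field 'M3' is invalid. "
        "Valid M3 values are between 5 and 100."
    )


for candidate in (4.8, 5, 57.3, 100, 100.1):
    print(candidate, "->", check_m3(candidate))
```

Because both conditions use "or equal to", the boundary values 5 and 100 pass, while 4.8 and 100.1 are rejected.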
Observe Validation in Action
To see how data validation would screen for these issues, we'll intentionally upload some "bad" data which will fail the validation steps we just added.
- Click Assay Tutorial to return to the main folder page.
- In the Files web part, select the file /Assays/Generic/GenericAssay_BadData.xls.
- Click Import Data.
- Select Use GenericAssay and click Import.
- Paste in "John Doe <email@example.com>" as the OperatorEmail. Leave other entries at their defaults, saved from our prior imports.
- Click Next.
- Observe the red error message: "Value 'John Doe <email@example.com>' for field 'OperatorEmail' is invalid. An email address cannot contain the "<" or ">" characters."
- Correct the email address entry to read only "email@example.com" as before.
- Click Next again and you will proceed, no longer seeing the error.
- Enter an Assay ID for the run, such as "BadRun".
- Delete the InstrumentSetting value which was autofilled based on your prior upload.
- Click Save and Finish.
The sequence in which validators are run does not necessarily match their order in the design.
- Observe the red error text: "Instrument Setting is required and must be of type Integer."
- Enter a value and click Save and Finish again.
- Observe error message: "Value '4.8' for field 'M3' is invalid. Valid M3 values are between 5 and 100." The invalid M3 value is included in the spreadsheet being imported, so the only way to clear this particular error would be to edit/save/reimport the spreadsheet.
There is no actual need to import bad data now that we have seen how it works, so cancel the import or simply click the Assay Tutorial link to return to the home page.