Suggestions for Extensible Assays | Ben Bimber | 2009-08-29 13:48
Status: Closed
We are starting a project that will probably involve creating upwards of 30 custom assays for our organization. I spent this morning playing around making assays within simple modules. I'm very excited about this new functionality; I believe it is exactly what LabKey has been missing and should be a great way to handle this project. I realize that text-based assays are very new and probably still evolving; however, there are some tweaks that could make some pretty dramatic improvements. My apologies if any of this already exists and I missed it.

Each assay tends to have little quirks associated with it. The existing extensible assay framework does a good job of accommodating this. The ability to define custom HTML files to be used instead of the default wizard or grids is very useful. However, if for each assay we end up replacing all the defaults with custom HTML views, we lose much of the benefit of having an assay manager.

I have a few suggestions that might help further improve the extensibility of this framework. Clearly it is not possible to make an infinitely customizable import/viewing framework. I’ve tried to come up with tweaks to the behavior of the default upload/batch/run/results pages that permit these pages to be used as much as possible, without needing to replace them with custom HTML. I’m obviously biased toward the sort of data we plan to store, but I think a lot of these suggestions could be widely applicable. It is also quite possible that there are better solutions to the obstacles I identify than the ones I propose.

Import / Validation:

1. Rather than having a single upload wizard that is either used or replaced, you might want to break this into component steps. This way the user could create custom HTML to replace just the sample import or validation step while keeping the pre-defined behavior for every other step. Maybe there could also be the option of inserting a custom step between two pre-defined ones.

2.
On import, it would be useful if we could create an HTML view that is simply appended above or below the existing views for a given import step, rather than replacing the wizard. This view might present the user with additional information, based on their imported records, that helps guide decisions. Take ELISPOT: it might be useful to alert users if there are existing records with the same SubjectID/PeptideNumber as one of the samples being imported. That same paradigm applies to a lot of assays. In some cases this appended view might be as simple as custom HTML text with extra instructions unique to that assay.

3. It would be most useful if the validation/processing scripts could be defined field by field, rather than one per assay (maybe in combination with one per assay). For example, multiple assays might contain a field that holds a DNA sequence. This field gets the same validation/processing steps in every assay that imports DNA. We could reference the same script each time, rather than continually remaking it.

4. In the XML that defines an assay, we should be able to specify that a field is hidden at import (this applies to batch, run and result fields). This means that when the user downloads the Excel import sheet, columns for these fields do not exist. Such a field may or may not have a default value. These sorts of fields could include status flags or hold other information that is only added or used after a record is in the system.

5. Text-based assay / controlling allowable values for a field. There are many cases where we want to restrict the allowable values for a field. If we’re using text-based assays, we’d want some method to define the list of allowable values within the text-based assay design (or at least define a list and establish the foreign key). This way, when we export the assay, this information is included. It would be best if these allowable values were created as a list so they could be edited in the future.
However, if they are only editable by changing text or XML files within the assay design, that’s ok too.

6. There are cases when we need to retain the ability to define the value of a field uniquely for each sample, but the vast majority of the time this field is the same for all samples within that run. It would be nice if there was an option (that could be enabled/disabled for a field) allowing that field to be filled out alongside the run fields. If the user enters a value, this value would be assigned to each sample of that run. In this case, there would not be a column for this field when the user downloads the Excel import sheet. When creating a run, the user should have the option of selecting something like ‘I need to define this individually for each sample’, in which case the Excel sheet contains that column. Alternatively, maybe we have the ability to define a default value for a sample field when creating a run. If the column in the Excel sheet is left blank, the default value is used. If the user enters a value for that sample, then this value is used.

7. Single sample import. There can be instances when a user will want to import a single sample into an assay. Rather than forcing the user through the traditional upload wizard, ideally they could hit an ‘import single record’ button, which gives them a web form similar to ‘insert new’ for any other list. In this view, the batch, run and results fields are all shown, completed, then imported. They would end up creating a run with one sample without actually knowing it.

Batch, Run and Results Views:

1. ‘Replace default view with a query’. Once assay data is in the system, you can view grids of assay data by batch, run or results. You can define the default view for each of these grids, which is important because the view people want to see is often not the same as the underlying table. People will want to hide some fields, use foreign keys to pull in other values, etc.
An important piece that is currently missing is the ability to add calculated fields. For example, we might want to write an SQL expression that calculates a new value based on fields in that sample (the raw data may not reflect what people want to interact with). Or we might want to write an SQL expression that returns a human-friendly value like ‘positive’ or ‘negative’ based on calculations using fields in that record. A simple solution to this might be to allow the default view for any table to be replaced by an SQL query. Technically we can already create a query from a table; however, this gets kinda clunky. Rather than writing a new ‘batches.html’, ‘run.html’ and ‘results.html’ for each assay, it would be simplest if we could just define which query is used for that page.

2. ‘Actions’. There are any number of things that people might want to do to or with assay data once it is in the system, specific to that particular assay. We handle this in our existing system with an idea I took from Geospiza’s Finch. When looking at a grid of assay data, users can check one or more records, then pick from a pull-down menu which gives a list of actions (see attached screenshot ‘actions.jpg’). Each of these actions takes the user to a custom HTML page that does something to the checked records or provides some sort of custom visualization based on them. To give examples: for sequence data, I could write an HTML/JavaScript page that exports checked records as a FASTA file, instead of CSV. I might have a page that allows batch editing of a specific field. In the attached screenshot, I have an action called ‘mark removed’. ‘Removed’ is a field of that assay with a default value of 0. The response page simply changes this value on selected records to 1. I might have an assay in which I pick some records, then select the action ‘compare results’. The corresponding page performs some calculations on these records, then presents the user with a custom output.
This sort of framework creates a really flexible interface for working with assay data (or any list really). It seems that within a simple module there could be an assay/actions/ folder (or the actions could be defined using XML). Any HTML page within the folder would appear as an action for that assay. As above, I think this addition could save a whole lot of cases where users otherwise need to create unique ‘batches.html’, ‘run.html’ and ‘results.html’ files.

3. ‘Display as URL’. In pretty much any case where a field is a foreign key, that field should be a link to the corresponding table. This sort of behavior is a huge benefit when an organization has multiple assays or types of data in LabKey. For example, in the ELISPOT grid, the peptide field should automatically link to the peptides table, providing the user with quick access to details on that record. If this behavior is not going to be the default for batch/run/results grids, then we should have the option of enabling it during the assay design. Likewise, we might want to have a given field displayed as a link to some other URL. For example, if a field is a GenBank accession number, we might want it to link directly to GenBank. As above, it would be great if we could specify this in the XML that defines that assay.

General:

1. When defining an assay, we create XML files for batch, run and results fields. Each field has something like this:

    <exp:PropertyDescriptor>
      <exp:Name>TimePoint</exp:Name>
      <exp:Required>true</exp:Required>
      <exp:RangeURI>http://www.w3.org/2001/XMLSchema#dateTime</exp:RangeURI>
    </exp:PropertyDescriptor>

It would be great if a host of other options could be specified here (or in some other XML file). The extra properties could dictate behavior ranging from validation to display behavior in a grid view. Most of the things I suggest above could be properties defined in something like this.
I realize it is not possible to have infinite customization, but the advantage of building as many options as possible at this level is that we take advantage of professionally developed code and need to create/maintain as little redundant code as possible. I don’t know how complicated this is, but for things like display behavior, it would be even better if any query based on this assay could inherit the attributes of the fields it uses. For example, if we define that an assay field should be displayed as a URL in a grid, any query that uses this field will also display it as a URL unless this property is changed within that query’s XML. If this becomes complicated, as long as those same properties can be defined using XML, it should be ok.

General Suggestions For Assays/Lists:

1. For a view, it would be nice if we could specify the default page size.

2. We should be able to define whether a field is editable or read-only. Currently we can only specify whether the entire assay’s data can be edited or not. There are a lot of cases where we’d want to permit some values to be changed (assuming a user has permissions), but other fields should never be edited.

3. After CSV import, if a record fails validation, LabKey currently does not provide a lot of information. If I import a file with 50 rows and only 1 of these has an invalid date, it still just says ‘date must be of type DateTime’, without identifying the row. Especially as validation becomes more complicated, it would be far more helpful if LabKey displayed a grid with the contents of the problem row(s), indicating the cell(s) that failed and why. If the validation triggered warnings only (meaning the user can choose to import anyway), perhaps it makes sense to display the grid of problem rows, indicating the warnings, but to have a button allowing import to proceed.
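
To make two of the suggestions above more concrete (per-field validation under Import / Validation item 3, and row-level error reporting in the last item), here is a minimal JavaScript sketch. It is only an illustration of the idea, not LabKey code: the names `validateDnaSequence`, `fieldValidators` and `validateRows`, and the field name `Sequence`, are all made up for this example, and the rule "IUPAC nucleotide codes only" is an assumption about what a DNA field would accept.

```javascript
// Hypothetical per-field validator; none of these names are part of LabKey.
// Assumption: a DNA field accepts IUPAC nucleotide codes only.
const IUPAC_DNA = /^[ACGTRYSWKMBDHVN]+$/;

// Normalize the value (strip whitespace, upper-case) and check it.
function validateDnaSequence(rawValue) {
  const cleaned = String(rawValue ?? "").replace(/\s+/g, "").toUpperCase();
  if (cleaned.length === 0) {
    return { ok: false, value: cleaned, error: "sequence is empty" };
  }
  if (!IUPAC_DNA.test(cleaned)) {
    return { ok: false, value: cleaned, error: "sequence contains non-IUPAC characters" };
  }
  return { ok: true, value: cleaned, error: null };
}

// Registry keyed by field name, so several assays can share one validator.
const fieldValidators = { Sequence: validateDnaSequence };

// Validate every row and report the row number, field and reason per failure.
function validateRows(rows) {
  const problems = [];
  rows.forEach((row, index) => {
    for (const [field, validate] of Object.entries(fieldValidators)) {
      if (!(field in row)) continue; // this row has no such field
      const result = validate(row[field]);
      if (result.ok) {
        row[field] = result.value; // keep the normalized value
      } else {
        problems.push({ row: index + 1, field, error: result.error });
      }
    }
  });
  return problems;
}
```

Calling `validateRows` on the imported rows returns one entry per failing cell, which is the information a "grid of problem rows" would need. Any assay with a `Sequence` field could reuse the same validator instead of carrying its own copy.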
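
As an illustration of the ‘Actions’ idea, the FASTA export mentioned above could be little more than the following function sitting behind an action page. This is a sketch under stated assumptions: the function name `toFasta` and the record shape `{ name, sequence }` are hypothetical, and how the checked records reach the page is deliberately left out because it depends on the grid.

```javascript
// Hypothetical helper for a 'export as FASTA' action page.
// Assumption: each checked record has a `name` and a `sequence` field.
function toFasta(records, lineWidth = 60) {
  if (records.length === 0) return "";
  const entries = records.map(({ name, sequence }) => {
    const lines = [];
    // Wrap the sequence at a fixed width, as FASTA files conventionally do.
    for (let i = 0; i < sequence.length; i += lineWidth) {
      lines.push(sequence.slice(i, i + lineWidth));
    }
    return [`>${name}`, ...lines].join("\n");
  });
  return entries.join("\n") + "\n";
}
```

For example, `toFasta([{ name: "s1", sequence: "ACGTAC" }], 4)` gives a header line `>s1` followed by `ACGT` and `AC` on separate lines.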