Implement an QueryUpdateService

LabKey Support Forum (Inactive)
Implement an QueryUpdateService lewis j mcgibbney  2015-10-02 11:51
Status: Closed
 
Hi Folks,
Having spoken with Adam a few times on this scenario now I thought I would log an issue and begin working on a QueryUpdateService API for the remoteapi Java API.

Lets take the following scenario as a driver for understanding what this API should provide. Say we want to learn of both 1) new Studies added to LabKey by our users, and 2) updates to existing Studies e.g. field updates or something similar. Basically a a typical CRUD update which is implemented through QueryService.

Right now (with the current remoteapi) we have two options;

1) from time-to-time query everything (all containers containing data) and check for what has been updated. This most likely involves maintaining a signature of what the state was when we last queried. We can then create a new signature and do a diff on the two. A difference is signatures indicates that something has been updated. We then pull the new data and do something with it. This is not ideal as we need to query LabKey frequently!, or

2) develop a listener which binds to some existing LabKey service. This is our integration point and justification behind the implementation of an QueryUpdateService. Updates should be sucked out automatically hence removing the requirement to constantly query the services.

Finally, I have looked into the QueryChangeListener which kinda sounds like the same. It exposes the following API

void queryChanged(User user, Container container, ContainerFilter scope, SchemaKey schema, QueryChangeListener.QueryProperty property, java.util.Collection<QueryChangeListener.QueryPropertyChange> changes)
This method is called when a set of tables or queries are changed in the given container and schema.

void queryCreated(User user, Container container, ContainerFilter scope, SchemaKey schema, java.util.Collection<java.lang.String> queries)
This method is called when a set of tables or queries are created in the given container and schema.

void queryDeleted(User user, Container container, ContainerFilter scope, SchemaKey schema, java.util.Collection<java.lang.String> queries)
This method is called when a set of tables or queries are deleted from the given container and schema.

java.util.Collection<java.lang.String>     queryDependents(User user, Container container, ContainerFilter scope, SchemaKey schema, java.util.Collection<java.lang.String> queries)
Get a textual representation of items that depdend on a table or query.

The above API does not satisfy the requirement for one to receive updates as and when they come in to LabKey!

I am very keen to discuss this here.
 
 
lewis j mcgibbney responded:  2015-10-05 21:41
Is anyone against my attempting to implement the above?
 
Ben Bimber responded:  2015-10-06 07:59
Hi Lewis,

a couple questions first: i understand the requirement about learning if a study was created/destroyed. to your second requirement, are you trying to figure out if the set of tables changed (i.e. dataset added/deleted) or structure of tables changed (i.e. user adds/removes a field on a dataset), or are you trying to figure out if a change happened to the data, like a standard insert/update/delete event? i could imagine all of these being important, but going after the latter is different than the first ones.

-Ben
 
lewis j mcgibbney responded:  2015-10-06 08:46
Hi Ben,
In short

> to your second requirement, are you trying to figure out if the set of tables changed (i.e. dataset added/deleted) or structure of tables changed (i.e. user adds/removes a field on a dataset), or are you trying to figure out if a change happened to the data, like a standard insert/update/delete event?

The latter. We want to know when some standard CRUD event happens.
Ta
 
Ben Bimber responded:  2015-10-06 09:53
Hi Lewis,

i can think of a couple approaches:

1) There is DatasetListener in StudyManager. This is non-public right now, but appears to fire the events you want (i didnt walk all the way through the code), assuming you only care about datasets. Adding this hook to StudyService so external modules could register their own listeners wouldnt be very invasive. This would mean you'd need to make a LK module running on your server. It would register this listener, and do something on change events. This could mean you ping some external process, or some external process periodically pings your module.

2) if you're dealing specifically w/ studies, the modified field on study.datasets might accomplish #1. this appears to get updated whenever there is a insert/update/delete on the rows of that table. for example:

https://yourServer.com/query/executeQuery.view?query.queryName=study&schemaName=study&query.containerFilterName=AllFolders

will list all studies (note container filter). if you walk the rows (which has a column for container), you can query study.datasets from each container w/ a study. this table has a field for modified, and you would want any dataset modified after your last sync time. any new study or new dataset would inherently be modified after the last sync. if i'm right about modified, then this would work purely from the client API, without any java code or changes.

3) there is a feature in LK to register javascript trigger scripts. This lets you provide JS code that is called on CRUD events for a given table. We use this very heavily in our modules. This being said, I wish is had java hooks, letting modules more easily register new handlers. This wouldnt be hard to introduce, but is more work than #2.
 
jeckels responded:  2015-10-06 14:04
Hi Lewis,

As you've probably guessed, QueryChangeListener is used to fire notifications that a query/table has changed its structure or schema, but not when data rows are inserted/updated/deleted.

I think that Ben's thought about trigger scripts is a good direction to investigate. If you haven't already found the docs, they're here:

https://www.labkey.org/wiki/home/Documentation/page.view?name=serverSideValidation

Each script is associated by naming convention with a specific schema and query. You'd likely want a more generic registration mechanism so that you don't need to add one per dataset, and to be able to register in Java code. You can see where they get invoked in:

server\api\src\org\labkey\api\etl\TriggerDataBuilderHelper.java : BeforeIterator.next(), AfterIterator.next()

and

server\api\src\org\labkey\api\query\AbstractQueryUpdateService.java : deleteRows(), updateRows(), etc

Perhaps adding a new Java interface for listeners to implement, having a registration mechanism on QueryService, and hooking it into TableInfo.fireRowTrigger() and TableInfo.fireBatchTrigger() might be a good approach to pursue?

However, this assumes that you have code that registers itself and runs within the LabKey Server process itself. A custom implementation could, of course, fire off messages to an external process.

Alternatively, you could also use the Java API to monitor for new audit events that indicate that dataset rows have changed (see the auditLog schema's DatasetAuditEvent query for how these look). This could be done external to the server, but would require polling for changes instead of being actively notified.

Thanks,
Josh