inconsistencies in dataclass lineage

General Server Forum (Inactive)
camillesultana  2017-11-02 12:00
Status: Closed
 
Hi!

I recently reported (2017-11-01 18:22) what appeared to be a bug with regard to dataclass lineage. Since then I've discovered other issues which look like inconsistencies to me. It could also be that I don't have a good understanding of how dataclasses are meant to behave. Either way, I would definitely appreciate some advice.

Issue 1. Accessing detailed information regarding Runs
I am trying to query information regarding the inputs and outputs of runs. When I customize the grid view for a specific dataclass, navigate to Inputs/Runs/All, and try to expand any of the fields using the little plus sign next to it, I get an error. For example, when I try to expand Inputs/Runs/All/DataOutputs I get the error "The column 'Inputs/Runs/All/DataOutputs' is not a foreign key!". I seem to get this "foreign key" error whenever I try to expand any "4th level" field under inputs or outputs (i.e. anything of the form inputsORoutputs/x/x/x doesn't expand, even if it has the + sign next to it). I also examined these same fields in the schema browser. Once again, for any of these 4th level fields with a + sign next to them, the "Description" indicates that the field is a lookup or identifier which should link to other fields/tables. However, when I click on the +, the table expands as if it is trying to load the field information, but nothing happens and I just see "loading...".

Issue 2. Lineage inconsistent when updating existing records
I have created a very simple dataclass called testlink. When I update the lineage for existing records there is inconsistent behavior between information displayed in the datagrid view and the information displayed in the single record details view.

For example, I initialize three records, where test41 has the input (parent) test40 and the output (child) test42 (image1). The parent information in the single record view and in the datagrid (using the Inputs/Data/testLink field) is consistent and correct. Note that I'm not discussing Outputs/Data/testLink in the datagrid view, as none of the correct information seems to make it into that field at any point, as I previously reported (2017-11-01 18:22).

If I create a new record that lists an existing record as its parent, and that existing record already has child data, the new and existing children are both listed under "child" for the existing record, and the new record has the appropriate parent. However, if I create a new record that lists an existing record (which already has a parent) as a child, then a discrepancy arises: the existing record lists only the new record as its parent in the single record details view, but the datagrid shows both the previous and the new record as parents of the existing record. In addition, the previous parent still lists the existing record as a child in the single record view.

For the example I've given, test41 has test40 as a parent. I then created a new record, test43, which lists test41 as a child (dataOutput). Now test41 shows test40 and test43 as parents (inputs) in the datagrid view (image2) but only test43 in the single record view (image3). However, test40 still lists test41 as a child in the single record view (image4). Ideally I would like all previous lineage relationships to be maintained and not replaced unless I explicitly replace them (which is what the datagrid view seems to be doing). But either way, it seems like the datagrid view and the single record view should provide the same information.
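
To make the discrepancy concrete, here is a minimal Python sketch of the two behaviors. It is only a model of what the views display, not LabKey code: the functions `add_parent_keep` and `add_parent_replace` are hypothetical names, and lineage is assumed to be a plain set of (parent, child) pairs.

```python
# Hypothetical model of the lineage edits described above; not LabKey code.
# Each edge is a (parent, child) pair.

def parents_of(edges, record):
    """Return the sorted parents (inputs) of a record."""
    return sorted(p for p, c in edges if c == record)

def children_of(edges, record):
    """Return the sorted children (outputs) of a record."""
    return sorted(c for p, c in edges if p == record)

def add_parent_keep(edges, parent, child):
    """Add a parent and keep every existing relationship."""
    return edges | {(parent, child)}

def add_parent_replace(edges, parent, child):
    """Add a parent and drop the child's previous parents."""
    kept = {(p, c) for p, c in edges if c != child}
    return kept | {(parent, child)}

# Initial state: test40 -> test41 -> test42
initial = {("test40", "test41"), ("test41", "test42")}

# test43 is created with test41 listed as its child (dataOutput).
kept = add_parent_keep(initial, "test43", "test41")
replaced = add_parent_replace(initial, "test43", "test41")

print(parents_of(kept, "test41"))       # what the datagrid shows (image2)
print(parents_of(replaced, "test41"))   # what the single record view shows (image3)
print(children_of(kept, "test40"))      # test40's single record view (image4)
print(children_of(replaced, "test40"))  # what "replace" would imply for test40
```

In this model, test41's single record view matches the "replace" behavior while test40's single record view matches the "keep" behavior, which is the inconsistency I am describing.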

Would definitely like to know if these are indeed bugs, or if my understanding of how dataclasses should perform is flawed.

Thanks!
Camille
 
 
Jon (LabKey DevOps) responded:  2017-11-06 01:05
Hi Camille,

As requested in https://www.labkey.org/home/Support/Support%20Forum/announcements-thread.view?rowId=16457, which is focused on Issue #2, please provide a step-by-step on how you're constructing your dataclasses and any other actions you're taking.

Is the same error you're seeing in Issue #1 coming from the same dataclass setup?

With regard to the lookups in the grid, it is possible for there to be issues since expanding those "+" symbols essentially does a JOIN, so going down four levels would be creating multiple JOINs across several tables.
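
As a rough illustration of why each level adds a JOIN, here is a small Python sketch using an in-memory SQLite database. The table and column names (`data`, `run`, `data_input`, `data_output`) are made up for the example and are not LabKey's actual schema; the point is only that reaching a name four levels away from a record takes four JOINs.

```python
import sqlite3

# Hypothetical tables for illustration only; not LabKey's real schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE data (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE run (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE data_input (run_id INTEGER, data_id INTEGER);
    CREATE TABLE data_output (run_id INTEGER, data_id INTEGER);

    INSERT INTO data VALUES (1, 'test40'), (2, 'test41'), (3, 'test42');
    INSERT INTO run VALUES (1, 'run1'), (2, 'run2');
    -- run1: test40 -> test41, run2: test41 -> test42
    INSERT INTO data_input VALUES (1, 1), (2, 2);
    INSERT INTO data_output VALUES (1, 2), (2, 3);
""")

def outputs_via_runs(connection):
    """Follow record -> input link -> run -> output link -> output record."""
    query = """
        SELECT d.name, r.name, o.name
        FROM data d
        JOIN data_input din ON din.data_id = d.id      -- level 1
        JOIN run r ON r.id = din.run_id                -- level 2
        JOIN data_output dout ON dout.run_id = r.id    -- level 3
        JOIN data o ON o.id = dout.data_id             -- level 4
        ORDER BY d.name
    """
    return connection.execute(query).fetchall()

rows = outputs_via_runs(conn)
for record, run_name, output in rows:
    print(record, run_name, output)
```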

It is not typical to have to grab a field that is nested four levels deep as you described. Is there any specific reason why you need that information to be pulled into the grid?

Regards,

Jon
 
camillesultana responded:  2017-11-06 14:39
Hi Jon,

Ok doke. I have detailed below the specific steps to generate this issue. However, this inconsistency is coming up for any dataclass I generate. I am trying to pull this information into a grid because I want to generate a query that easily shows the entire "run" history for a record, including any runs where it is an "input" and any runs where it is an "output". This is meant to be part of an activity log for a single dataclass record, so I can easily view any specific changes that have been made to the lineage, along with all relevant metadata (who, when, etc). After playing around some more, it seems like I can get at this more directly. If I go to the schema browser and then exp --> built-in --> runs, I can expand the Input, Output, Data Input, and Data Output fields/lookups and the variety of other lookups at this level. So it looks like I have it figured out for my own needs, but it still seems like this info should be accessible the way I describe below.
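
For what it's worth, the kind of activity log I have in mind can be sketched in a few lines of Python. This is only a sketch under assumptions: the run rows below are made-up sample data, and the keys (`name`, `created_by`, `data_inputs`, `data_outputs`) are hypothetical stand-ins for whatever the runs query actually returns.

```python
def run_history(runs, record):
    """List every run that touches `record`, tagged with the record's role."""
    history = []
    for run in runs:
        if record in run["data_inputs"]:
            history.append((run["name"], "input", run["created_by"]))
        if record in run["data_outputs"]:
            history.append((run["name"], "output", run["created_by"]))
    return history

# Made-up sample rows; the key names are assumptions, not LabKey column names.
runs = [
    {"name": "run1", "created_by": "user_a",
     "data_inputs": ["test40"], "data_outputs": ["test41"]},
    {"name": "run2", "created_by": "user_a",
     "data_inputs": ["test41"], "data_outputs": ["test42"]},
]

for entry in run_history(runs, "test41"):
    print(entry)
```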

1. I have under the home folder, a folder named "PratherData" with two tabs that I have set up (FindInfo and AddEditInfo).
2. When I go to AddEditInfo tab and the DataClass webpart, I click "insert new row".
3. I then give a name and description for the DataClass and then click create.
4. On the next screen I hit save, without adding additional fields. Don't need them to test the lineage.
5. Then on the dataclass page on the datagrid Insert --> Import Bulk Data.
6. I then create an Excel file with three new records in it (new1, new2, and new3), where new2's parent is new1 and its child is new3.
7. I write this file to a tab delimited text file and copy the text in the file to the "copy/paste text" field.
8. I click on new2 and examine the run information at the bottom. I've modified the run datagrid view to include DataInputs--> Name (renamed as DataInputName) and DataOutputs-->Name (renamed as DataOutputName). The correct information is shown here.
9. If I go back to the datagrid view for the dataclass I am unable to add this information. I click "show hidden fields" and then follow Inputs--> Runs --> All. At this point I try to expand (+ sign) the "DataOutputs" and "DataInputs" fields/lookups but get the error "The column 'Inputs/Runs/All/DataOutputs' is not a foreign key!".
10. I am able to check the box next to DataOutputs and DataInputs, but the information provided is some sort of rowID and therefore not readily helpful. In addition, the values displayed for DataOutputs and DataInputs appear to be incorrect.
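
For reference, the tab-delimited text from steps 6 and 7 can be produced roughly as in the Python sketch below. The column headers `DataInputs/testlink` and `DataOutputs/testlink` and the dataclass name `testlink` are assumptions for illustration; use whatever parent/child column names your import actually expects.

```python
import csv
import io

# Header names are assumptions; "testlink" stands in for the dataclass name.
FIELDS = ["Name", "DataInputs/testlink", "DataOutputs/testlink"]

def build_tsv(rows):
    """Serialize the records as tab-delimited text for the copy/paste field."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS,
                            delimiter="\t", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

# new2's parent is new1 and its child is new3; new1 and new3 list nothing.
rows = [
    {"Name": "new1", "DataInputs/testlink": "", "DataOutputs/testlink": ""},
    {"Name": "new2", "DataInputs/testlink": "new1", "DataOutputs/testlink": "new3"},
    {"Name": "new3", "DataInputs/testlink": "", "DataOutputs/testlink": ""},
]

tsv_text = build_tsv(rows)
print(tsv_text)
```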

Once again I'm running Windows 7, Chrome, and LabKey 17.2.

Hopefully this helps!
Camille
 
Jon (LabKey DevOps) responded:  2017-11-10 20:02
Hi Camille,

I've followed up with you on question #2 in the other forum post:

https://www.labkey.org/home/Support/Support%20Forum/announcements-thread.view?rowId=16457

Regarding the first question, is there any reason why you need to drop down to such a low-level nested column? Going four levels deep is not something typically done, and each level you go down is essentially a query within a query, so you're going four queries deep.

Regards,

Jon