Step 3: Integrate Data from Different Sources

A common challenge for researchers is combining their clinical data with their laboratory data. In this step we show how LabKey Server handles this data integration problem: we will combine two heterogeneous datasets into one grid and then build visualizations based on the combined result.

Examine the Two Source Datasets

  • Click the Clinical and Assay Data tab.
  • Open these two datasets in separate browser tabs so you can compare them side by side. Right-click each dataset below and select "Open in New Tab":
    • Physical Exam - This dataset captures the vital signs of the study participants: blood pressure, pulse and respiration rates, etc.
    • Lab Results - This dataset captures the laboratory work done on the blood samples provided by the participants: lymphocyte levels, HIV counts, etc.

Together, these two tables should give a comprehensive picture of the participants' hematological health, and there may be relationships that can only be detected in the combined data. For example, is there a relationship between the blood pressure data (in the Physical Exam dataset) and the lymphocyte levels (in the Lab Results dataset)? To answer questions like this, we need all of the data in one place. How do we combine these two tables so that we can see all of the information in one grid?

Create a Combined Grid

Here we create a joined grid view combining all of the blood-related data:

  • In the Physical Exam dataset, select Grid Views > Customize Grid.
  • In the Available Fields pane, scroll down and click the + symbol next to DataSets.
  • Scroll down and click the + symbol next to Lab Results.
  • Place checkmarks next to all of these blood-related fields: CD4+, Lymphs, Hemoglobin, Viral Load.
  • Scroll back up and remove checkmarks next to: Clinician Signature/Date, Pregnancy, Form Language (to remove clutter).
  • Click Save, select Named and name the grid "Hema/Cardio Data".
  • Check the box to Make this grid available to all users.
  • Click Save.

Create a Visualization

We now have an integrated grid view of all of the participants' hematological data, which you can see by selecting Grid Views > Hema/Cardio Data. Values from the Lab Results dataset are added to a Physical Exam row only when a Lab Results row exists for the same participant and date combination. Now we can start exploring this combined data to see if there are any relationships to be discovered.
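
The matching rule described above (lab values attached to an exam row only when the participant and date both match) behaves like a left join. The following is a minimal sketch of that idea in plain Python, not a description of how LabKey Server implements it. The rows, the column names (ParticipantId, Date, SystolicBP, Lymphs), and the combine function are all hypothetical and chosen only for illustration.

```python
# Hypothetical rows; column names and values are illustrative only.
physical_exam = [
    {"ParticipantId": "P1", "Date": "2008-01-01", "SystolicBP": 120},
    {"ParticipantId": "P1", "Date": "2008-02-01", "SystolicBP": 118},
    {"ParticipantId": "P2", "Date": "2008-01-01", "SystolicBP": 135},
]
lab_results = [
    {"ParticipantId": "P1", "Date": "2008-01-01", "Lymphs": 1500},
    {"ParticipantId": "P2", "Date": "2008-01-01", "Lymphs": 1800},
]


def combine(base_rows, extra_rows, extra_fields):
    """Left-join extra_rows onto base_rows by (ParticipantId, Date)."""
    # Index the extra rows by their participant/date key for fast lookup.
    lookup = {(r["ParticipantId"], r["Date"]): r for r in extra_rows}
    combined = []
    for row in base_rows:
        match = lookup.get((row["ParticipantId"], row["Date"]), {})
        merged = dict(row)  # every base row is kept, matched or not
        for field in extra_fields:
            merged[field] = match.get(field)  # None when nothing matches
        combined.append(merged)
    return combined


combined = combine(physical_exam, lab_results, ["Lymphs"])
for row in combined:
    print(row)
```

In this sketch every Physical Exam row survives, and the second row for P1 carries an empty Lymphs value because no lab work was recorded on that date. This mirrors the blank cells you may see in the combined grid.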

First let's create a scatter plot to see if there is a relationship between the lymphocyte levels and the blood pressure levels.

  • If necessary, return to the Physical Exam dataset and select Grid Views > Hema/Cardio Data.
  • Select Charts > Create Chart to open the plot creation dialog.
  • Click Scatter on the left.
  • Drag and drop the Systolic Blood Pressure column as the X Axis.
  • Drag and drop the Lymphs (cells/mm3) column as the Y Axis.
  • Click Apply.
  • The scatter plot is displayed. (A quick visual check suggests there is no relationship, at least not in this data.)
  • Save the plot with the name of your choice.
  • Experiment with the chart tools to see if you can discover any relationship within the data.
    • Chart Type lets you add grouping or point shaping by other columns, such as cohort or demographic columns.
      • For instance, try dragging the "Treatment Group" column to the "Color" field.
    • Chart Layout provides more options for changing the chart title, size, coloring, etc.
  • When you are done experimenting, you can save your changes to the existing plot or save them as a new copy.
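
A visual check of a scatter plot can be supplemented with a number. As a rough sketch, the Pearson correlation coefficient below summarizes how strongly two columns move together: values near 0 suggest no linear relationship, and values near 1 or -1 suggest a strong one. This is an assumption about how you might analyze exported data yourself, not a feature of the chart dialog, and the blood pressure and lymphocyte values are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (covariance numerator).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable.
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical values standing in for two columns of the combined grid.
systolic = [110, 120, 130, 140, 150]
lymphs = [1500, 1900, 1400, 2000, 1600]

r = pearson_r(systolic, lymphs)
print(round(r, 3))
```

Rows with a missing value in either column would need to be dropped before calling pearson_r, since the function assumes every pair is complete.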
