Study Datasets

The datasets in a study can be:

  • Demographic: One row per participant.
  • Clinical: One row per participant/timepoint pair.
  • Assay/Specimen: May have multiple rows per participant/timepoint.

One simple way to create a new dataset is to import an Excel file containing the data. Column names and types are inferred from the file, and you can adjust them as needed.

In this step, we will import Demographic and Clinical data into the study and learn more about timepoints.

Create a Demographic Dataset

Each study needs at least one demographic dataset identifying the participants in the study. Demographic data is information that does not change over the course of the study, such as birthplace or enrollment date.

  • Click the Manage tab.
  • Click Manage Datasets.
  • Click Create New Dataset.
  • On the Define Dataset screen:
    • Short Dataset Name: Enter "Demographics"
    • Leave the "Define Dataset Id Automatically" box checked.
    • Select the Import from File checkbox.
  • Click Next.
  • Click Choose File.
  • Browse to the sample directory you unzipped and select the file: [LabKeyDemoFiles]\Datasets\Demographics.xls.
  • You will see a preview of the imported dataset. Notice that the sample files we provide already have columns that are mapped to the required server columns "ParticipantId" and "Visit Date". When importing your own datasets, you may need to set these pulldowns explicitly, since they establish the dataset keys.
  • Review the field names and data types and click Import.
  • You will see this dataset:

You have created your first dataset and can see the ParticipantId and Date columns that will be used to integrate other information about these participants. Next, explicitly mark this dataset as demographic, signalling that there will be only one row for each participant in the study:

  • Click Manage in the link bar above the grid to manage this dataset.
  • Click Edit Definition.
  • Check the Demographic Data checkbox.
  • Click Save to return to the dataset definition.
  • Click View Data to return to the grid.

Explore Timepoints

Before adding any more data, let's review how timepoints work in this time-based study.

  • On the Manage tab, click Manage Timepoints.
  • Notice that, based on the study start date you entered (2008-04-01) and the timepoint duration of 28 days, the import of demographic data has created four timepoints.
  • Click (Edit) for the "Day 0-27" timepoint.
  • Edit the Label to read "Month 1" and review the other timepoint properties available.
  • Click Save and notice the label is now updated.
  • Repeat these steps, changing the other three timepoints to read "Month 2", "Month 3", and "Month 5". There was no demographic data collected during month 4, so that timepoint has not been created.
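
The way dates map to these timepoints can be sketched in a few lines of Python. This is only an illustration, assuming a timepoint is the 28-day window containing the number of days elapsed since the study start date. The function names `timepoint_index` and `default_label` are hypothetical and are not part of any LabKey API.

```python
from datetime import date

STUDY_START = date(2008, 4, 1)   # study start date used in this tutorial
DURATION_DAYS = 28               # timepoint duration used in this tutorial


def timepoint_index(visit_date, start=STUDY_START, duration=DURATION_DAYS):
    """Return the zero-based index of the window containing visit_date."""
    elapsed = (visit_date - start).days
    if elapsed < 0:
        # Choice made for this sketch only: reject dates before the start.
        raise ValueError("visit date precedes the start date")
    return elapsed // duration


def default_label(index, duration=DURATION_DAYS):
    """Build a default label such as 'Day 0-27' for a window index."""
    first = index * duration
    return f"Day {first}-{first + duration - 1}"


if __name__ == "__main__":
    visit = date(2008, 4, 29)    # 28 days after the study start
    print(default_label(timepoint_index(visit)))
```

Under these assumptions, a row dated within the first 28 days lands in the "Day 0-27" window, and a window with no rows is simply never created, which is why only four timepoints exist after the demographic import.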

If you return to the Overview tab and click Study Navigator, you can see a very rudimentary study schedule showing how many sets of demographic data were collected in each month of the study (so far).

Now let's add the data collected over the following years of study.

Import Clinical Datasets

Two other .xls files provided in the sample datasets folder contain clinical data. Each time a new test or exam was performed on the participant, a new row of data was generated. There will be multiple rows per participant, but only one row per participant and date combination.

  • Lab Results.xls
  • Physical Exam.xls

To import this data, repeat the following steps for each file:

  • Click the Manage tab.
  • Click Manage Datasets.
  • Click Create New Dataset.
  • On the Define Dataset screen:
    • Short Dataset Name: Enter the name of the file being imported (without the file extension and with spaces added between words).
    • Leave the "Define Dataset Id Automatically" box checked.
    • Select the checkbox Import from File.
  • Click Next.
  • Browse to the file, select it, and review the fields being imported. Even in the first 5 rows, you can see multiple rows for a single participant.
  • Notice that the ParticipantId and Visit Date column mappings are set as expected.
  • Click Import.

There is no need to change the dataset definition to reflect that these are clinical datasets; the column mapping during import was sufficient.
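
If you want to check your own files against the row rules before importing, the idea can be sketched as follows. This is a minimal illustration, not how the server validates data: the helper `duplicate_keys` and the example rows are hypothetical, and the rows are assumed to have already been read into dictionaries keyed by column name.

```python
from collections import Counter


def duplicate_keys(rows, key_fields):
    """Return the key tuples that occur more than once in rows."""
    counts = Counter(tuple(row[f] for f in key_fields) for row in rows)
    return [key for key, n in counts.items() if n > 1]


# Hypothetical rows, not taken from the sample files.
clinical_rows = [
    {"ParticipantId": "P1", "Date": "2008-04-02", "Weight": 70},
    {"ParticipantId": "P1", "Date": "2008-05-01", "Weight": 71},
    {"ParticipantId": "P2", "Date": "2008-04-02", "Weight": 65},
]

if __name__ == "__main__":
    # Clinical key: participant plus date.
    print(duplicate_keys(clinical_rows, ["ParticipantId", "Date"]))
    # Demographic key: participant only.
    print(duplicate_keys(clinical_rows, ["ParticipantId"]))
```

Rows like these satisfy the clinical rule (one row per participant and date) but not the demographic rule, because one participant appears more than once.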

Revisit Timepoints

The imported data has been assigned to timepoints based on the date column. Where no timepoint existed yet for a row's date, a new timepoint was created.

  • On the Manage tab, click Manage Timepoints.
  • Scroll to see the list of timepoints. In addition to the four you renamed above, many more have been created, based on the dates when clinical data was collected. In particular, notice that there is now data in month 4, but until you edit the label, the default "Day 84-111" is used.

There is no need to edit the labels for the new timepoints to complete this tutorial.

When you reopen the Study Navigator on the Overview tab, you can see all three datasets and how much data was collected in each timepoint.

Participant Specific Start Dates

The overall study start date you entered when you created the study is used by default. Data is assigned to a timepoint by computing the number of days between the start date and the date attached to each row. If you prefer to use an individual start date for each participant, include a "StartDate" column in your demographic dataset.

The example demographics dataset has such a column, so we can align data by the time elapsed since each participant enrolled. The fact that participants joined the study over a span of five months is then no longer relevant.

  • On the Manage tab, click Manage Timepoints.
  • Under Timepoints, click Recompute Timepoints.
  • You will see the number of rows that have been updated.
  • Click the Overview tab, then Study Navigator to see the realigned data.
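
The effect of recomputing with participant-specific start dates can be sketched as below. This is a hedged illustration rather than the server's implementation: the function `timepoint_for`, the participant IDs, and the dates are all hypothetical, and falling back to the study start date when a participant has no individual start date is an assumption of this sketch.

```python
from datetime import date

DURATION_DAYS = 28  # timepoint duration used in this tutorial


def timepoint_for(participant_id, visit_date, start_dates, study_start,
                  duration=DURATION_DAYS):
    """Return the zero-based window index for a visit.

    Days are counted from the participant's own start date when one is
    known, otherwise from the overall study start date (an assumption).
    """
    start = start_dates.get(participant_id, study_start)
    return (visit_date - start).days // duration


# Hypothetical values for illustration only.
study_start = date(2008, 4, 1)
start_dates = {"P1": date(2008, 4, 1), "P2": date(2008, 6, 10)}
visit = date(2008, 6, 15)

if __name__ == "__main__":
    for pid in ("P1", "P2"):
        print(pid, timepoint_for(pid, visit, start_dates, study_start))
```

Here the same calendar date falls into a later window for the participant who enrolled at the study start and into the first window for the participant who enrolled in June, which is the realignment you see in the Study Navigator.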
