Expression Matrix Assay Tutorial

The expression matrix assay ties expression-level information to sample and feature/probe information. After appropriate files are loaded into the system, users can explore microarray results by building queries and visualizations based on feature/probe properties (such as genes) and sample properties.

Expression data may be manually extracted from Gene Expression Omnibus (GEO), transformed, and imported into LabKey Server. For details see Loading Public Protein Annotation Files.

Files loaded include:

  • Metadata about features/probes (typically at the plate level)
  • Sample information
  • Actual expression data (often called a "series matrix" file)

Enable the Expression Matrix Module

The Expression Matrix assay is part of the microarray module.

Review File Formats

To use the assay, you need three sets of data: a run file, a sample set, and a feature annotation file.

The run file has one column of probe ids (ID_REF) followed by a variable number of columns, each named after a sample in your sample set. Every probe id in the ID_REF column must also appear in the Probe_ID column of your feature annotation file, and every other column name must match a sample in your sample set.

Before you import run data, you must import your sample set and your feature annotation set. The run import will fail if the server cannot find a match for an ID_REF value or for a sample column name.
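Because a mismatch makes the whole import fail, it can be worth checking a run file locally first. The following is a minimal sketch, not part of LabKey Server: the function name `check_run_file` and the sample values are hypothetical, and it assumes you already have the probe ids and sample names available as Python sets.

```python
import csv
import io


def check_run_file(run_tsv, probe_ids, sample_names):
    """Return (unknown_samples, unknown_probes) for a run file given as TSV text."""
    reader = csv.reader(io.StringIO(run_tsv), delimiter="\t")
    header = next(reader)
    if not header or header[0] != "ID_REF":
        raise ValueError("first column must be ID_REF")
    # Every column after ID_REF must name a sample in the sample set.
    unknown_samples = [name for name in header[1:] if name not in sample_names]
    # Every ID_REF value must appear as a Probe_ID in the annotation set.
    unknown_probes = [row[0] for row in reader if row and row[0] not in probe_ids]
    return unknown_samples, unknown_probes


# Made-up data for illustration only.
run_tsv = "ID_REF\tGSM280331\tGSM280332\n1007_s_at\t7.17\t7.32\n1053_at\t5.01\t4.98\n"
unknown_samples, unknown_probes = check_run_file(
    run_tsv,
    probe_ids={"1007_s_at"},
    sample_names={"GSM280331"},
)
print(unknown_samples)  # ['GSM280332']
print(unknown_probes)   # ['1053_at']
```

Empty lists for both results suggest the run file is consistent with the sample set and annotation set you checked it against.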

Set up the Expression Matrix Assay

  • Create a new folder of type Microarray.
  • Add the Sample Sets web part to the Microarray Dashboard tab.
  • Click the Import Sample Set button.
  • On the Import Sample Set page, name your sample set ExpressionMatrixSamples.
  • In the sample set data text area, paste in a TSV of all your samples.
  • In the Id Columns section, make the appropriate Name column an ID column.
  • Save your sample set.
  • Return to the Microarray Dashboard.
  • Add a Feature Annotation Sets web part at the bottom of the left column.
  • Click Import Feature Annotation Set.
    • Enter the name, vendor, description, and folder.
    • Browse to select the annotation file. The file can come from any manufacturer (e.g., Illumina or Affymetrix), but must be a TSV file with the following column headers:
      • Probe_ID
      • Gene_Symbol
      • UniGene_ID
      • Gene_ID
      • Accession_ID
      • RefSeq_Protein_ID
      • RefSeq_Transcript_ID
    • Click Upload.
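Before uploading, you may want to confirm that an annotation file carries all of the required column headers. This is a small sketch under stated assumptions: the helper `missing_annotation_columns` is hypothetical, the sample rows are invented, and it checks only the header row, not the content of the file.

```python
import csv
import io

# Column headers required by the feature annotation set import, as listed above.
REQUIRED_COLUMNS = [
    "Probe_ID", "Gene_Symbol", "UniGene_ID", "Gene_ID",
    "Accession_ID", "RefSeq_Protein_ID", "RefSeq_Transcript_ID",
]


def missing_annotation_columns(tsv_text):
    """Return the required column headers that are absent from the TSV header row."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    header = [name.strip() for name in next(reader, [])]
    return [name for name in REQUIRED_COLUMNS if name not in header]


# Invented annotation content, for illustration only.
sample = "Probe_ID\tGene_Symbol\tGene_ID\nprobe_1\tExampleGene\t12345\n"
missing = missing_annotation_columns(sample)
print(missing)
# ['UniGene_ID', 'Accession_ID', 'RefSeq_Protein_ID', 'RefSeq_Transcript_ID']
```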

Create a New Assay Design

  • Select the ExpressionMatrix assay type
  • Name your assay and save it

Import a Run

Run files are in TSV format and have a variable number of columns.

  • The first column is always ID_REF and contains probe ids that match the Probe_ID column of your feature annotation set.
  • The remaining columns are named after samples from your imported sample set (ExpressionMatrixSamples).

An example of column headers:

ID_REF GSM280331 GSM280332 GSM280333 GSM280334 GSM280335 GSM280336 GSM280337 GSM280338 ...

An example of row data:

1007_s_at 7.1722616266753 7.3191207236008 7.32161337343459 7.31420082996567 7.13913363545954 ...
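A run file in this shape can often be derived from a GEO series matrix file. The sketch below assumes the usual GEO layout, in which metadata lines start with "!" and the expression table sits between the `!series_matrix_table_begin` and `!series_matrix_table_end` markers with quoted identifiers. The function name `series_matrix_to_run_tsv` is hypothetical, the input is a shortened made-up example, and any further transformation your data needs is not covered.

```python
import csv
import io


def series_matrix_to_run_tsv(series_matrix_text):
    """Extract the expression table from GEO series matrix text as plain TSV."""
    in_table = False
    table_lines = []
    for line in series_matrix_text.splitlines():
        marker = line.strip().lower()
        if marker == "!series_matrix_table_begin":
            in_table = True
        elif marker == "!series_matrix_table_end":
            break
        elif in_table and line.strip():
            table_lines.append(line)
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    # csv.reader removes the double quotes GEO places around identifiers.
    for row in csv.reader(table_lines, delimiter="\t"):
        writer.writerow(row)
    return out.getvalue()


# Shortened, made-up series matrix content for illustration only.
geo_text = (
    '!Series_title\t"Hypothetical series"\n'
    "!series_matrix_table_begin\n"
    '"ID_REF"\t"GSM280331"\t"GSM280332"\n'
    '"1007_s_at"\t7.17\t7.32\n'
    "!series_matrix_table_end\n"
)
run_tsv = series_matrix_to_run_tsv(geo_text)
print(run_tsv)
```

The result starts with the ID_REF header row, followed by one row per probe id.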

To import a run:

  • Navigate to your ExpressionMatrix assay
  • Import run data

Note: Importing a run may take a long time, because a single run typically contains millions of rows of data.

View Run Results

After the run is imported, to view the results:

  • Click the file name in the runs grid

There is also an alternative view of the run data, which is pivoted to have a column for each sample and a row for each probe id. To view the data as a pivoted grid:

  • Select Admin > Developer Links > Schema Browser
  • Browse to Assay > ExpressionMatrix > [YOUR_ASSAY_NAME] > FeatureDataBySample
