The expression matrix assay ties expression-level information to sample and feature/probe information. After appropriate files are loaded into the system, users can explore microarray results by building queries and visualizations based on feature/probe properties (such as genes) and sample properties.

Expression data may be manually extracted from Gene Expression Omnibus (GEO), transformed, and imported to LabKey Server.
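The transformation step can be sketched in Python. This is a minimal illustration, not a LabKey utility: it assumes the GEO series matrix download marks its data table with `!series_matrix_table_begin` and `!series_matrix_table_end` lines, prefixes other metadata lines with `!`, and wraps identifiers in double quotes. The function name `series_matrix_to_tsv` is hypothetical.

```python
import csv
import io


def series_matrix_to_tsv(text):
    """Return only the data table of a GEO series matrix file as TSV text.

    Assumes the table sits between the '!series_matrix_table_begin' and
    '!series_matrix_table_end' marker lines; everything else is dropped.
    """
    rows = []
    in_table = False
    for line in text.splitlines():
        if line.startswith("!series_matrix_table_begin"):
            in_table = True
            continue
        if line.startswith("!series_matrix_table_end"):
            break
        if in_table and line.strip():
            # csv.reader strips the double quotes around ids and headers.
            rows.append(next(csv.reader([line], delimiter="\t")))
    out = io.StringIO()
    csv.writer(out, delimiter="\t", lineterminator="\n").writerows(rows)
    return out.getvalue()
```

The result starts with the ID_REF header row, which is the shape the run import described below expects.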

Files loaded include:
  • Metadata about features/probes (typically at the plate level)
  • Sample information
  • Actual expression data (often called a "series matrix" file)

Review File Formats

In order to use the assay, you will need three sets of data: a run file, a sample set, and a feature annotation file.

The run file has one column of probe ids (ID_REF) followed by a variable number of sample columns. Each value in the ID_REF column must match a probe id in the Probe_ID column of your feature annotation file. Every other column in the run file is named after a sample, and each of those names must exist in your sample set.
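These relationships can be checked before you attempt an import. The sketch below is a hypothetical pre-flight check, not part of LabKey Server: `check_run_file` and its parameters are invented for illustration, and it assumes you have already collected the Probe_ID values and the sample names into Python sets.

```python
import csv
import io


def check_run_file(run_tsv, probe_ids, sample_names):
    """Return a list of problems that would likely block a run import.

    run_tsv      -- contents of the run file as text
    probe_ids    -- set of Probe_ID values from the feature annotation file
    sample_names -- set of sample names from the sample set
    """
    reader = csv.reader(io.StringIO(run_tsv), delimiter="\t")
    header = next(reader)
    problems = []
    if header[0] != "ID_REF":
        problems.append(f"first column is {header[0]!r}, expected 'ID_REF'")
    # Every remaining column must be named after a known sample.
    for name in header[1:]:
        if name not in sample_names:
            problems.append(f"column {name!r} is not in the sample set")
    # Every probe id must appear in the annotation file.
    for row in reader:
        if row and row[0] not in probe_ids:
            problems.append(f"probe id {row[0]!r} is not in the annotation file")
    return problems
```

An empty list means the run file is consistent with the two sets you passed in. It does not guarantee that the server will accept the file.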

Before you can import run data, you must import your sample set and your feature annotation set. The run import will fail if the server cannot find a match for an ID_REF value in the feature annotation set, or for a sample column in your sample set. If you don't have current files, you can use these small example files:

Set Up the Folder

  • Log in to your server and navigate to your "Tutorials" project. Create it if necessary.
    • If you don't already have a server to work on where you can create projects, start here.
    • If you don't know how to create projects and folders, review this topic.
  • Create a new folder named "Expression Matrix Tutorial". Choose the folder type "Assay."
  • Select (Admin) > Folder > Management and click the Folder Type tab.
  • Check the box for Microarray and click Update Folder.
  • Add a Sample Sets web part on the left.
  • Add a Feature Annotation Sets web part, also on the left.

Define Sample Set

  • Click the (Create New Sample Set) button.
  • On the Import Sample Set page, name your sample set. Here we use ExpressionMatrixSamples.
  • In the Name Expression field, enter this, telling the set to use the ID_REF column as the unique identifier for samples:
    ${ID_REF}
  • Click Create.
  • On the next page, use Add Field to create all the fields that will match your spreadsheet. Three built-in fields (Name, Description, and Flag), all of type Text, are always created and should not be included on this list. For each field, enter a name without spaces and select a data type.
  • For our sample, enter:
    • ID_REF - Text
    • SampleA - Integer
    • SampleB - Integer
  • Scroll down and click Save.

Import Sample Information

Now that you have created the new sample set, you will see the name in the Sample Sets web part.

  • Click ExpressionMatrixSamples to open it.
  • Click Import More Samples.
  • Into the Data area, paste in a TSV of all your samples (or you can click Upload file... and upload the file directly).
  • Click Submit.

Add Feature Annotation Set

  • Return to the main folder page by clicking the Expression Matrix Tutorial link near the top.
  • In the Feature Annotation Sets web part, click Import Feature Annotation Set.
    • Enter the Name: Feature Annotations 1
    • Enter the Vendor: Vendor 1
    • For Folder: Select the current folder.
    • Browse to select the annotation file. (Or use the provided file sample_feature_annotation_set.txt.) These can be from any manufacturer (e.g., Illumina or Affymetrix), but must be a TSV file with the following column headers:
      Probe_ID
      Gene_Symbol
      UniGene_ID
      Gene_ID
      Accession_ID
      RefSeq_Protein_ID
      RefSeq_Transcript_ID
    • Click Upload.
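Before uploading, you can confirm that an annotation file carries the headers listed above. This is a minimal sketch under stated assumptions: `missing_annotation_headers` is a hypothetical helper, and it checks only that each header is present, since this topic does not say whether column order matters.

```python
import csv
import io

# Column headers listed in this topic for feature annotation files.
REQUIRED_HEADERS = [
    "Probe_ID", "Gene_Symbol", "UniGene_ID", "Gene_ID",
    "Accession_ID", "RefSeq_Protein_ID", "RefSeq_Transcript_ID",
]


def missing_annotation_headers(tsv_text):
    """Return the required headers absent from the file's first row."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    # Strip stray whitespace so 'Probe_ID ' still counts as 'Probe_ID'.
    header = [h.strip() for h in next(reader, [])]
    return [h for h in REQUIRED_HEADERS if h not in header]
```

An empty result means all seven headers were found in the first row of the file.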

Create an Expression Matrix Assay Design

  • In the Assay List web part, click New Assay Design.
  • Select the Expression Matrix assay type.
  • Scroll down to select the Assay Location (for our samples, use the current folder).
  • Click Next.
  • Name your assay and adjust any fields if needed.
  • Click Save.

Import a Run

Runs will be in the TSV format and have a variable number of columns.

  • The first column will always be ID_REF, which will contain a probe id that matches the Probe_ID column from your feature annotation set.
  • The rest of the columns will be for samples from your imported sample set (ExpressionMatrixSamples).

An example of column headers:

ID_REF GSM280331 GSM280332 GSM280333 GSM280334 GSM280335 GSM280336 GSM280337 GSM280338 ...

An example of row data:

1007_s_at 7.1722616266753 7.3191207236008 7.32161337343459 7.31420082996567 7.13913363545954 ...

To import a run:

  • Navigate to the expression matrix assay you just created. (Click the name in the Assay List from the main page of your folder.)
  • Click Import Data.
  • Select the appropriate Feature Annotation Set.
  • Click Choose File and navigate to your series matrix file (or use the provided example file series_matrix.tsv).
  • Click Save and Finish to begin the import.

Note: Importing a run may take a very long time, because a run typically contains millions of rows of data. The Run Properties options include a checkbox named Import Values. If checked, the values for the run are imported normally. If unchecked, the values are not imported to the server, but links between the series matrix, samples, and annotations are preserved.

View Run Results

After the run is imported, to view the results:

  • Click the file name (AssayId) in the runs grid.

There is also an alternative view of the run data, which is pivoted to have a column for each sample and a row for each probe id. To view the data as a pivoted grid:

  • Select (Admin) > Go to Module > Query
  • Browse to assay > ExpressionMatrix > [YOUR_ASSAY_NAME] > FeatureDataBySample.
  • Click View Data.
  • You can add this query to the dashboard using a Query web part.
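The shape of the pivoted grid can be illustrated with a short Python sketch. It shows only the reshaping idea and makes no claim about how the server builds FeatureDataBySample; `pivot_by_sample` is a hypothetical function, and it assumes the unpivoted results hold one (probe id, sample, value) entry per row.

```python
from collections import defaultdict


def pivot_by_sample(long_rows):
    """Pivot (probe_id, sample, value) rows into one row per probe id.

    Returns (header, table): header is ['ID_REF', <sample names...>] and
    each table row holds the probe id followed by one value per sample
    (None where a probe has no value for that sample).
    """
    samples = sorted({sample for _, sample, _ in long_rows})
    by_probe = defaultdict(dict)
    for probe_id, sample, value in long_rows:
        by_probe[probe_id][sample] = value
    header = ["ID_REF"] + samples
    table = [
        [probe_id] + [values.get(s) for s in samples]
        for probe_id, values in sorted(by_probe.items())
    ]
    return header, table
```

For example, three entries covering two probes and two samples produce a two-row table with a column for each sample.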
