LabKey Server helps researchers automate high-volume flow cytometry analyses, integrate the results with many kinds of biomedical research data, and securely share both data and analyses. The system is designed to manage large data sets from standardized assays that span many instrument runs and share a common gating strategy. It enables quality control and statistical positivity analysis over data sets that are too large to manage effectively using PC-based solutions.
An investigator first defines a gate template for an entire study using FlowJo, and uploads the FlowJo workspace to LabKey. He or she then points LabKey Flow to a repository of FCS files.
Once the data has been imported, LabKey Server starts an analysis, computes the compensation matrix, applies gates, calculates statistics, and generates graphs. Results are stored in a relational database and displayed using secure, interactive web pages.
Researchers can define custom queries and views to analyze large result sets. Gate templates can be modified, and new analyses can be run and compared.
To get started, see the introductory flow tutorial: Tutorial: Explore a Flow Workspace
LabKey Server enables high-throughput analysis for several types of assays, including flow cytometry assays. LabKey’s flow cytometry solution provides a high-throughput pipeline for processing flow data. In addition, it delivers a flexible repository for data, analyses and results.
Traditionally, analysis of flow cytometry data begins with the download of FCS files from a flow cytometer. Once these files are saved to a network share, a technician loads the FCS files into a new FlowJo workspace, draws a gating hierarchy and adds statistics. The product of this work is a set of graphs and statistics used for further downstream analysis. This process continues for multiple plates. When analysis of the next plate of samples is complete, the technician loads the new set of FCS files into the same workspace.
Moderate volumes of data can be analyzed successfully using FlowJo alone; however, scaling up can prove challenging. As more samples are added to the workspace, the analysis process described above becomes quite slow. Saving separate sets of sample runs into separate workspaces does not provide a good solution because it is difficult to manage the same analysis across multiple workspaces. Additionally, looking at graphs and statistics for all the samples becomes increasingly difficult as more samples are added.
LabKey Server can help you scale up your data analysis process in two ways: by streamlining data processing and by serving as a flexible data repository. When your data are relatively homogeneous, you can use your LabKey Server to apply an analysis script generated by FlowJo to multiple runs. When your data are too heterogeneous for analysis by a single script, you can use your LabKey Server as a flexible data repository for large numbers of analyses generated by FlowJo workspaces. Both of these options help you speed up and consolidate your work.
Analyses performed using FlowJo can be imported into LabKey where they can be refined, quality controlled, and integrated with other related data. The statistics calculated by FlowJo are read upon import from the workspace.
Graphs are generated for each sample and saved into the database. Note that graphs shown in LabKey are recreated from the Flow data using a different analysis engine than FlowJo uses. They are intended to give a rough 'gut check' of accuracy of the data and gating applied, but are not straight file copies of the graphs in FlowJo.
Extra information can be linked to the run after the run has been imported via either LabKey Flow or FlowJo. Sample information uploaded from an Excel spreadsheet can also be joined to the well. Background wells can then be used to subtract background values from sample wells. Information on background wells is supplied through metadata.
You can use LabKey Server exclusively as a data repository and import results directly from a FlowJo workspace, or create an analysis script from a FlowJo workspace to apply to multiple runs. The dashboard will present relevant tasks and summaries.
The LabKey Flow module automates high-volume flow cytometry analysis. It is designed to manage large data sets from standardized assays spanning many instrument runs that share a common gating strategy.
To begin using LabKey Flow, an investigator first defines a gate template for an entire study using FlowJo, and uploads the FlowJo workspace to LabKey Server. He or she then points LabKey Flow to a repository of FCS files on a network file server, and starts an analysis.
LabKey Flow computes the compensation matrix, applies gates, calculates statistics, and generates graphs. Results are stored in a relational database and displayed using secure, interactive web pages.
Researchers can then define custom queries and views to analyze large result sets. Gate templates can be modified, and new analyses can be run and compared. Results can be printed, emailed, or exported to tools such as Excel or R for further analysis. LabKey Flow enables quality control and statistical positivity analysis over data sets that are too large to manage effectively using PC-based solutions.
If you are using FlowJo version 10.8, you will need to upgrade to LabKey Server version 21.7.3 (or later) in order to properly handle "NaN" (Not a Number) in statistic values. Contact your Account Manager for details.
Follow the steps in this topic to set up a "Flow Tutorial" folder. You will set up and import a basic workspace that you can use for two of the flow tutorials, listed at the bottom of this topic.
More details and options for importing workspaces are covered in the topic: Import a Flow Workspace and Analysis. To simply set up your environment for a tutorial, follow these steps:
When the files are in the same folder as the workspace file, and all files are already associated with samples present in the workspace, you will skip the next two panels of the import wizard. If you need to adjust anything about either step, you can use the 'Back' option after this accelerated import. Learn about completing these steps in this topic: Import a Flow Workspace and Analysis.
You are now set up to explore the Flow features and tutorials in your folder.
This tutorial teaches you how to:
The main Flow dashboard displays the following web parts by default:
In the flow workspace, statistics column names are of the form "subset:stat". For example, "Lv/L:%P" is used for the "Live Lymphocytes" subset and the "percent of parent" statistic.
Graphs are listed with parentheses and are of the form "subset(x-axis:y-axis)". For example, "Singlets(FSC-A:SSC-A)" for the "Singlets" subset showing forward and side scatter.
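The two naming schemes can be told apart mechanically: statistic names contain a colon separating subset and statistic, while graph names end with a parenthesized axis pair. A small sketch (the helper name is ours, not a LabKey API; statistic names that themselves contain parentheses are not handled here):

```python
import re

def parse_flow_column(name):
    """Classify a flow column name as a statistic ("subset:stat")
    or a graph ("subset(x-axis:y-axis)"). Illustrative helper only."""
    graph = re.fullmatch(r"(.*)\((.+):(.+)\)", name)
    if graph:
        subset, x_axis, y_axis = graph.groups()
        return {"kind": "graph", "subset": subset, "x": x_axis, "y": y_axis}
    subset, _, stat = name.rpartition(":")
    return {"kind": "statistic", "subset": subset, "stat": stat}

# Examples from this workspace:
parse_flow_column("Lv/L:%P")                # Live Lymphocytes, percent of parent
parse_flow_column("Singlets(FSC-A:SSC-A)")  # Singlets graph, forward vs. side scatter
```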
The columns displayed by default for a dataset are not necessarily the ones you are most interested in, so you can customize which columns are included in the default grid. See Customize Grid Views for general information about customizing grids.
In this first tutorial step, we'll show how you might remove one column, add another, and save this as the new default grid. This topic also explains the column naming used in this sample flow workspace.
You will now see the "Compensation Matrix" column is gone, and the "Singlets:%P" column is shown in the grid.
Notice that the graph columns listed as "selected" in the grid customizer are not shown as columns. The next step will cover displaying graphs.
In this step we will examine our data graphs. As you saw in the previous step, graphs are selected within the grid customizer but are not shown by default.
See a similar online example.
The following pages provide other views and visualizations of the flow data.
Detailed statistics and graphs for each individual well can be accessed for any run. In this tutorial step, we review the details available.
You can see a similar online example here.
Before you export your dataset, customize your grid to show the columns you want to export. For greater control of the columns included in a view, you can also create custom queries. Topics available to assist you:
After you have finalized your grid, you can export the displayed table to an Excel spreadsheet, a text file, a script, or an analysis archive. Here we show an Excel export; exporting an analysis archive is covered in its own topic. Note that export directly to Excel limits the number of rows. If you need to work around this limitation to export larger datasets, first export to a text file, then open the text file in Excel.
Quality control reports in the flow module can give you detailed insights into the statistics, data, and performance of your flow data, helping you to spot problems and find solutions. Monitoring controls within specified standard deviations can be done with Levey-Jennings plots. To generate a report:
The report displays results over time, followed by a Levey-Jennings plot with standard deviation guide marks. TSV format output is shown below the plots.
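The guide marks on a Levey-Jennings plot are the mean plus or minus multiples of the standard deviation. A sketch of that computation (illustrative only; the flow module renders these reports for you, and the function name here is ours):

```python
from statistics import mean, stdev

def levey_jennings_limits(values, n_sd=3):
    """Center line and +/- k*SD guide marks for a Levey-Jennings plot."""
    m, s = mean(values), stdev(values)
    return {"mean": m, "sd": s,
            "limits": {k: (m - k * s, m + k * s) for k in range(1, n_sd + 1)}}

# Flag observations falling outside the 3 SD guide marks (hypothetical values):
history = [12.1, 11.8, 12.4, 12.0, 11.9, 12.2]
qc = levey_jennings_limits(history)
low3, high3 = qc["limits"][3]
outliers = [v for v in history + [13.9] if not (low3 <= v <= high3)]
```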
You have now completed the Tutorial: Explore a Flow Workspace. To explore more options using this same sample workspace, try this tutorial next:
If you have a flow cytometry experiment containing a background control (such as an unstimulated sample, an unstained sample, or a Fluorescence Minus One (FMO) control), you can set the background in LabKey for use in analysis. To perform this step we need to:
To gain familiarity with the basics of the flow dashboard, data grids, and using graphs, you can complete the Tutorial: Explore a Flow Workspace first. It uses the same setup and sample data as this tutorial.
Sample descriptions give the flow module information about how to interpret a group of FCS files using keywords. The flow module uses a sample type named "Samples" which you must first define and structure correctly. By default it will be created at the project level, and thus shared with all other folders in the project, but there are several options for sharing sample definitions. Note that it is important not to rename this sample type. If you notice "missing" samples or are prompted to define it again, locate the originally created sample type and restore the "Samples" name.
Download these two files to use:
${TubeName}
You have now defined the sample type you need and will upload the actual sample information next.
Once it is uploaded, you will see the sample type.
You will see how many FCS Files could be linked to samples. The number of files linked to samples may be different from the number of sample descriptions since multiple files can be linked to any given sample.
To see the grid of which samples were linked to which files:
Keywords imported from FlowJo can be used to link samples, as shown above. If you want to add additional information within LabKey, you can do so using additional keywords.
If some of the samples were of another type, you could repeat the "Edit Keywords" process for the subset of rows of that other type, entering the same "SampleType" keyword and the alternate value.
Setting metadata including participant and visit information for samples makes it possible to integrate flow data with other data about those participants.
Background and foreground match columns are used to identify the group -- using subject and timepoint is one option.
Background setting: identifies which well or wells within the group are background. If more than one well in the group is identified as background, the background values are averaged.
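As a sketch of the averaging behavior (illustrative names and values; not LabKey code):

```python
from statistics import mean

def subtract_background(wells):
    """wells: list of dicts with 'name', 'value', and 'is_background'.
    Background wells in the group are averaged, then the average is
    subtracted from each sample well."""
    bg = mean(w["value"] for w in wells if w["is_background"])
    return {w["name"]: w["value"] - bg for w in wells if not w["is_background"]}

group = [
    {"name": "unstim_1", "value": 0.10, "is_background": True},
    {"name": "unstim_2", "value": 0.20, "is_background": True},
    {"name": "stim_1",   "value": 1.25, "is_background": False},
]
result = subtract_background(group)  # background averages to 0.15
```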
Return to the view customizer to make the source of background data visible:
Display graphs by selecting Show Graphs > Inline as shown here:
You have now completed the tutorial and can use this process with your own data to analyze data against your own defined background.
Once you have set up the folder and uploaded the FCS files, you can import a FlowJo workspace and then use LabKey Server to extract data and statistics of interest.
This topic uses a similar example workspace as in the Set Up a Flow Folder walkthrough, but includes intentional mismatches to demonstrate the full import wizard in more detail.
Warnings (2):
Sample 118756.fcs (286): 118756.fcs: S/L/-: Count statistic missing
Sample 118756.fcs (286): 118756.fcs: S/L/FITC CD4+: Count statistic missing
The tutorial does not contain any such warnings, but if you did see them with your own data and needed to import these statistics, you would have to go back to FlowJo, re-calculate the missing statistics, and then save as xml again.
2. Select FCS Files

If you completed the flow tutorial, you may have experienced an accelerated import wizard which skipped steps. If the wizard cannot be accelerated, you will see a message indicating the reason. In this case, the demo files package includes an additional file that is not included in the workspace file. You can proceed, or cancel and make adjustments if you expected a 1:1 match.
To proceed:
When importing analysis results from a FlowJo workspace or an external analysis archive, the Flow Module will attempt to find a previously imported FCS file to link the analysis results to.
The matching algorithm compares the imported sample from the FlowJo workspace or external analysis archive against previously imported FCS files using the following properties and keywords: FCS file name or FlowJo sample name, $FIL, GUID, $TOT, $PAR, $DATE, $ETIM. Each of the seven comparisons is weighted equally. Currently, the minimum number of required matches is 2 -- for example, if only $FIL matches and nothing else does, there is no match.
While calculating the comparisons for each imported sample, the highest number of matching comparisons is remembered. Once complete, if there is only a single FCS file that has the max number of matching comparisons, it is considered a perfect match. The import wizard resolver step will automatically select the perfectly matching FCS file for the imported sample (they will have the green checkmark). As long as each FCS file can be uniquely matched by at least two comparisons (e.g., GUID and the other keywords), the import wizard should automatically select the correct FCS files that were previously imported.
If there are no exact matches, the imported sample will not be automatically selected (red X mark in the wizard) and the partially matching FCS files will be listed in the combo box ordered by number of matches.
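The resolver logic described above can be sketched as follows (illustrative only; the function and key names are ours, not LabKey's implementation):

```python
def match_fcs_files(sample, candidates, min_matches=2):
    """Compare seven equally weighted properties; a candidate is auto-selected
    only if it uniquely has the highest score and at least `min_matches`
    properties agree."""
    keys = ["name", "$FIL", "GUID", "$TOT", "$PAR", "$DATE", "$ETIM"]
    scored = [(sum(1 for k in keys
                   if sample.get(k) is not None and sample.get(k) == c.get(k)), c)
              for c in candidates]
    best = max(score for score, _ in scored)
    top = [c for score, c in scored if score == best]
    if best >= min_matches and len(top) == 1:
        return top[0]   # unique best match: auto-selected (green checkmark)
    return None         # ambiguous or too few matches (red X in the wizard)
```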
The names of Statistics and Graphs in the imported workspace cannot be longer than 400 characters. FlowJo may support longer names, but they cannot be imported into the LabKey Flow module. Names that exceed this limit will generate an import error similar to:
11 Jun 2021 16:51:21,656 ERROR: FlowJo Workspace import failed
org.labkey.api.query.RuntimeValidationException: name: Value is too long for column 'SomeName', a maximum length of 400 is allowed. Supplied value was 433 characters long.
Keywords imported from FlowJo can be used to link samples and provide metadata. This topic describes how you can edit the values assigned to those keywords and also add additional keywords within LabKey to store additional information.
You can edit keyword input values either individually or in bulk.
Note that if you want to add a new keyword for all rows, but set different values, you would perform multiple rounds of edits. First select all rows to add the new keyword, providing a default/common temporary value for all rows. Then select the subsets of rows with other values for that keyword and set a new value.
New keywords are not always automatically shown in the data grid. To add the new keyword column to the grid:
You can associate sample descriptions with flow data and associate sample columns with FCS keywords as described in this topic.
The flow module uses a sample type named "Samples". You cannot change this name expectation, and the type's properties and fields must be defined and available in your folder before you can upload or link to sample descriptions. Each folder container could have a different definition of the "Samples" sample type, or the definition could be shared at the project or site level.
For example, you might have a given set of samples and need to run a series of flow panels against the same samples in a series of subfolders. Or you might always have samples with the same properties across a site, though each project folder has a unique set of samples.
If you define a "Samples" sample type in the Shared project, you will be able to use the definition in any folder on the site. Each folder will have a distinct set of samples local to the container, i.e. any samples themselves defined in the Shared project will not be exposed in any local flow folder.
Follow the steps below to:
If you define a "Samples" sample type in the top level project, you will be able to use the definition in all subfolders of that project at any level. Each folder will also be able to share the samples defined in the project, i.e. you won't have to import sample descriptions into the local flow folder.
Follow the steps below to:
To create a "Samples" sample type in a folder where it does not already exist (and where you do not want to share the definition), follow these steps.
When finished, click Save to create the sample type.
You will return to the main dashboard where the link Upload Sample Descriptions now reads Upload More Samples.
Once the samples are defined, either by uploading locally, or at the project-level, you can associate the samples with FCS files using one or more sample join fields. These are properties of the sample that need to match keywords of the FCS files.
Sample Property | FCS Property |
---|---|
"TubeName" | "Name" |
Return to the flow dashboard and click the link ## sample descriptions (under "Assign additional meanings to keywords"). You will see which Samples and FCSFiles could be linked, as well as the values used to link them.
Scroll down for sections of unlinked samples and/or files, if any. Reviewing the values for any entries here can help troubleshoot any unexpected failures to link and identify the right join fields to use.
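The linking described above behaves like a keyword join. A rough sketch (illustrative only; LabKey performs this matching on the server):

```python
def link_samples_to_files(samples, fcs_files, join_fields):
    """Join sample descriptions to FCS files on matching keyword values,
    e.g. join_fields={"TubeName": "Name"}. One sample description may
    link to several FCS files."""
    linked = {}
    for fcs in fcs_files:
        for sample in samples:
            if all(sample.get(prop) == fcs.get(keyword)
                   for prop, keyword in join_fields.items()):
                key = tuple(sample.get(prop) for prop in join_fields)
                linked.setdefault(key, []).append(fcs["Name"])
    return linked
```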
You will now see a new Sample column in the FCSFile table and can add it to your view:
This topic covers information helpful in writing some flow-specific queries. Users who are new to custom queries should start with this section of the documentation:
LabKey SQL provides the "Statistic" method on FCS tables to allow calculation of certain statistics for FCS data. For this example, we create a query called "StatisticDemo" based on the FCSAnalyses dataset. Start from your Flow demo folder, such as the one created during the Flow Tutorial.
The default SQL simply selects all the columns:
SELECT FCSAnalyses.Name,
FCSAnalyses.Flag,
FCSAnalyses.Run,
FCSAnalyses.CompensationMatrix
FROM FCSAnalyses
Modify the SQL to add a "Count" statistic column:

SELECT FCSAnalyses.Name,
FCSAnalyses.Flag,
FCSAnalyses.Run,
FCSAnalyses.CompensationMatrix,
FCSAnalyses.Statistic."Count"
FROM FCSAnalyses
You can flip back and forth between the source, data, and xml metadata for this query using the tabs in the query editor.
From the "Source" tab, to see the generated query, either view the "Data" tab, or click Execute Query. To leave the query editor, click Save & Finish.
The resulting table includes the "Count" column on the right:
View this query applied to a more complex dataset. The dataset used in the Flow Tutorial has been slimmed down for ease of use. A larger, more complex dataset can be seen in this table:
It is possible to calculate a suite of statistics for every well in an FCS file using an INNER JOIN technique in conjunction with the "Statistic" method. This technique can be complex, so we present an example to provide an introduction to what is possible.
For this example, we use the FCSAnalyses table in the Peptide Validation Demo. We create a query called "SubsetDemo" using the "FCSAnalyses" table in the "flow" schema and edit it in the SQL Source Editor.
SELECT
FCSAnalyses.FCSFile.Run AS ASSAYID,
FCSAnalyses.FCSFile.Sample AS Sample,
FCSAnalyses.FCSFile.Sample.Property.PTID,
FCSAnalyses.FCSFile.Keyword."WELL ID" AS WELL_ID,
FCSAnalyses.Statistic."Count" AS COLLECTCT,
FCSAnalyses.Statistic."S:Count" AS SINGLETCT,
FCSAnalyses.Statistic."S/Lv:Count" AS LIVECT,
FCSAnalyses.Statistic."S/Lv/L:Count" AS LYMPHCT,
FCSAnalyses.Statistic."S/Lv/L/3+:Count" AS CD3CT,
Subsets.TCELLSUB,
FCSAnalyses.Statistic(Subsets.STAT_TCELLSUB) AS NSUB,
FCSAnalyses.FCSFile.Keyword.Stim AS ANTIGEN,
Subsets.CYTOKINE,
FCSAnalyses.Statistic(Subsets.STAT_CYTNUM) AS CYTNUM
FROM FCSAnalyses
INNER JOIN lists.ICS3Cytokine AS Subsets ON Subsets.PFD IS NOT NULL
WHERE FCSAnalyses.FCSFile.Keyword."Sample Order" NOT IN ('PBS','Comp')
This SQL code leverages the FCSAnalyses table and a list of desired statistics to calculate those statistics for every well.
The "Subsets" table in this query comes from a user-created list called "ICS3Cytokine" in the Flow Demo. It contains the group of statistics we wish to calculate for every well.
Results are available in this table.
LabKey modules expose their data to the LabKey query engine in one or more schemas. This reference topic outlines the schema used by the Flow module to assist you when writing custom Flow queries.
The Flow schema has the following tables in it:
Runs Table

This table shows experiment runs for all three of the Flow protocol steps. It has the following columns:

Column | Description
---|---
RowId | A unique identifier for the run. When this column is used in a query, it is also a lookup back to the same row in the Runs table; including it allows the user to display columns from the Runs table that have not been explicitly SELECTed into the query.
Flag | The flag column, displayed as an icon which the user can use to add a comment to this run. The flag column is a lookup to a table which has a text column "comment". The icon appears different depending on whether the comment is null.
Name | The name of the run. In flow, the name of the run is always the name of the directory in which the FCS files were found.
Created | The date that this run was created.
CreatedBy | The user who created this run.
Folder | The folder or project in which this run is stored.
FilePathRoot | (hidden) The directory on the server's file system where this run's data files come from.
LSID | The life sciences identifier for this run.
ProtocolStep | The flow protocol step of this run: one of "keywords", "compensation", or "analysis".
RunGroups | A unique ID for this run.
AnalysisScript | The AnalysisScript that was used in this run. It is a lookup to the AnalysisScripts table, and will be null if the protocol step is "keywords".
Workspace |
CompensationMatrix | The compensation matrix that was used in this run. It is a lookup to the CompensationMatrices table.
TargetStudy |
WellCount | The number of FCSFiles that were either inputs or outputs of this run.
FCSFileCount |
CompensationControlCount |
FCSAnalysisCount |
CompensationMatrices Table

This table shows all of the compensation matrices that have either been calculated in a compensation protocol step or uploaded. It has the following columns:

Column | Description
---|---
RowId | A unique identifier for the compensation matrix.
Name | The name of the compensation matrix. Calculated compensation matrices have the same name as the run which created them; uploaded compensation matrices have a user-assigned name.
Flag | A flag column to allow the user to add a comment to this compensation matrix.
Created | The date the compensation matrix was created or uploaded.
Protocol | (hidden) The protocol that was used to create this compensation matrix. This will be null for uploaded compensation matrices; for calculated compensation matrices, it will be the child protocol "Compensation".
Run | The run which created this compensation matrix. This will be null for uploaded compensation matrices.
Value | A column set with the values of the compensation matrix. Compensation matrix values have names of the form "spill(channel1:channel2)".
In addition, the CompensationMatrices table defines a method Value which returns the corresponding spill value.
The following are equivalent:
CompensationMatrices.Value."spill(FL-1:FL-2)"
CompensationMatrices.Value('spill(FL-1:FL-2)')
The Value method would be used when the name of the statistic is not known when the QueryDefinition is created, but is found in some other place (such as a table with a list of spill values that should be displayed).
FCSFiles Table

The FCSFiles table lists all of the FCS files in the folder. It has the following columns:

Column | Description
---|---
RowId | A unique identifier for the FCS file.
Name | The name of the FCS file in the file system.
Flag | A flag column for the user to add a comment to this FCS file on the server.
Created | The date that this FCS file was loaded onto the server. This is unrelated to the date of the FCS file in the file system.
Protocol | (hidden) The protocol step that created this FCS file. It will always be the Keywords child protocol.
Run | The experiment run that this FCS file belongs to. It is a lookup to the Runs table.
Keyword | A column set for the keyword values. Keyword names are case sensitive. Keywords which are not present are null.
Sample | The sample description which is linked to this FCS file. If the user has not uploaded sample descriptions (i.e. defined the target table), this column will be hidden. This column is a lookup to the samples.Samples table.
In addition, the FCSFiles table defines a method Keyword which can be used to return a keyword value where the keyword name is determined at runtime.
FCSAnalyses Table

The FCSAnalyses table lists all of the analyses of FCS files. It has the following columns:

Column | Description
---|---
RowId | A unique identifier for the FCSAnalysis.
Name | The name of the FCSAnalysis. The name of an FCSAnalysis defaults to the same name as the FCSFile. This is a setting which may be changed.
Flag | A flag column for the user to add a comment to this FCSAnalysis.
Created | The date that this FCSAnalysis was created.
Protocol | (hidden) The protocol step that created this FCSAnalysis. It will always be the Analysis child protocol.
Run | The run that this FCSAnalysis belongs to. Note that FCSAnalyses.Run and FCSAnalyses.FCSFile.Run refer to different runs.
Statistic | A column set for statistics that were calculated for this FCSAnalysis.
Graph | A column set for graphs that were generated for this FCSAnalysis. Graph columns display nicely on LabKey, but their underlying value is not interesting. They are a lookup where the display field is the name of the graph if the graph exists, or null if the graph does not exist.
FCSFile | The FCSFile that this FCSAnalysis was performed on. This is a lookup to the FCSFiles table.
In addition, the FCSAnalyses table defines the methods Graph, and Statistic.
CompensationControls Table

The CompensationControls table lists the analyses of the FCS files that were used to calculate compensation matrices. Often (as in the case of a universal negative) multiple CompensationControls are created for a single FCS file. The CompensationControls table has the following columns:

Column | Description
---|---
RowId | A unique identifier for the compensation control.
Name | The name of the compensation control. This is the channel that it was used for, followed by either "+" or "-".
Flag | A flag column for the user to add a comment to this compensation control.
Created | The date that this compensation control was created.
Protocol | (hidden)
Run | The run that this compensation control belongs to. This is the run for the compensation calculation, not the run that the FCS file belongs to.
Statistic | A column set for statistics that were calculated for this compensation control.
Graph | A column set for graphs that were generated for this compensation control. The names of graphs for compensation controls are of the form "comp(channelName)" or "comp(<channelName>)"; the latter shows the post-compensation graph.
In addition, the CompensationControls table defines the methods Statistic and Graph.
AnalysisScripts Table

The AnalysisScripts table lists the analysis scripts in the folder. This table has the following columns:

Column | Description
---|---
RowId | A unique identifier for this analysis script.
Name | The user-assigned name of this analysis script.
Flag | A flag column for the user to add a comment to this analysis script.
Created | The date this analysis script was created.
Protocol | (hidden)
Run | (hidden)
Analyses Table

The Analyses table lists the experiments in the folder, with the exception of the one named Flow Experiment Runs. This table has the following columns:

Column | Description
---|---
RowId | A unique identifier.
LSID | (hidden)
Name |
Hypothesis |
Comments |
Created |
CreatedBy |
Modified |
ModifiedBy |
Container |
CompensationRunCount | The number of compensation calculations in this analysis. It is displayed as a hyperlink to the list of compensation runs.
AnalysisRunCount | The number of runs that have been analyzed in this analysis. It is displayed as a hyperlink to the list of those run analyses.
The LabKey flow module supports importing and exporting analyses as a series of .tsv and supporting files in a zip archive. The format is intended to be simple for tools to reformat the results of an external analysis engine for importing into LabKey. Notably, the analysis definition is not included in the archive, but may be defined elsewhere in a FlowJo workspace gating hierarchy, an R flowCore script, or be defined by some other software package.
From the flow Runs or FCSAnalysis grid, you can export the analysis results including the original FCS files, keywords, compensation matrices, and statistics.
To import a flow analysis archive, perhaps after making changes outside the server to add different statistics, graphs, or other information, follow these steps:
In brief, the archive format contains the following files:
<root directory>
├─ keywords.tsv
├─ statistics.tsv
│
├─ compensation.tsv
├─ <comp-matrix01>
├─ <comp-matrix02>.xml
│
├─ graphs.tsv
│
├─ <Sample Name 01>/
│   ├─ <graph01>.png
│ └─ <graph02>.svg
│
└─ <Sample Name 02>/
├─ <graph01>.png
└─ <graph02>.pdf
All analysis tsv files are optional. The keywords.tsv file lists the keywords for each sample. The statistics.tsv file contains summary statistic values for each sample in the analysis, grouped by population. The graphs.tsv file catalogs the graph images for each sample; the images may be in any format (pdf, png, svg, etc.). The compensation.tsv file catalogs the compensation matrices. To keep the directory listing clean, the graphs or compensation matrices may be grouped into sub-directories. For example, the graph images for each sample could be placed into a directory with the same name as the sample.
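As a sketch, a tool reformatting external results might emit a minimal statistics.tsv like this. The sample names and values are illustrative; the column layout follows the long format illustrated in the tables below:

```python
import csv
import io

# One (Sample, Population, Statistic, Value) row per statistic.
rows = [
    ("Sample1.fcs", "S/L/Lv/3+/4+/IFNg+IL2+", "Count", "12001"),
    ("Sample1.fcs", "S/L/Lv/3+/4+/IFNg+IL2+", "%P", "0.93"),
]

buf = io.StringIO()
writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
writer.writerow(["Sample", "Population", "Statistic", "Value"])
writer.writerows(rows)
tsv = buf.getvalue()  # save as statistics.tsv in the archive root
```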
The ACS container format is not sufficient for direct import to LabKey. The ACS table of contents only includes relationships between files and doesn’t include, for example, the population name and channel/parameter used to calculate a statistic or render a graph. If the ACS ToC could include those missing metadata, the graphs.tsv would be made redundant. The statistics.tsv would still be needed, however.
If you have analyzed results tsv files bundled inside an ACS container, you may be able to extract portions of the files for reformatting into the LabKey flow analysis archive zip format, but you would need to generate the graphs.tsv file manually.
Each statistic value is either an integer or a double. Count statistics are integer values >= 0. Percentage statistics are doubles in the range 0-100. Other statistics are doubles. If a statistic is not present for the given sample and population, it is left blank.
Short Name | Long Name | Parameter | Type |
---|---|---|---|
Count | Count | n/a | Integer |
% | Frequency | n/a | Double (0-100) |
%P | Frequency_Of_Parent | n/a | Double (0-100) |
%G | Frequency_Of_Grandparent | n/a | Double (0-100) |
%of | Frequency_Of_Ancestor | ancestor population name | Double (0-100) |
Min | Min | channel name | Double |
Max | Max | channel name | Double |
Median | Median | channel name | Double |
Mean | Mean | channel name | Double |
GeomMean | Geometric_Mean | channel name | Double |
StdDev | Std_Dev | channel name | Double |
rStdDev | Robust_Std_Dev | channel name | Double |
MAD | Median_Abs_Dev | channel name | Double |
MAD% | Median_Abs_Dev_Percent | channel name | Double (0-100) |
CV | CV | channel name | Double |
rCV | Robust_CV | channel name | Double |
%ile | Percentile | channel name and percentile 1-99 | Double (0-100) |
For example, the following rows show valid statistic names and values:
Sample | Population | Statistic | Value |
---|---|---|---|
Sample1.fcs | S/L/Lv/3+/4+/IFNg+IL2+ | %P | 0.85 |
Sample1.fcs | S/L/Lv/3+/4+/IFNg+IL2- | Count | 12001 |
Sample2.fcs | S/L/Lv/3+/{escaped/slash} | Median(FITC-A) | 23000 |
Sample2.fcs | S/L/Lv/3+/4+/IFNg+IL2+ | %ile(<Pacific-Blue>:30) | 0.93 |
Statistics may also be laid out in a wide format, with one column per population:statistic pair:
Sample | S/L/Lv/3+/4+/IFNg+IL2+:Count | S/L/Lv/3+/4+/IFNg+IL2+:%P | S/L/Lv/3+/4+/IFNg+IL2-:%ile(<Pacific-Blue>:30) | S/L/Lv/3+/4+/IFNg+IL2-:%P |
---|---|---|---|---|
Sample1.fcs | 12001 | 0.93 | 12314 | 0.24 |
Sample2.fcs | 13056 | 0.85 | 13023 | 0.56 |
Another layout uses one row per sample and population, with one column per statistic:
Sample | Population | Count | %P | Median(FITC-A) | %ile(<Pacific-Blue>:30) |
---|---|---|---|---|---|
Sample1.fcs | S/L/Lv/3+/4+/IFNg+IL2+ | 12001 | 0.93 | 45223 | 12314 |
Sample1.fcs | S/L/Lv/3+/4+/IFNg+IL2- | 12312 | 0.94 | 12345 | |
Sample2.fcs | S/L/Lv/3+/4+/IFNg+IL2+ | 13056 | 0.85 | 13023 | |
Sample2.fcs | S/L/Lv/{slash/escaped} | 3042 | 0.35 | 13023 | |
A third layout adds a Parameter column, so channel statistics (Median, %ile, etc.) share a column across parameters:
Sample | Population | Parameter | Count | %P | Median | %ile(30) |
---|---|---|---|---|---|---|
Sample1.fcs | S/L/Lv/3+/4+/IFNg+IL2+ | | 12001 | 0.93 | | |
Sample1.fcs | S/L/Lv/3+/4+/IFNg+IL2+ | FITC-A | | | 45223 | |
Sample1.fcs | S/L/Lv/3+/4+/IFNg+IL2+ | <Pacific-Blue> | | | | 12314 |
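Reading the long Sample/Population/Statistic/Value layout can be sketched with Python's csv module. This is a minimal illustration under the value rules above (Count is an integer, other statistics are doubles, blanks are skipped), not LabKey code:

```python
import csv
import io

# A minimal long-format statistics.tsv (tab-separated), per the format above.
TSV = (
    "Sample\tPopulation\tStatistic\tValue\n"
    "Sample1.fcs\tS/L/Lv/3+/4+/IFNg+IL2+\t%P\t0.85\n"
    "Sample1.fcs\tS/L/Lv/3+/4+/IFNg+IL2-\tCount\t12001\n"
)

def read_statistics(fileobj):
    """Yield one (sample, population, statistic, value) tuple per row.

    Count statistics parse as int, everything else as float; blank values
    (statistic not computed for that sample/population) are skipped.
    """
    for row in csv.DictReader(fileobj, delimiter="\t"):
        raw = row["Value"].strip()
        if not raw:
            continue
        value = int(raw) if row["Statistic"] == "Count" else float(raw)
        yield row["Sample"], row["Population"], row["Statistic"], value

rows = list(read_statistics(io.StringIO(TSV)))
```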
The graphs.tsv file is a catalog of plot images generated by the analysis. It is similar to the statistics file and lists the sample name, plot file name, and plot parameters. Currently, the only plot parameters included in graphs.tsv are the population and the x and y axes. The file contains one graph image per row. The population column is encoded in the same manner as in the statistics.tsv file. The graph column is the colon-concatenated x and y axes used to render the plot; compensated parameters are surrounded with <> angle brackets. (Future formats may split the x and y axes into separate columns to ease parsing.) The path is a relative file path to the image (no "." or ".." is allowed in the path), and the image name is usually just an MD5-sum of the graph bytes.
Multi-sample or multi-plot images are not yet supported.
Sample | Population | Graph | Path |
---|---|---|---|
Sample1.fcs | S/L/Lv/3+/4+/IFNg+IL2+ | <APC-A> | sample01/graph01.png |
Sample1.fcs | S/L/Lv/3+/4+/IFNg+IL2- | SSC-A:<APC-A> | sample01/graph02.png |
Sample2.fcs | S/L/Lv/3+/4+/IFNg+IL2+ | FSC-H:FSC-A | sample02/graph01.svg |
... |
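The MD5-based image naming mentioned above can be sketched as follows. The per-sample sub-directory naming is an assumption for illustration, and graph_entry is a hypothetical helper, not a LabKey function:

```python
import hashlib

def graph_entry(sample, population, graph, image_bytes, ext="png"):
    """Return (relative path, graphs.tsv row) for one plot image.

    The file name is the MD5 hex digest of the image bytes, stored under
    a per-sample sub-directory (directory naming assumed, not mandated).
    """
    digest = hashlib.md5(image_bytes).hexdigest()
    path = "%s/%s.%s" % (sample, digest, ext)
    return path, [sample, population, graph, path]

path, row = graph_entry("Sample1.fcs", "S/L/Lv/3+/4+/IFNg+IL2+",
                        "SSC-A:<APC-A>", b"\x89PNG...")
```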
The compensation.tsv file lists one compensation matrix path per sample:
Sample | Path |
---|---|
Sample1.fcs | compensation/matrix1 |
Sample2.fcs | compensation/matrix2.xml |
The keywords.tsv file lists one keyword/value pair per row:
Sample | Keyword | Value |
---|---|---|
Sample1.fcs | $MODE | L |
Sample1.fcs | $DATATYPE | F |
... |
For flow data to be added to a study, it must include participant/timepoint IDs or specimen IDs, which LabKey Server uses to align the data within a structured, longitudinal study. The topic below describes four available mechanisms for supplying these IDs to flow data.
Add keywords to the flow data before importing it into LabKey. If your flow data already has keywords for either SpecimenId or for ParticipantId and Timepoint, you can link the flow data into a study without further modification.
You can add keywords to an FCS file in most acquisition software, such as FlowJo. You can also add keywords in the WSP file, which LabKey will pick up. Use this method when you have control over the Flow source files, and if it is convenient to change them before import.
If your flow data does not already contain the appropriate keywords, you can add them after import to LabKey Server. Note that this method does not change the original FCS or WSP files; the additional keyword data resides only inside LabKey Server. Use this method when you cannot change the source flow files, or when it is undesirable to do so.
This method extends the information about the flow samples to include participant ids, visits, etc. It uses a sample type as a mapping table, associating participant/visit metadata with the flow vials.
For example, if you had Flow data like the following:
TubeName | PLATE_ID | FlowJoFileID | and so on... |
---|---|---|---|
B1 | 123 | 4110886493 | ... |
B2 | 345 | 3946880114 | ... |
B3 | 789 | 8693541319 | ... |
You could extend the fields with a sample type like:
TubeName | PTID | Date | Visit |
---|---|---|---|
B1 | 202 | 2009-01-18 | 3 |
B2 | 202 | 2008-11-23 | 2 |
B3 | 202 | 2008-10-04 | 1 |
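The effect of this mapping can be sketched as a simple join on TubeName. The field names follow the example tables above; this in-memory join is purely illustrative of the association LabKey makes, not its internal implementation:

```python
# Flow rows (first table) joined with the sample type mapping (second
# table) on TubeName. Hypothetical in-memory sketch.
flow_rows = [
    {"TubeName": "B1", "PLATE_ID": "123", "FlowJoFileID": "4110886493"},
    {"TubeName": "B2", "PLATE_ID": "345", "FlowJoFileID": "3946880114"},
]
sample_type = {
    "B1": {"PTID": "202", "Date": "2009-01-18", "Visit": 3},
    "B2": {"PTID": "202", "Date": "2008-11-23", "Visit": 2},
}
# Each flow row gains the participant/visit metadata for its tube.
joined = [dict(row, **sample_type[row["TubeName"]]) for row in flow_rows]
```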
You can manually add participant/visit data as part of the link-to-study wizard. For details see Link Assay Data into a Study.
The keywords.jar file attached to this page is a simple command-line tool that dumps the keywords from a set of FCS files. Combined with findstr or grep, it can be used to search a directory of FCS files.
Download the jar file: keywords.jar
The following will show you all the 'interesting' keywords from all the files in the current directory (most of the $ keywords are hidden).
java -jar keywords.jar *.fcs
The following will show the EXPERIMENT ID, Stim, and $Tot keywords for each FCS file. You may need to escape the '$' in Linux command-line shells.
java -jar keywords.jar -k "EXPERIMENT ID,Stim,$Tot" *.fcs
For tabular output suitable for import into Excel or other tools, use the "-t" switch:
java -jar keywords.jar -t -k "EXPERIMENT ID,Stim,$Tot" *.fcs
To see a list of all options:
java -jar keywords.jar --help