Analysis Archive Format

Premium Feature — Available with all Premium Editions of LabKey Server. Learn more or contact LabKey.

The LabKey flow module supports importing and exporting analyses as a series of .tsv and supporting files in a zip archive. The format is intended to be simple enough that tools can easily reformat the results of an external analysis engine for import into LabKey. Notably, the analysis definition is not included in the archive; it may be defined elsewhere, for example in a FlowJo workspace gating hierarchy, in an R flowCore script, or by some other software package.

Export an Analysis Archive
Import an Analysis Archive
Analysis Archive Format

Statistics File
Graphs File
Compensation File
Keywords File

Export an Analysis Archive

From the flow Runs or FCSAnalysis grid, you can export the analysis results including the original FCS files, keywords, compensation matrices, and statistics.

Open the analysis and select the runs to export.
Select (Export).
Click the Analysis tab.

Make the selections you need and click Export.

Import an Analysis Archive

To import a flow analysis archive, perhaps after making changes outside the server to add different statistics, graphs, or other information, follow these steps:

In the flow folder, Flow Summary web part, click Upload and Import.
Drag and drop the analysis archive into the upload panel.
Select the archive and click Import Data.
In the popup, confirm that Import External Analysis is selected.

Click Import.

Analysis Archive Format

In brief, the archive format contains the following files:

 <root directory>
 ├─ keywords.tsv
 ├─ statistics.tsv
 │
 ├─ compensation.tsv
 ├─ <comp-matrix01>
 ├─ <comp-matrix02>.xml
 │
 ├─ graphs.tsv
 │
 ├─ <Sample Name 01>/
 │  ├─ <graph01>.png
 │  └─ <graph02>.svg
 │
 └─ <Sample Name 02>/
    ├─ <graph01>.png
    └─ <graph02>.pdf

All tsv files in the archive are optional. The keywords.tsv file lists the keywords for each sample. The statistics.tsv file contains summary statistic values for each sample in the analysis, grouped by population. The graphs.tsv file contains a catalog of graph images for each sample; the images may be in any image format (pdf, png, svg, etc.). The compensation.tsv file contains a catalog of compensation matrices. To keep the directory listing clean, the graphs or compensation matrices may be grouped into sub-directories. For example, the graph images for each sample could be placed into a directory with the same name as the sample.
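As a rough illustration of the layout above, the following sketch assembles an archive in memory with Python's standard zipfile module. The helper name build_archive is hypothetical, and the sketch assumes the tsv files sit at the top level of the zip; check an archive exported from your own server if your layout needs a root directory.

```python
import io
import zipfile


def build_archive(tsv_files, extra_files=None):
    """Return the bytes of a zip archive holding the given files.

    tsv_files: dict mapping archive-relative names (e.g. "statistics.tsv") to text.
    extra_files: dict mapping archive-relative paths to raw bytes (graphs, matrices).
    Assumption: the tsv files are stored at the top level of the zip.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, text in tsv_files.items():
            archive.writestr(name, text)
        for path, data in (extra_files or {}).items():
            archive.writestr(path, data)
    return buffer.getvalue()


archive_bytes = build_archive(
    {"keywords.tsv": "Sample\tKeyword\tValue\nSample1.fcs\t$MODE\tL\n"},
    {"Sample1.fcs/graph01.png": b"placeholder image bytes"},
)
with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
    print(archive.namelist())
```

Because every tsv file is optional, the helper writes only the files it is given.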

ACS Container Format

The ACS container format is not sufficient for direct import to LabKey. The ACS table of contents only includes relationships between files and doesn’t include, for example, the population name and channel/parameter used to calculate a statistic or render a graph. If the ACS ToC could include those missing metadata, the graphs.tsv would be made redundant. The statistics.tsv would still be needed, however.

If you have analyzed results tsv files bundled inside an ACS container, you may be able to extract portions of the files for reformatting into the LabKey flow analysis archive zip format, but you would need to generate the graphs.tsv file manually.

Statistics File

The statistics.tsv file is a tab-separated list of values containing stat names and values. The statistic values may be grouped in a few different ways: (a) no grouping (one statistic value per line), (b) grouped by sample (each column is a new statistic), (c) grouped by sample and population (the current default encoding), or (d) grouped by sample, population, and channel.

Sample Name

Samples are identified by the value in the Sample column, so each sample name must be unique within the analysis. Usually the sample name is just the FCS file name including the ‘.fcs’ extension (e.g., “12345.fcs”).

Population Name

The population column holds a name, unique within the analysis, that identifies the set of events from which the statistics were calculated. A common way to identify the population is to use the gating path with gate names separated by a forward slash. If the population name starts with “(” or contains one of “/”, “{”, or “}”, the population name must be escaped. To escape illegal characters, wrap the entire gate name in curly brackets { }. For example, the population “A/{B/C}” is the sub-population “B/C” of population “A”.
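The escaping rule can be sketched as follows. The function names escape_gate and population_path are hypothetical, and the sketch assumes the rule is applied to each gate name before the names are joined with slashes. It does not address how a gate name that itself contains curly brackets should be read back, since the format description does not say.

```python
def escape_gate(name):
    """Wrap a gate name in curly brackets if it contains characters that need escaping."""
    needs_escape = name.startswith("(") or any(ch in name for ch in "/{}")
    return "{" + name + "}" if needs_escape else name


def population_path(gates):
    """Join gate names into a population name, escaping each gate as needed."""
    return "/".join(escape_gate(gate) for gate in gates)


print(population_path(["A", "B/C"]))  # A/{B/C}
```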

Statistic Name

The statistic is encoded in the column header as statistic(parameter:percentile), where the parameter and percentile portions are included only when the statistic type requires them. The statistic part of the column header may be either the short name (“%P”) or the long name (“Frequency_Of_Parent”). The parameter part is required for the frequency of ancestor statistic and for other channel-based statistics. The frequency of ancestor statistic uses the name of an ancestor population as the parameter value, while the other statistics use a channel name as the parameter value. To represent compensated parameters, the channel name is wrapped in angle brackets, e.g., “<FITC-A>”. The percentile part is required only by the “Percentile” statistic and is an integer in the range of 1-99.

The statistic value is either an integer or a double. Count stats are integer values >= 0. Percentage stats are doubles in the range 0-100. Other stats are doubles. If the statistic is not present for the given sample and population, it is left blank.

Allowed Statistics

Short Name	Long Name	Parameter	Type
Count	Count	n/a	Integer
%	Frequency	n/a	Double (0-100)
%P	Frequency_Of_Parent	n/a	Double (0-100)
%G	Frequency_Of_Grandparent	n/a	Double (0-100)
%of	Frequency_Of_Ancestor	ancestor population name	Double (0-100)
Min	Min	channel name	Double
Max	Max	channel name	Double
Median	Median	channel name	Double
Mean	Mean	channel name	Double
GeomMean	Geometric_Mean	channel name	Double
StdDev	Std_Dev	channel name	Double
rStdDev	Robust_Std_Dev	channel name	Double
MAD	Median_Abs_Dev	channel name	Double
MAD%	Median_Abs_Dev_Percent	channel name	Double (0-100)
CV	CV	channel name	Double
rCV	Robust_CV	channel name	Double
%ile	Percentile	channel name and percentile 1-99	Double (0-100)

For example, the following are valid statistic names:

Count
Robust_CV(<FITC>)
%ile(<Pacific-Blue>:30)
%of(Lymphocytes)
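A tool reading these names might split them into their parts along the following lines. This is a minimal sketch, not the parser LabKey uses: parse_statistic is a hypothetical name, and the code assumes the parameter itself does not end in a colon followed by digits. It handles the full statistic(parameter:percentile) form only; the percentile-only form used in the parameter-grouped layout, such as %ile(30), is not special-cased.

```python
def parse_statistic(name):
    """Split a statistic name into (statistic, parameter, percentile).

    Parts that are absent are returned as None.
    """
    if not name.endswith(")") or "(" not in name:
        return name, None, None
    statistic, _, inner = name[:-1].partition("(")
    parameter, separator, percentile = inner.rpartition(":")
    if separator and percentile.isdecimal():
        value = int(percentile)
        if not 1 <= value <= 99:
            raise ValueError(f"percentile out of range 1-99: {name}")
        return statistic, parameter, value
    return statistic, inner, None


for example in ["Count", "Robust_CV(<FITC>)", "%ile(<Pacific-Blue>:30)", "%of(Lymphocytes)"]:
    print(parse_statistic(example))
```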

Examples

NOTE: The following examples are for illustration purposes only.

No Grouping: One Row Per Sample and Statistic

The required columns are Sample, Population, Statistic, and Value. No extra columns are present. Each statistic is on a new line.

Sample	Population	Statistic	Value
Sample1.fcs	S/L/Lv/3+/4+/IFNg+IL2+	%P	0.85
Sample1.fcs	S/L/Lv/3+/4+/IFNg+IL2-	Count	12001
Sample2.fcs	S/L/Lv/3+/{escaped/slash}	Median(FITC-A)	23,000
Sample2.fcs	S/L/Lv/3+/4+/IFNg+IL2+	%ile(<Pacific-Blue>:30)	0.93

Grouped By Sample

The only required column is Sample. The remaining columns are statistic columns where the column name contains the population name and statistic name separated by a colon.

Sample	S/L/Lv/3+/4+/IFNg+IL2+:Count	S/L/Lv/3+/4+/IFNg+IL2+:%P	S/L/Lv/3+/4+/IFNg+IL2-:%ile(<Pacific-Blue>:30)	S/L/Lv/3+/4+/IFNg+IL2-:%P
Sample1.fcs	12001	0.93	12314	0.24
Sample2.fcs	13056	0.85	13023	0.56
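To show how the column headers in this layout decompose, the sketch below converts a grouped-by-sample table into one row per sample and statistic. The function name ungroup_by_sample is hypothetical. The code splits each header at its first colon, which assumes population names contain no colon; the format description does not say how such a name would be escaped.

```python
import csv
import io


def ungroup_by_sample(tsv_text):
    """Convert grouped-by-sample TSV text into (sample, population, statistic, value) rows."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    rows = []
    for record in reader:
        sample = record["Sample"]
        for column, value in record.items():
            # Skip the Sample column, surplus fields (key None), and blank values.
            if column is None or column == "Sample" or value in (None, ""):
                continue
            # Assumption: the first colon separates population from statistic.
            population, _, statistic = column.partition(":")
            rows.append((sample, population, statistic, value))
    return rows


text = "Sample\tS/L:Count\tS/L:%ile(<Pacific-Blue>:30)\nSample1.fcs\t12001\t12314\n"
for row in ungroup_by_sample(text):
    print(row)
```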

Grouped By Sample and Population

The required columns are Sample and Population. The remaining columns are statistic names including any required parameter part and percentile part.

Sample	Population	Count	%P	Median(FITC-A)	%ile(<Pacific-Blue>:30)
Sample1.fcs	S/L/Lv/3+/4+/IFNg+IL2+	12001	0.93	45223	12314
Sample1.fcs	S/L/Lv/3+/4+/IFNg+IL2-	12312	0.94		12345
Sample2.fcs	S/L/Lv/3+/4+/IFNg+IL2+	13056	0.85		13023
Sample2.fcs	S/L/Lv/{slash/escaped}	3042	0.35	13023
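One way to produce this layout is with csv.DictWriter from the Python standard library, which can leave a cell blank when a statistic is missing for a row. The function name write_grouped_statistics is hypothetical, and the sketch assumes default csv quoting is acceptable to the importer, which has not been verified here.

```python
import csv
import io


def write_grouped_statistics(records, statistics):
    """Return TSV text grouped by sample and population.

    records: list of dicts with "Sample", "Population", and any statistic columns.
    statistics: ordered list of statistic column names to emit.
    """
    out = io.StringIO()
    writer = csv.DictWriter(
        out,
        fieldnames=["Sample", "Population", *statistics],
        delimiter="\t",
        lineterminator="\n",
        restval="",  # statistics missing from a record are left blank
    )
    writer.writeheader()
    for record in records:
        writer.writerow(record)
    return out.getvalue()


print(write_grouped_statistics(
    [
        {"Sample": "Sample1.fcs", "Population": "S/L", "Count": 12001, "%P": 0.93},
        {"Sample": "Sample2.fcs", "Population": "S/L", "Count": 13056},
    ],
    ["Count", "%P"],
))
```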

Grouped By Sample, Population, and Parameter

The required columns are Sample, Population, and Parameter. The remaining columns are statistic names with any required percentile part.

Sample	Population	Parameter	Count	%P	Median	%ile(30)
Sample1.fcs	S/L/Lv/3+/4+/IFNg+IL2+		12001	0.93
Sample1.fcs	S/L/Lv/3+/4+/IFNg+IL2+	FITC-A			45223
Sample1.fcs	S/L/Lv/3+/4+/IFNg+IL2+	<Pacific-Blue>				12314
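Comparing this layout with the previous one suggests that the Parameter value and the statistic column header together make up the full statistic name. The sketch below encodes that reading; it is inferred from the example tables rather than stated by the format description, and full_statistic_name is a hypothetical helper.

```python
def full_statistic_name(statistic, parameter):
    """Combine a statistic column header with the row's Parameter value."""
    if not parameter:
        # Rows with a blank Parameter carry parameterless statistics such as Count.
        return statistic
    if statistic.endswith(")") and "(" in statistic:
        # A header such as "%ile(30)" carries only the percentile part.
        name, _, percentile = statistic[:-1].partition("(")
        return f"{name}({parameter}:{percentile})"
    return f"{statistic}({parameter})"


print(full_statistic_name("Median", "FITC-A"))            # Median(FITC-A)
print(full_statistic_name("%ile(30)", "<Pacific-Blue>"))  # %ile(<Pacific-Blue>:30)
```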

Graphs File

The graphs.tsv file is a catalog of plot images generated by the analysis. It is similar to the statistics file and lists the sample name, plot file name, and plot parameters. Currently, the only plot parameters included in the graphs.tsv are the population and the x and y axes. The graphs.tsv file contains one graph image per row. The population column is encoded in the same manner as in the statistics.tsv file. The graph column is the colon-concatenated x and y axes used to render the plot. Compensated parameters are surrounded with <> angle brackets. (Future formats may split x and y axes into separate columns to ease parsing.) The path is a relative file path to the image (no “.” or “..” is allowed in the path) and the image name is usually just an MD5-sum of the graph bytes.

Multi-sample or multi-plot images are not yet supported.

Sample	Population	Graph	Path
Sample1.fcs	S/L/Lv/3+/4+/IFNg+IL2+	<APC-A>	sample01/graph01.png
Sample1.fcs	S/L/Lv/3+/4+/IFNg+IL2-	SSC-A:<APC-A>	sample01/graph02.png
Sample2.fcs	S/L/Lv/3+/4+/IFNg+IL2+	FSC-H:FSC-A	sample02/graph01.svg
...
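The Graph and Path columns could be checked with something like the sketch below. Both function names are hypothetical. The code assumes axis names contain no colon, and it reads the restriction on “.” and “..” as applying to whole path segments, since the example file names themselves contain a dot before the extension.

```python
def parse_graph_axes(graph):
    """Split the Graph column into (channel, compensated) pairs, one per axis."""
    axes = []
    for axis in graph.split(":"):
        compensated = axis.startswith("<") and axis.endswith(">")
        axes.append((axis[1:-1] if compensated else axis, compensated))
    return axes


def is_valid_archive_path(path):
    """Check that a path is relative and has no empty, "." or ".." segments."""
    if not path or path.startswith("/"):
        return False
    return all(part not in ("", ".", "..") for part in path.split("/"))


print(parse_graph_axes("SSC-A:<APC-A>"))
print(is_valid_archive_path("sample01/graph02.png"))
print(is_valid_archive_path("../graph02.png"))
```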

Compensation File

The compensation.tsv file maps sample names to compensation matrix file paths. The required columns are Sample and Path. The path is a relative file path to the matrix (no “.” or “..” is allowed in the path). The compensation matrix file is in the FlowJo comp matrix file format or a GatingML transforms:spilloverMatrix XML document.

Sample	Path
Sample1.fcs	compensation/matrix1
Sample2.fcs	compensation/matrix2.xml
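Before importing, it can be useful to confirm that every path listed in the catalog actually exists in the archive. The following sketch does that with the standard zipfile and csv modules. The function name missing_catalog_paths is hypothetical, and the code assumes the catalog sits at the top level of the zip, is UTF-8 encoded, and lists paths relative to the archive root.

```python
import csv
import io
import zipfile


def missing_catalog_paths(archive, catalog_name="compensation.tsv"):
    """Return the Path values in a catalog file that are absent from the zip archive."""
    names = set(archive.namelist())
    with archive.open(catalog_name) as handle:
        text = io.TextIOWrapper(handle, encoding="utf-8", newline="")
        reader = csv.DictReader(text, delimiter="\t")
        return [row["Path"] for row in reader if row["Path"] not in names]


# Build a small in-memory archive in which one listed matrix is missing.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr(
        "compensation.tsv",
        "Sample\tPath\n"
        "Sample1.fcs\tcompensation/matrix1\n"
        "Sample2.fcs\tcompensation/matrix2.xml\n",
    )
    archive.writestr("compensation/matrix1", "placeholder matrix")

with zipfile.ZipFile(io.BytesIO(buffer.getvalue())) as archive:
    print(missing_catalog_paths(archive))
```

The same check would apply to graphs.tsv by passing a different catalog_name, since that file also has a Path column.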

Keywords File

The keywords.tsv lists the keyword names and values for each sample. This file has the required columns Sample, Keyword, and Value.

Sample	Keyword	Value
Sample1.fcs	$MODE	L
Sample1.fcs	$DATATYPE	F
...
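A reader for this file might collect the rows into one dictionary of keywords per sample, as sketched below. The function name keywords_by_sample is hypothetical, and the sketch relies on the default quoting behavior of Python's csv module, which may differ from how LabKey treats quoted values.

```python
import csv
import io
from collections import defaultdict


def keywords_by_sample(tsv_text):
    """Group keywords.tsv rows into {sample: {keyword: value}}."""
    grouped = defaultdict(dict)
    for row in csv.DictReader(io.StringIO(tsv_text), delimiter="\t"):
        grouped[row["Sample"]][row["Keyword"]] = row["Value"]
    return dict(grouped)


text = "Sample\tKeyword\tValue\nSample1.fcs\t$MODE\tL\nSample1.fcs\t$DATATYPE\tF\n"
print(keywords_by_sample(text))
```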
