Example Workflow: LabKey and PacBio

This topic describes a LabKey Server workflow for managing samples and sequencing results generated from a PacBio Sequencer. This workflow lets you:
  • import sequencing data from a PacBio Sequencer, including sample sheets, barcode identifiers, and table structures
  • link sample sets and runs, and organize them with other information such as IDs, types, and tags
  • keep different pool data (such as from different lanes) separate
  • store multiple fastq files per barcode identifier, and accept fastq files without requiring their read counts to match
  • browse and export the sequence files

Set Up a Dashboard

First we will set up a Genotyping dashboard, and import some sample data.

  • Download pacbio.lists.zip. This is a list archive you will import; do not unzip it.
  • Create a new project of type Genotyping. Use the default settings.
  • In the Lists webpart click Manage Lists.
  • Click Import List Archive.
  • Choose or Browse to the pacbio.lists.zip archive and click Import List Archive.
  • You will see the lists imported.
  • (Optional) Explore the list design of the samples list to notice that the fivemid and threemid columns are configured as lookups into the mids list.
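
Because the list archive must stay zipped for import, it can be handy to look inside it without extracting anything. The following is a minimal sketch using only the Python standard library; the function name `list_archive_members` and the assumed download location (the current directory) are illustrative and not part of LabKey Server.

```python
import zipfile
from pathlib import Path


def list_archive_members(archive_path):
    """Return the names of the files inside a zip archive without extracting it."""
    with zipfile.ZipFile(archive_path) as archive:
        return sorted(archive.namelist())


if __name__ == "__main__":
    # Assumed download location; adjust to wherever you saved the archive.
    archive = Path("pacbio.lists.zip")
    if archive.exists():
        for name in list_archive_members(archive):
            print(name)
    else:
        print(f"{archive} not found")
```

The archive itself is left untouched, so it can still be passed to Import List Archive afterwards.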

Next, configure the necessary queries and load reference sequences:

  • Click Genotyping Dashboard.
  • Under Settings, click Admin.
  • Under Configure Genotyping Queries, click Configure next to Runs:
    • Schema = lists
    • Query = runs
    • View = [default View]
    • Click Submit.
  • Click Configure next to Samples:
    • Schema = lists
    • Query = samples
    • View = [default View]
    • Click Submit.
  • Click Submit again to save the query configuration.

Load some sample data:

  • Download FilesFromPacBioInstrument.zip (the PacBio sample data) and unzip it to the location of your choice.
  • Click Genotyping Dashboard.
  • Under Tasks, click Import Run.
  • Drag and drop the pacbio8 folder (located in the FilesFromPacBioInstrument package you downloaded) into the upload area.
  • Navigate to and select a SampleSheet.csv file. You can find one in each pool of fastq files in the sample data you just uploaded. For instance: pacbio8/pool1_barcoded_fastqs/SampleSheet.csv
  • In the pop-up, scroll down to select Import PacBio Reads and click Import.
  • Select the Associated Run, and optionally provide a FASTQ Prefix.
  • Click Import Reads.
  • Evaluate any errors received. For example, the error "Failure to send success notification, but job has completed successfully" can be disregarded.
  • Click Genotyping Dashboard when the import is complete.
  • Click View Runs and then the run number to see the (small) results from this sample import.
  • Click a Sample ID to see the samples associated with this run.
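
The import steps above ask you to pick one SampleSheet.csv per pool. To see which pools in the unzipped sample data contain a sample sheet, and how many fastq files sit beside each one, a small standard-library sketch like the following may help. It assumes the layout shown in the example path (pacbio8/pool1_barcoded_fastqs/SampleSheet.csv) and assumes that fastq files end in .fastq or .fastq.gz; the function name `find_sample_sheets` is hypothetical and not part of LabKey Server.

```python
from pathlib import Path


def find_sample_sheets(root):
    """Map each folder holding a SampleSheet.csv to the number of fastq files beside it."""
    summary = {}
    for sheet in sorted(Path(root).rglob("SampleSheet.csv")):
        pool_dir = sheet.parent
        # Assumption: fastq files end in .fastq or .fastq.gz.
        fastq_count = sum(
            1
            for f in pool_dir.iterdir()
            if f.is_file() and f.name.endswith((".fastq", ".fastq.gz"))
        )
        summary[pool_dir.name] = fastq_count
    return summary


if __name__ == "__main__":
    # Assumed location of the unzipped sample data; adjust as needed.
    for pool, count in find_sample_sheets("pacbio8").items():
        print(f"{pool}: {count} fastq file(s)")
```

The summary is keyed by folder name only, so two pool folders with the same name at different depths would overwrite each other; for the flat pool layout assumed here that should not arise.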

