This page includes examples of two types of transformation scripts written in Perl. The use cases demonstrated here are modifying run data and modifying run properties.
Modify Run Data
This script populates a new field with data derived from an existing field in the run.
#!/usr/local/bin/perl
use strict;
use warnings;
# Open the run properties file. Run or upload set properties are not used by
# this script. We are only interested in the file paths for the run data and
# the error file.
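# '${runInfo}' is a substitution token; the server is expected to replace it with
# the path to the run properties file before the script runs.
open my $reportProps, '<', '${runInfo}' or die "Can't open run properties file: $!";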
my $transformFileName = "unknown";
my $dataFileName = "unknown";
my %transformFiles;
# Parse the data file properties from reportProps and save the transformed data location
# in a map. It's possible for an assay to have more than one transform data file, although
# most will only have a single one.
while (my $line = <$reportProps>)
{
    chomp($line);
    my @row = split(/\t/, $line);
    if ($row[0] eq 'runDataFile')
    {
        $dataFileName = $row[1];
        # The transformed data location is stored in column 4 (index 3).
        # Map the uploaded data file to its transformed output file.
        $transformFiles{$dataFileName} = $row[3];
    }
}
my $key;
my $value;
my $adjustM1 = 0;
# Read each line from the uploaded data file and insert new data (double the value in the M1 field)
# into an additional column named 'Adjusted M1'. The additional column must already exist in the assay
# definition and be of the correct type.
while (($key, $value) = each(%transformFiles)) {
    open my $dataFile, '<', $key or die "Can't open '$key': $!";
    open my $transformFile, '>', $value or die "Can't open '$value': $!";
    # Copy the header row, appending the name of the new column.
    my $line = <$dataFile>;
    chomp($line);
    $line =~ s/\r//g;
    print $transformFile $line, "\t", "Adjusted M1", "\n";
    while (my $line = <$dataFile>)
    {
        # M1 is read from a fixed position in the line (offset 27, length 3),
        # so this depends on the layout of the uploaded file.
        $adjustM1 = substr($line, 27, 3) * 2;
        chomp($line);
        $line =~ s/\r//g;
        print $transformFile $line, "\t", $adjustM1, "\n";
    }
    close $dataFile;
    close $transformFile;
}
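The script above reads M1 with substr($line, 27, 3), which only works when every row has the same character layout. As an alternative, here is a minimal sketch that locates the column by its header name instead. It assumes the data file is tab-separated with a header row; the helper name append_doubled and the sample rows are hypothetical and not part of the original script.

```perl
use strict;
use warnings;

# Hypothetical helper: given a tab-separated header line and a data line,
# return the data line with a doubled copy of the named column appended.
sub append_doubled {
    my ($header, $line, $column) = @_;
    my @names = split(/\t/, $header);
    # Find the index of the requested column in the header.
    my ($idx) = grep { $names[$_] eq $column } 0 .. $#names;
    die "Column '$column' not found in header" unless defined $idx;
    # A limit of -1 keeps trailing empty fields in the row.
    my @fields = split(/\t/, $line, -1);
    return join("\t", @fields, $fields[$idx] * 2);
}

# Sample rows are made up for illustration.
print append_doubled("Name\tM1\tM2", "s1\t10\t3", "M1"), "\n";
```

Inside the transform loop, a call such as append_doubled($headerLine, $line, 'M1') could replace the substr call, provided the header line is kept in a variable after it is read.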
Modify Run Properties
You can also define a transform script that modifies the run properties, as shown in this example, which parses the short filename out of the full path:
#!/usr/local/bin/perl
use strict;
use warnings;
# Open the run properties file. Run or upload set properties are not used by
# this script. We are only interested in the file paths for the run data and
# the error file.
open my $reportProps, '<', $ARGV[0] or die "Can't open '$ARGV[0]': $!";
my $transformFileName = "unknown";
my $uploadedFile = "unknown";
while (my $line = <$reportProps>)
{
    chomp($line);
    my @row = split(/\t/, $line);
    if ($row[0] eq 'transformedRunPropertiesFile')
    {
        $transformFileName = $row[1];
    }
    if ($row[0] eq 'runDataUploadedFile')
    {
        $uploadedFile = $row[1];
    }
}
if ($transformFileName eq 'unknown')
{
    die "Unable to find the transformed run properties data file";
}
open my $transformFile, '>', $transformFileName or die "Can't open '$transformFileName': $!";
# Parse out just the filename portion. This assumes a Windows-style path
# (backslash separators) and a file extension beginning with ".xls".
my $i = rindex($uploadedFile, "\\") + 1;
my $j = index($uploadedFile, ".xls");
# Add a value for FileID.
print $transformFile "FileID", "\t", substr($uploadedFile, $i, $j-$i), "\n";
close $transformFile;
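The rindex/index approach above relies on backslash separators and an ".xls" extension. If uploads may arrive with forward-slash paths or other extensions, a regex-based variant is one option. The following is a minimal sketch under that assumption; the helper name short_name and the sample paths are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: return the file name without its directory or extension.
sub short_name {
    my ($path) = @_;
    # Drop everything up to the last forward slash or backslash.
    (my $name = $path) =~ s{^.*[\\/]}{};
    # Drop the final extension, if any (for example ".xls" or ".xlsx").
    $name =~ s{\.[^.]+$}{};
    return $name;
}

# Sample paths are made up for illustration.
print short_name('C:\\data\\assays\\plate01.xls'), "\n";   # prints "plate01"
print short_name('/data/assays/plate02.xlsx'), "\n";       # prints "plate02"
```

In the script above, short_name($uploadedFile) could stand in for the substr expression when writing the FileID value.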