Example Transformation Scripts (perl)

2024-03-28

This page includes examples of two types of transformation script written in Perl: one that modifies run data and one that modifies run properties.

Modify Run Data

This script populates a new field with data derived from an existing field in the run.

#!/usr/local/bin/perl
use strict;
use warnings;


# Open the run properties file. Run or upload set properties are not used by
# this script. We are only interested in the file paths for the run data and
# the error file.

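# '${runInfo}' is a substitution token that is replaced with the path to the
# run properties file before the script runs.
open my $reportProps, '<', '${runInfo}' or die "Can't open run properties file: $!";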

my $transformFileName = "unknown";
my $dataFileName = "unknown";

my %transformFiles;

# Parse the data file properties from reportProps and save the transformed data location
# in a map. It's possible for an assay to have more than one transform data file, although
# most will only have a single one.

while (my $line = <$reportProps>)
{
    chomp($line);
    my @row = split(/\t/, $line);

    if ($row[0] eq 'runDataFile')
    {
        $dataFileName = $row[1];

        # transformed data location is stored in column 4

        $transformFiles{$dataFileName} = $row[3];
    }
}

my $key;
my $value;
my $adjustM1 = 0;

# Read each line from the uploaded data file and insert new data (double the value in the M1 field)
# into an additional column named 'Adjusted M1'. The additional column must already exist in the assay
# definition and be of the correct type.

while (($key, $value) = each(%transformFiles)) {

    open my $dataFile, '<', $key or die "Can't open '$key': $!";
    open my $transformFile, '>', $value or die "Can't open '$value': $!";

    # Copy the header row and append the name of the new column.
    my $line = <$dataFile>;
    chomp($line);
    $line =~ s/\r//g;
    print $transformFile $line, "\t", "Adjusted M1", "\n";

    while (my $line = <$dataFile>)
    {
        # M1 is read from a fixed position: 3 characters starting at offset 27.
        $adjustM1 = substr($line, 27, 3) * 2;
        chomp($line);
        $line =~ s/\r//g;
        print $transformFile $line, "\t", $adjustM1, "\n";
    }

    close $dataFile;
    close $transformFile;
}
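
The script above reads M1 from a fixed character offset, which only works when every row has exactly the same layout. As a minimal sketch of an alternative, the following looks the column up by name in the header row. It assumes the data file is tab-delimited and has a column named 'M1'. The helper names column_index and add_adjusted and the sample rows are hypothetical, made up for illustration only.

```perl
#!/usr/local/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the zero-based index of a named column in a
# tab-delimited header line, or -1 if the column is absent.
sub column_index {
    my ($header, $name) = @_;
    $header =~ s/[\r\n]+\z//;               # strip any trailing CR/LF
    my @cols = split /\t/, $header, -1;
    for my $i (0 .. $#cols) {
        return $i if $cols[$i] eq $name;
    }
    return -1;
}

# Hypothetical helper: return one data row with a doubled copy of the value
# at column $idx appended as a new last column.
sub add_adjusted {
    my ($line, $idx) = @_;
    $line =~ s/[\r\n]+\z//;
    my @fields = split /\t/, $line, -1;
    my $m1 = $fields[$idx];
    # Leave the new cell empty when M1 is missing or not a plain number.
    my $adjusted = (defined $m1 && $m1 =~ /^-?\d+(?:\.\d+)?$/) ? $m1 * 2 : '';
    return join("\t", $line, $adjusted);
}

# Made-up sample rows, for illustration only.
my $header = "ParticipantID\tDate\tM1\r\n";
my $idx = column_index($header, 'M1');
die "No M1 column found in header" if $idx < 0;

print add_adjusted("P100\t2024-01-05\t12.5\r\n", $idx), "\n";
```

In a real transform script these helpers would replace the substr call inside the inner loop, with the header line read from the data file rather than hard-coded.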

Modify Run Properties

You can also define a transform script that modifies the run properties, as shown in this example, which parses the short filename out of the full path:

#!/usr/local/bin/perl
use strict;
use warnings;

# Open the run properties file. Run or upload set properties are not used by
# this script. We are only interested in the file paths for the run data and
# the error file.

open my $reportProps, '<', $ARGV[0] or die "Can't open run properties file: $!";

my $transformFileName = "unknown";
my $uploadedFile = "unknown";

while (my $line = <$reportProps>)
{
    chomp($line);
    my @row = split(/\t/, $line);

    if ($row[0] eq 'transformedRunPropertiesFile')
    {
        $transformFileName = $row[1];
    }
    if ($row[0] eq 'runDataUploadedFile')
    {
        $uploadedFile = $row[1];
    }
}

if ($transformFileName eq 'unknown')
{
    die "Unable to find the transformed run properties data file";
}

open my $transformFile, '>', $transformFileName or die "Can't open '$transformFileName': $!";

# Parse out just the filename portion: the text between the last backslash
# and the ".xls" extension.
my $i = rindex($uploadedFile, "\\") + 1;
my $j = index($uploadedFile, ".xls");

# Add a value for FileID

print $transformFile "FileID", "\t", substr($uploadedFile, $i, $j-$i), "\n";
close $transformFile;
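
The rindex/index approach above assumes a backslash-separated path that contains ".xls". As a hedged sketch, the filename could instead be extracted with two substitutions that also cope with forward slashes and with files that have no .xls extension. The helper name short_file_name and the sample paths are hypothetical and not part of the original script.

```perl
#!/usr/local/bin/perl
use strict;
use warnings;

# Hypothetical helper: reduce a full path to its base name without a trailing
# .xls or .xlsx extension. Accepts both "\" and "/" as path separators.
sub short_file_name {
    my ($path) = @_;
    (my $name = $path) =~ s{^.*[\\/]}{};    # drop everything up to the last separator
    $name =~ s{\.xlsx?\z}{}i;               # drop a trailing .xls or .xlsx, if present
    return $name;
}

# Made-up sample paths, for illustration only.
print short_file_name('C:\\uploads\\assay\\run42.xls'), "\n";   # prints run42
print short_file_name('/data/uploads/run42.XLSX'), "\n";        # prints run42
print short_file_name('run42.tsv'), "\n";                       # prints run42.tsv
```

With this helper, the FileID line could be written as print $transformFile "FileID", "\t", short_file_name($uploadedFile), "\n"; and a path without the expected extension passes through unchanged.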
