New Perl User.

I'm trying to create a hash table from some tab-delimited data stored in a variable $blast_results, with the first column as the key. I then want to compare the values in an array @filenames with the keys in the hash table. If the array value is found among the hash keys, I want to print out a re-ordered version of the matching row from $blast_results; if the array value is not in the hash, I want to print out the value followed by 'No Result Found'.

This is what I have so far; I think the hash table is not being built correctly:

#!/usr/bin/env perl

use strict;
use warnings;
use Data::Dumper;

#create variable to mimic blast results
my $blast_results = "file1.ab1  9   350 0.0 449 418 418 403479  403042  567
file3.ab1   2   833 0.0 895 877 877 3717226 3718105 984";

#create array to mimic filename array
my @filenames = ("file1.ab1", "file2.ab1", "file3.ab1");

#header for file
my $header = "Query\tSeq_length\tTarget found\tScore (Bits)\tExpect(E-value)\tAlign-length\tIdentities\tPositives\tChr\tStart\tEnd\n";

#initialize hash
my %hash;
#split blast results into array
my @row = split(/\s+/, $blast_results);
$hash{$row[0]}=$_;
print $header;
foreach my $file (@filenames){
    ## If this filename has an associated entry in the hash, print it in a re-ordered format
    if(defined($hash{$file})){
        print "$row[0]\t$row[9]\t$row[1]:$row[7]-$row[8]\t$row[2]\t$row[3]\t$row[4]\t$row[5]\t$row[6]\t$row[1]\t$row[7]\t$row[8]\n";
        }
    ## If not, print this.
    else{
        print "$file\t0\tNo Blast Results: Sequencing Rxn Failed\n";
        }
    }
print "-----------------------------------\n";      
print "$blast_results\n"; #test what results look like
print "-----------------------------------\n"; 
print "$row[0]\t$row[1]\n"; #test if array is getting split correctly
print "-----------------------------------\n"; 
print "$filenames[2]\n"; #test if other array present
print "-----------------------------------\n";
print Dumper(\%hash);  #print out hash table

The result from this script is (the @filenames array is not matching the hash and the hash does not contain all of the data):

Query   Seq_length  Target found    Score (Bits)    Expect(E-value) Align-length    Identities  Positives   Chr Start   End
file1.ab1   0   No Blast Results: Sequencing Rxn Failed
file2.ab1   0   No Blast Results: Sequencing Rxn Failed
file3.ab1   0   No Blast Results: Sequencing Rxn Failed
-----------------------------------
file1.ab1   9   350 0.0 449 418 418 403479  403042  567
file3.ab1   2   833 0.0 895 877 877 3717226 3718105 984
-----------------------------------
file1.ab1   9
-----------------------------------
file3.ab1
-----------------------------------
$VAR1 = {
      'file1.ab1' => undef
        };

1 Answer

You're doing something very odd just here:

my %hash;
#split blast results into array
my @row = split(/\s+/, $blast_results);
$hash{$row[0]}=$_;

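Your hash will have a single key, 'file1.ab1', because the whole string is split into one flat list and only $row[0] is ever used as a key. I can't see what $_ is set to at this point, but it's almost certainly not what you want. (From your Dumper output at the end, it's undef.)

At no point do you put anything else into this hash, which is why you get the result you do: the only value stored is undef, so the defined() test fails even for file1.ab1.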
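
If the results really do live in a single string, one way to fill the hash is to split on newlines first and then split each line into fields. This is only a minimal sketch: the helper name parse_blast and the choice to store an array reference per key are my assumptions for illustration, not something taken from your script.

```perl
use strict;
use warnings;

# Hypothetical helper: build a hash keyed on the first column of each line.
sub parse_blast {
    my ($text) = @_;
    my %hash;
    for my $line ( split /\n/, $text ) {
        my @fields = split ' ', $line;     # split on any run of whitespace
        next unless @fields;               # skip blank lines
        $hash{ $fields[0] } = \@fields;    # keep the whole row, key included
    }
    return %hash;
}

# Sample data copied from the question.
my $blast_results = "file1.ab1\t9\t350\t0.0\t449\t418\t418\t403479\t403042\t567\n"
                  . "file3.ab1\t2\t833\t0.0\t895\t877\t877\t3717226\t3718105\t984";

my %hash = parse_blast($blast_results);
print join( ", ", sort keys %hash ), "\n";
```

Keeping the fields as an array reference means the row can be re-ordered later without splitting it a second time.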

Is $blast_results normally read from a file, by any chance?

If so, something like this might work:

use strict;
use warnings;
use Data::Dumper;
my %hash;

while (<DATA>) {
    my @line     = split;
    my $filename = shift(@line);
    $hash{$filename} = join( " ", @line );
}

print Dumper \%hash;


__DATA__
file1.ab1  9   350 0.0 449 418 418 403479  403042  567
file3.ab1   2   833 0.0 895 877 877 3717226 3718105 984

If you're getting results from a command instead - why not try:

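# "-|" reads the command's output; "|-" would open it for writing instead
open( my $blast_results, "-|", "blastn -query test.fa -outfmt 6" )
    or die "Cannot run blastn: $!";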
while ( <$blast_results> ) { 
    #parse line into hash
}
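
Once the hash is populated, the comparison against @filenames could look something like the sketch below. It assumes each hash value is an array reference holding the full row (key included); the helper name report_line is hypothetical, and the column order simply copies the print statement in the question.

```perl
use strict;
use warnings;

# Assumed sample data, copied from the question; each value holds the row's fields.
my %hash = (
    'file1.ab1' => [qw(file1.ab1 9 350 0.0 449 418 418 403479 403042 567)],
    'file3.ab1' => [qw(file3.ab1 2 833 0.0 895 877 877 3717226 3718105 984)],
);
my @filenames = ( "file1.ab1", "file2.ab1", "file3.ab1" );

# Hypothetical helper: format one output line for a filename.
sub report_line {
    my ( $file, $rows ) = @_;

    # exists() checks for the key itself, so it does not depend on the value
    return "$file\t0\tNo Blast Results: Sequencing Rxn Failed"
        unless exists $rows->{$file};

    my @r = @{ $rows->{$file} };

    # Same column order as the print statement in the question
    return join "\t", $r[0], $r[9], "$r[1]:$r[7]-$r[8]", @r[ 2 .. 6 ],
        $r[1], $r[7], $r[8];
}

for my $file (@filenames) {
    my $line = report_line( $file, \%hash );
    print "$line\n";
}
```

Using exists rather than defined tests for the key itself, which is what the "is this filename in the results" question is really asking.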