2
votes

The following perl sub is used to store arrays of hashes. Each hash to be stored is first checked for uniqueness using a given key, if a hash exists on the array with the same key value then it's not stored.

How can this perl sub be optimised for speed?

Example use:

my @members;
...
$member= {};
$hash->{'name'}='James';
hpush('name', \@members,$member);

The sub:

sub hpush {
  # push a set of key value pairs onto an array as a hash, if the key doesn't already exist
  if (@_ != 3) {
    print STDERR "hpush requires 3 args, ".@_." given\n";
    return;
  }

  my $uniq = shift;
  my $rarray = shift;
  my $rhash = shift;
  my $hash = ();

  #print "\nHash:\n";
  for my $key ( keys %{$rhash} ) {
    my $valuea = $rhash->{$key};

    #print "key: $key\n";
    #print "key=>value: $key => $valuea\n";
    $hash->{ $key} = $valuea;
  }

  #print "\nCurrent Array:\n";
  for my $node (@{$rarray}) {
    #print "node: $node \n";
    for my $key ( keys %{$node} ) {
      my $valueb = $node->{$key};
      #print "key=>value: $key => $valueb\n";
      if ($key eq $uniq) {
        #print "key=>value: $key => $valueb\n";
        if (($valueb =~ m/^[0-9]+$/) && ($hash->{$key} == $valueb)) {
          #print "Not pushing i $key -> $valueb\n";
          return;
        } elsif ($hash->{$key} eq $valueb) {
          #print "Not pushing s $key -> $valueb\n";
          return;
        }
      }
    }
  }
  push @{$rarray}, $hash;
  #print "Pushed\n";

}

Note that the perl isn't mine and I'm a perl beginner

1

1 Answers

10
votes

This code is rather... not very efficient. First, it copies $rhash to $hash, with a for loop... for some reason. Then it loops through the hash keys, instead of simply using the hash key that it's looking for. Then it does two equivalent checks, apparently some attempt to distinguish numbers from non-numbers and selecting the appropriate check (== or eq). This is all unnecessary.

This code below should be roughly equivalent. I've trimmed it down hard. This should be as fast as it is possible to get it.

use strict;
use warnings;

hpush('name', \@members,$member);

sub hpush { 
    my ($uniq, $rarray, $rhash) = @_;

    for my $node (@{$rarray}) {
        if (exists $node->{$uniq}) {
            return if ($node->{$uniq} eq $rhash->{$uniq});
        }
    }
    push @{$rarray}, $rhash;
}