I'm looking for help writing a Perl program that takes an input file and performs manipulations based on follow-up commands. I'm a beginning Perl student, so please don't get too advanced in your suggestions. The structure that I have so far is a main program and 4 subs.
I'm having trouble with two parts:
Writing the portion of the main segment that creates a unique record for each line from the input file (which is fixed width format). I think this should be done with substr but I don't know much more of how this should be structured. Unpack is beyond the scope of my learning so far.
One of the functions called in the main program is a "distance" sub, which will calculate the distance between atoms. I'm thinking this should be a for loop inside a for loop. Any thoughts on what approach I should take?
The records should store an array of atom records (one record/atom per newline):
• The atom's serial number, 5 digits. (cols 7 - 11)
• The three-letter name of the amino acid to which it belongs (cols 18 - 20)
• The atom's three orthogonal coordinates (x, y, z), each a real number in decimal notation (cols 31 - 54)
For X in Angstroms cols. 31-38
For Y in Angstroms cols. 39-46
For Z in Angstroms cols. 47-54
• The atom's one- or two-letter element name (e.g. C, O, N, Na) (cols 77-78 )
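A minimal sketch of how the column list above could map onto substr calls. substr counts from zero, so column 7 becomes offset 6, column 18 becomes offset 17, and so on; the length is (last column - first column + 1). The sub name parse_atom_line, the separate x/y/z keys, and the sprintf call that builds a test line are my own assumptions for illustration (the sprintf is only there to guarantee the columns line up; in the real program the line would come from the input file).

```perl
use strict;
use warnings;

# Build one fixed-width test line so that every field sits in the
# columns listed above. This layout is an assumption made for the sketch.
my $line = sprintf(
    "%-6s%5d %-4s%1s%3s %1s%4d%1s%3s%8.3f%8.3f%8.3f%6.2f%6.2f%10s%2s",
    'ATOM', 4743, ' CG', '', 'GLN', 'A', 704, '', '',
    19.896, 32.017, 54.717, 1.00, 66.44, '', 'C'
);

# Hypothetical helper: turn one fixed-width line into a record (hash ref).
sub parse_atom_line {
    my ($line) = @_;

    # Skip anything that does not start with ATOM or HETATM.
    return unless $line =~ /^(?:ATOM|HETATM)/;

    # substr(STRING, OFFSET, LENGTH) with OFFSET = first column - 1.
    my %record = (
        serialnumber => substr($line, 6,  5),   # cols 7 - 11
        aminoacid    => substr($line, 17, 3),   # cols 18 - 20
        x            => substr($line, 30, 8),   # cols 31 - 38
        y            => substr($line, 38, 8),   # cols 39 - 46
        z            => substr($line, 46, 8),   # cols 47 - 54
        element      => substr($line, 76, 2),   # cols 77 - 78
    );

    # Fixed-width fields are padded with spaces, so trim both ends.
    for my $key (keys %record) {
        $record{$key} =~ s/^\s+|\s+$//g;
    }

    return \%record;
}

my $rec = parse_atom_line($line);
print "$rec->{serialnumber} $rec->{aminoacid} ",
      "$rec->{x} $rec->{y} $rec->{z} $rec->{element}\n";
```

Note that substr warns and returns undef if a line is shorter than the offset asked for, so truncated test lines would need to be padded or checked with length first.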
sub Distance
# take an array of atom records and return the max distance
# between all pairs of atoms in that array. (cols 31-54)
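One way to approach the description above is indeed a for loop inside a for loop: the outer loop picks the first atom, the inner loop picks every atom after it, so each pair is visited exactly once. The following is only a sketch. It assumes each record is a hash reference with separate x, y and z keys (not the single coordinates field used elsewhere in this post), and the name max_distance is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical sub: takes a list of atom records (hash refs with x, y, z)
# and returns the largest distance between any two of them.
sub max_distance {
    my @atoms = @_;
    my $max   = 0;

    # Outer loop: first atom of the pair.
    for my $i (0 .. $#atoms - 1) {
        # Inner loop: only atoms after $i, so no pair is checked twice.
        for my $j ($i + 1 .. $#atoms) {
            my $dx = $atoms[$i]{x} - $atoms[$j]{x};
            my $dy = $atoms[$i]{y} - $atoms[$j]{y};
            my $dz = $atoms[$i]{z} - $atoms[$j]{z};

            # Straight-line distance in three dimensions.
            my $dist = sqrt($dx * $dx + $dy * $dy + $dz * $dz);

            $max = $dist if $dist > $max;
        }
    }

    return $max;
}

# Coordinates taken from the three sample lines in this post.
my @atoms = (
    { x => 19.896, y => 32.017, z => 54.717 },
    { x => 19.589, y => 30.757, z => 55.525 },
    { x => 18.801, y => 29.892, z => 55.098 },
);

printf "Max distance: %.3f Angstroms\n", max_distance(@atoms);
```

With fewer than two atoms neither loop body runs and the sub returns 0, which may or may not be the behaviour you want.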
Here is sample text from an input file.
# truncating for testing purposes. Actual data is approx. 100 columns
# and starts with ATOM or HETATM
__DATA__
ATOM 4743 CG GLN A 704 19.896 32.017 54.717 1.00 66.44 C
ATOM 4744 CD GLN A 704 19.589 30.757 55.525 1.00 73.28 C
ATOM 4745 OE1 GLN A 704 18.801 29.892 55.098 1.00 75.91 O
Here is what I have so far for the main program and the sub that makes records. I hate to be lame, but I don't have anything to show for the Distance sub yet, so don't worry about giving code; any suggestions on how to approach it would be very much appreciated.
use warnings;
use strict;

my @fields;
my @recs;

while ( <DATA> ) {
    chomp;
    @fields = split(/\s+/);
    push @recs, makeRecord(@fields);
}

for (my $i = 0; $i < @recs; $i++) {
    printRec( $recs[$i] );
}

my %command_table = (
    freq    => \&freq,
    length  => \&length,
    density => \&density,
    help    => \&help,
    quit    => \&quit
);

print "Enter a command: ";
while ( <STDIN> ) {
    chomp;
    my @line = split( /\s+/);
    my $command = shift @line;
    # Anchor every alternative so that input such as "lengthy" is rejected
    if ($command !~ /^(?:freq|density|length|help|quit)$/ ) {
        print "Command must be: freq, length, density, help or quit\n";
    }
    else {
        $command_table{$command}->();
    }
    print "Enter a command: ";
}

sub makeRecord
# Read the entire line and make records from the lines that contain the
# word ATOM or HETATM in the first column. Not sure how to do this:
{
    my %record =
    (
        serialnumber => shift,
        aminoacid    => shift,
        coordinates  => shift,
        element      => [ @_ ]
    );
    return \%record;
}
unpack out of your scope of learning when you're making use of a dispatch table? – Zaid
The thing with unpack is that you have to get the template right. You are given the start and end columns for each field you have to read. The @ code jumps to the desired starting position, except it's zero-based, so to jump to column 7, for example, you can use @6, and then to jump to column 18, you can use @17. That way you don't have to count the spaces you are skipping. – Narveson
The documentation for unpack is at perldoc.perl.org/functions/unpack.html. – Narveson
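For anyone curious what the template described in the comments might look like, here is a rough sketch. The template string is my own reading of the column list in the question (@N jumps to zero-based offset N, A followed by a number reads that many characters), and the sprintf call only builds a test line whose fields are guaranteed to sit in those columns; neither comes from the commenters.

```perl
use strict;
use warnings;

# Build one fixed-width test line (assumed layout, for illustration only).
my $line = sprintf(
    "%-6s%5d %-4s%1s%3s %1s%4d%1s%3s%8.3f%8.3f%8.3f%6.2f%6.2f%10s%2s",
    'ATOM', 4743, ' CG', '', 'GLN', 'A', 704, '', '',
    19.896, 32.017, 54.717, 1.00, 66.44, '', 'C'
);

# @6 A5  : jump to offset 6  (column 7),  read 5 characters (serial number)
# @17 A3 : jump to offset 17 (column 18), read 3 characters (amino acid)
# @30 A8 A8 A8 : jump to offset 30 (column 31), read x, y and z
# @76 A2 : jump to offset 76 (column 77), read 2 characters (element)
my ($serial, $amino, $x, $y, $z, $element) =
    unpack('@6 A5 @17 A3 @30 A8 A8 A8 @76 A2', $line);

# The A format drops trailing spaces but keeps leading ones, so trim those.
s/^\s+// for ($serial, $amino, $x, $y, $z, $element);

print "$serial $amino $x $y $z $element\n";
```

One caveat: unpack dies if an @ position lies beyond the end of the string, so short lines would have to be filtered out or padded before this is called.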