Ok, so here's the deal. I am starting my undergrad thesis in computational materials science and I am trying to put some scripts together to help prepare for data analysis.
I have been preparing a GAWK script that will basically take some data (arranged in 4 columns) and snag two of them and plot them in GNUPLOT. To achieve this end, I read in the data file which has multiple timesteps and their associated data in it, split the file into individual .dat files for each timestep.
From there I just generate a basic input script for GNUPLOT and plot each timestep as they occur in the data file.
The issue is that for some reason all of the plots generated are exactly the same plot (in this case always the first time step) but they are saved as the correct timestep.
I have already gone through and tracked each variable/filename throughout the script and finally determined that the issue is somehow with GNUPLOT being called from a script. I took out my system command and wrote a short bash script that calls gnuplot from a for loop:
#!/bin/bash
for file in ./*gnu
do
gnuplot $file
done
And that still causes the same issue where all of the plots are the same. I then just ran the command gnuplot *gnu from the command line in the directory containing the .gnu files and it worked.
I guess I am just wondering if there is some buffer that I need to be flushing or if I am just missing something?
The GAWK script is given below. I am new to this still, so if you wish to comment on the script with some constructive criticism I would greatly appreciate that as well.
#!/opt/local/bin/gawk -v inputf=$1 -f
# Write gnuplot files and plot RDF data
function plot_rdf(timestep, Load_RDF_dat)
{
# Set number of digits in filenames to 6 so data is organized
if (timestep < 10){
pad_timestep="00000"timestep;
}
else if (timestep < 100){
pad_timestep="0000"timestep;
}
else if (timestep < 1000){
pad_timestep="000"timestep;
}
else if (timestep < 10000){
pad_timestep="00"timestep;
}
else if (timestep < 100000){
pad_timestep="0"timestep;
}
else{
pad_timestep=timestep;
}
# Give output filenames
gnu_file="plot_RDF_"pad_timestep".gnu";
png_file="RDF_"pad_timestep".png";
# Create input files for gnuplot
print "set output \""png_file"\"" >> gnu_file;
print "set terminal png" >> gnu_file;
print "plot './"Load_RDF_dat"' u 1:2" >> gnu_file;
close(gnu_file);
system("gnuplot "gnu_file);
}
# Main part of script
{
# Parse the RDF data and save it to GNUPLOT readable files
while(getline < inputf){
if ($1 == "#"){
# skips the three commented header lines
next;
}
else if (NF == 2){
timestep=$1;
bin_num=$2;
print "Reading timestep "timestep;
RDF_dat="RDF_"timestep".dat";
next;
}
else if (NF == 4){
print $2" "$3 >> RDF_dat;
if ($1 == bin_num){
plot_rdf(timestep, RDF_dat);
close(RDF_dat);
}
next;
}
}
close(inputf);
close(RDF_dat);
}
A snippet of the data file I'm reading in is:
# Time-averaged data for fix rdf
# TimeStep Number-of-rows
# Row c_allrdf[1] c_allrdf[2] c_allrdf[3]
500 100
1 0.005 0 0
2 0.015 0 0
3 0.025 0 0
4 0.035 0 0
5 0.045 0 0
6 0.055 1.16597 0.00133333
7 0.065 2.08865 0.00466667
8 0.075 1.56958 0.008
9 0.085 0.733433 0.01
10 0.095 0.587288 0.012
600 100
1 0.005 0 0
2 0.015 0 0
3 0.025 2.79219 0.000666667
4 0.035 2.86766 0.002
5 0.045 0 0.002
6 0.055 0.582985 0.00266667
7 0.065 2.08865 0.006
8 0.075 0.62783 0.00733333
9 0.085 0.488955 0.00866667
10 0.095 1.17458 0.0126667
There is usually 100 sets of data in each timestep section, but I figured I'd shorten here just so you'd get the idea.