
I'm trying to plot data as a function of time using gnuplot. I am having an issue with the time data (x-axis) being incorrect. This issue is similar to the one posted here, but that post does not appear to resolve my problem.

To start, here is a subset of the file "data.txt" that shows the error:

996,1.81014336621038094E+07,1.04721577434964254E+07
997,1.81073887058396861E+07,1.04688883975542113E+07
998,1.81123550412347727E+07,1.04660263576711770E+07
999,1.81165058190760165E+07,1.04628236696091276E+07
1000,1.81200135215993598E+07,1.04593579882744774E+07
1001,1.81230027468293682E+07,1.04556943748914227E+07
1002,1.81256090021481551E+07,1.04518411259850748E+07
1003,1.81280483217409961E+07,1.04478383895292878E+07
1004,1.81311435732491128E+07,1.04439282290004119E+07

The first column corresponds to a Julian date, and columns 2 and 3 contain data. To plot the data, I am using the following interactive gnuplot commands:

set datafile separator ","
set terminal png
set xdata time
set timefmt "%j"
set output "test_figure.png"
plot "data.txt" using 1:2 with lines lw 2 lt 1

This produces the following plot: [Figure with incorrect time series]

I get the correct figure if I add leading zeros to the first column of the first four lines of data.txt (this is the only change):

0996,1.81014336621038094E+07,1.04721577434964254E+07
0997,1.81073887058396861E+07,1.04688883975542113E+07
0998,1.81123550412347727E+07,1.04660263576711770E+07
0999,1.81165058190760165E+07,1.04628236696091276E+07
1000,1.81200135215993598E+07,1.04593579882744774E+07
1001,1.81230027468293682E+07,1.04556943748914227E+07
1002,1.81256090021481551E+07,1.04518411259850748E+07
1003,1.81280483217409961E+07,1.04478383895292878E+07
1004,1.81311435732491128E+07,1.04439282290004119E+07

[Figure with correct time series]

Is there a way that I can write the gnuplot code to not require the leading zeros? The actual dataset has Julian dates 1 to 10,000, and if I write the data with leading zeros to fill 5 digits (i.e., 00001), I get an "illegal day of year" error.

I did notice that the x-axis tick labels differ between the two plots, which probably hints at the source of the issue, but I can't determine what is going wrong.

Note: This "error" only appears when I go from 999 to 1000. Going from Julian date 9 to 10 does not have this out-of-order issue.

Thanks ahead of time for the help!


2 Answers

0
votes

I don't know what was causing the original issue where the data was getting reordered when plotting, but I realized that I was interpreting the data incorrectly. The first column wasn't actually a Julian date, but was instead the number of hours since the start date. So, a value of 25 wasn't 25 days into the data but was actually 1 day and 1 hour into the data.

Replacing the first column (counter) with "day-hour":

41-12,1.81014336621038094E+07,1.04721577434964254E+07
41-13,1.81073887058396861E+07,1.04688883975542113E+07
41-14,1.81123550412347727E+07,1.04660263576711770E+07
41-15,1.81165058190760165E+07,1.04628236696091276E+07
41-16,1.81200135215993598E+07,1.04593579882744774E+07
41-17,1.81230027468293682E+07,1.04556943748914227E+07
41-18,1.81256090021481551E+07,1.04518411259850748E+07
41-19,1.81280483217409961E+07,1.04478383895292878E+07
41-20,1.81311435732491128E+07,1.04439282290004119E+07

and then using set timefmt "%j-%H" allowed me to obtain the correct plot.
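The rewrite of the first column can be scripted rather than done by hand. Below is a minimal Python sketch, assuming the counter is a whole number of hours and that the mapping is a plain division by 24, which is what the sample above suggests (996 becomes 41-12). The function names hours_to_day_hour and convert_line are made up for this illustration.

```python
def hours_to_day_hour(hours):
    """Turn an hour counter into a 'day-hour' label, e.g. 996 -> '41-12'."""
    # Assumption: whole days and leftover hours come from dividing by 24.
    days, hour_of_day = divmod(int(hours), 24)
    return f"{days}-{hour_of_day}"


def convert_line(line):
    """Rewrite only the first comma-separated field of one data line."""
    counter, rest = line.split(",", 1)
    return f"{hours_to_day_hour(counter)},{rest}"


# Two lines taken from the sample data in the question.
sample = [
    "996,1.81014336621038094E+07,1.04721577434964254E+07",
    "1004,1.81311435732491128E+07,1.04439282290004119E+07",
]
for line in sample:
    print(convert_line(line))
```

Note that counters below 24 map to day 0 under this scheme; whether gnuplot accepts that value with %j is not something the sample data shows.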

0
votes

Let's first improve the x-axis labels in order to understand what happens:

set format x "%Y-%m-%d"

Then we increase the resolution of the resulting png and plot with linespoints instead of lines only. The script now looks like this:

set datafile separator ","
set terminal png size 1200,600
set xdata time
set timefmt "%j"
set output "test_figure.png"
set format x "%Y-%m-%d"
plot "data.txt" using 1:2 with linespoints lw 2 lt 1

This is the result:

[Figure: x-axis timestamps with year-month-day]

There are some points in April 1970 and some points in September 1972. The time format specifier %j means the day of the year. The points in April 1970 correspond to day 100, and the points in September 1972 correspond to days around 997. Both are counted from January 1, 1970, the Unix epoch.

This means that gnuplot interprets the values 996 ... 999 as days counted from January 1, 1970. The values 1000 ... 1004 are (incorrectly) read as 100 days counted from January 1, 1970; the fourth digit is ignored (!).

If you add a leading 0 in front of the values 996 ... 999, only the first three digits (099) are used, so they are now read as day 99, which makes things worse.
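The dates quoted above can be checked with a few lines of Python. This is only a model of the behaviour described in this answer, not gnuplot's actual parser: it assumes that at most the first three characters of the field are used and that day 1 corresponds to January 1, 1970. The name modelled_date is made up for this sketch.

```python
from datetime import date, timedelta


def modelled_date(field):
    """Date this model assigns to a first-column value read with %j."""
    # Assumption: only the first three characters are read as the day number.
    day = int(field[:3])
    # Assumption: day 1 is January 1, 1970, so the offset is day - 1.
    return date(1970, 1, 1) + timedelta(days=day - 1)


for field in ["996", "999", "1000", "1004", "0996"]:
    print(field, "->", modelled_date(field))
```

Under these assumptions, 996 lands in September 1972, while 1000 through 1004 all collapse onto the same day in April 1970 and 0996 lands one day before them.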

I'll stop here, as you have already figured out how to read the data :)