0
votes

How do I plot these time samples when the date string in column 3 contains a letter (W)?

And how do I control the xtics increment when the axis uses a time format?

A few lines of the data:

France,FR,2020-W09,118,3318,67012883,4.95128675481698,3.55635925256178,TESSy
France,FR,2020-W10,996,11101,67012883,16.5654714482288,8.97216466984956,TESSy
France,FR,2020-W11,4297,29623,67012883,44.2049329529667,14.5056206326166,TESSy
France,FR,2020-W12,10595,73235,67012883,109.28495644636,14.4671263740015,TESSy
France,FR,2020-W13,24156,122870,67012883,183.35280396756,19.6598030438675,TESSy
France,FR,2020-W14,30304,127029,67012883,189.55907329043,23.8559698966378,TESSy
France,FR,2020-W15,24925,140316,67012883,209.386604065371,17.7634767239659,TESSy

My script:

#https://www.ecdc.europa.eu/en/publications-data/covid-19-testing

#Data (105,77K) here :
system("wget https://opendata.ecdc.europa.eu/covid19/testing/csv -P $PWD -O testing.csv")

reset
set term wxt font ',11' size 1200,800

set datafile separator ","
set grid
#set key at screen 0.9, 0.9


timefmt = "%Y-%s%W"
set xdata time
set xtics format timefmt timedate rotate by -45
SECPERWEEK = 3600.*24.*7.
Y_W(col) = timecolumn(col,timefmt) + SECPERWEEK * (strcol(col)[2:3] - 1)

plot '< grep France testing.csv' u (Y_W(3)):4 notitle w l

Thank you

2 Answers

3
votes

Here is how I would do it. It may not be obvious and may look a bit complicated, but it is a gnuplot-only solution. Since I do not run Linux, I do not have grep, so I define myFilter() in gnuplot itself, which is platform-independent. Every time this filter gets a hit, the counter t is increased by one, which has the advantage that the data can contain an interleaved mix of countries. I assume that is what grep would allow as well. The only assumption is that the week numbers are already in ascending order; the script does not sort them.

Here, I think it is not necessary to treat the x-axis as time data. The situation would be different if calendar weeks were missing and you wanted to keep a corresponding gap for them. myOffset=0 and myEvery=2 control which x-tic labels are displayed: a label is shown only when (t+myOffset) is a multiple of myEvery. There is certainly room for improvement, and I'm sure there are other solutions, so take this as a starting point.

Code:

### plot filtered data with custom xtics
reset session

$Data <<EOD
France,FR,2020-W09,118,3318,67012883,4.95128675481698,3.55635925256178,TESSy
France,FR,2020-W10,996,11101,67012883,16.5654714482288,8.97216466984956,TESSy
France,FR,2020-W11,4297,29623,67012883,44.2049329529667,14.5056206326166,TESSy
Luxembourg,LU,2020-W11,11,222,33333333,44.4444444444444,55.5555555555555,fghij
Luxembourg,LU,2020-W12,11,222,33333333,44.4444444444444,55.5555555555555,fghij
France,FR,2020-W12,10595,73235,67012883,109.28495644636,14.4671263740015,TESSy
France,FR,2020-W13,24156,122870,67012883,183.35280396756,19.6598030438675,TESSy
Belgium,BE,2020-W13,1111,222222,33333333,444.44444444444,55.5555555555555,abcde
Belgium,BE,2020-W14,1111,222222,33333333,444.44444444444,55.5555555555555,abcde
France,FR,2020-W14,30304,127029,67012883,189.55907329043,23.8559698966378,TESSy
France,FR,2020-W15,24925,140316,67012883,209.386604065371,17.7634767239659,TESSy
EOD

set datafile separator comma
set datafile missing NaN
set xtics rotate by -45

myFilter(dcol,fcol,key) = strcol(fcol) eq key ? (t=t+1, column(dcol)) : NaN
myXtic(col) = sprintf("%s",(t+myOffset)% myEvery ? "" : strcol(col))
myKey = 'France'
myOffset = 0
myEvery = 2

plot t=1 $Data u (t):(myFilter(4,1,myKey)):xtic(myXtic(3)) w lp pt 7 title myKey
### end of code
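
If some calendar weeks are missing and the gap should stay visible, one option is to use the week number itself as the x value instead of the counter t. The following is only a sketch under stated assumptions: all rows belong to a single year, the week digits sit at characters 7 to 8 of column 3, and the helper names weekOf(), weekNo() and myValue() are made up for this illustration. The datablock reuses rows from the question with W11 and W12 left out on purpose.

```gnuplot
### sketch: keep gaps for missing weeks by using the week number as x
reset session

# Rows taken from the question; W11 and W12 are deliberately omitted here
$Data <<EOD
France,FR,2020-W09,118,3318,67012883,4.95128675481698,3.55635925256178,TESSy
France,FR,2020-W10,996,11101,67012883,16.5654714482288,8.97216466984956,TESSy
France,FR,2020-W13,24156,122870,67012883,183.35280396756,19.6598030438675,TESSy
France,FR,2020-W14,30304,127029,67012883,189.55907329043,23.8559698966378,TESSy
EOD

set datafile separator comma
set datafile missing NaN

myKey = 'France'
# hypothetical helpers: week digits are characters 7..8 of e.g. "2020-W13"
weekOf(s)   = int(s[7:8])
weekNo(col) = weekOf(strcol(col))
myValue(dcol,fcol,key) = strcol(fcol) eq key ? column(dcol) : NaN

set xtics 1                        # one tic per calendar week
set xlabel "calendar week (2020)"

plot $Data u (weekNo(3)):(myValue(4,1,myKey)) w lp pt 7 title myKey
### end of sketch
```

Because x is now the week number, W10 and W13 end up three units apart on the axis. This simple version would not work across a year boundary.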


2
votes

The basic error is that the Y_W function looks at the wrong character positions for the week number. In a string such as 2020-W09, the week digits are characters 7 to 8, not 2 to 3.

Y_W(col) = timecolumn(col,"%Y") + SECPERWEEK * (strcol(col)[7:8])
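
To see the difference between the two substrings, and to cover the second part of the question (the xtics increment), here is a small self-contained sketch. It uses a datablock with rows from the question instead of the downloaded file. The output format "%Y-%m-%d" and the two-week tic spacing are arbitrary choices for illustration. On a time axis the xtics increment is given in seconds.

```gnuplot
### sketch: corrected Y_W() plus a tic every two weeks
reset session

s = "2020-W09"
print s[2:3]      # characters 2..3 are part of the year
print s[7:8]      # characters 7..8 are the week digits

$Data <<EOD
France,FR,2020-W09,118,3318,67012883,4.95128675481698,3.55635925256178,TESSy
France,FR,2020-W10,996,11101,67012883,16.5654714482288,8.97216466984956,TESSy
France,FR,2020-W11,4297,29623,67012883,44.2049329529667,14.5056206326166,TESSy
France,FR,2020-W12,10595,73235,67012883,109.28495644636,14.4671263740015,TESSy
France,FR,2020-W13,24156,122870,67012883,183.35280396756,19.6598030438675,TESSy
EOD

set datafile separator ","
SECPERWEEK = 3600.*24.*7.

set xdata time
set format x "%Y-%m-%d"        # arbitrary label format for this sketch
set xtics rotate by -45
set xtics 2*SECPERWEEK         # increment in seconds: one tic every two weeks

# start of the year plus (week number) weeks, as in the line above
Y_W(col) = timecolumn(col,"%Y") + SECPERWEEK * (strcol(col)[7:8])

plot $Data u (Y_W(3)):4 notitle w l
### end of sketch
```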

As explained by theozh in this answer, gnuplot uses American week numbers by default, not ISO 8601, so I have not addressed that here.
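
If ISO 8601 week dates are needed, newer gnuplot versions may help. The sketch below assumes gnuplot 5.4.2 or later, where, as far as I know, a function weekdate_iso(year, week, day) is available; please check the documentation of your version before relying on it. The helper name isoWeekStart() is made up for this illustration.

```gnuplot
### sketch: convert "2020-W09" to the Monday of that ISO week
# Assumption: gnuplot >= 5.4.2 provides weekdate_iso(year, week, day)
isoWeekStart(s) = weekdate_iso(int(s[1:4]), int(s[7:8]), 1)   # day 1 = Monday

print strftime("%Y-%m-%d", isoWeekStart("2020-W09"))
### end of sketch
```

In a plot command the same helper could be applied as isoWeekStart(strcol(3)) for the x value, with set xdata time in effect.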