I am using facet_grid to plot my time series data. The data is:

> dput(mel.ob)
structure(list(timestamp = structure(c(1438450200, 1438536600, 
1438623000, 1438709400, 1438795800, 1438882200, 1438968600, 1439055000, 
1439141400, 1439227800, 1439314200, 1439400600, 1439487000, 1439573400, 
1439659800, 1439746200, 1439832600, 1439919000, 1440005400, 1440091800, 
1440178200, 1440264600, 1440351000, 1440437400, 1440523800, 1440610200, 
1440696600, 1440783000, 1440869400, 1440955800, 1438450200, 1438536600, 
1438623000, 1438709400, 1438795800, 1438882200, 1438968600, 1439055000, 
1439141400, 1439227800, 1439314200, 1439400600, 1439487000, 1439573400, 
1439659800, 1439746200, 1439832600, 1439919000, 1440005400, 1440091800, 
1440178200, 1440264600, 1440351000, 1440437400, 1440523800, 1440610200, 
1440696600, 1440783000, 1440869400, 1440955800, 1438450200, 1438536600, 
1438623000, 1438709400, 1438795800, 1438882200, 1438968600, 1439055000, 
1439141400, 1439227800, 1439314200, 1439400600, 1439487000, 1439573400, 
1439659800, 1439746200, 1439832600, 1439919000, 1440005400, 1440091800, 
1440178200, 1440264600, 1440351000, 1440437400, 1440523800, 1440610200, 
1440696600, 1440783000, 1440869400, 1440955800), tzone = "Asia/Kolkata", tclass = c("POSIXct", 
"POSIXt"), class = c("POSIXct", "POSIXt")), variable = structure(c(1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L), .Label = c("power", "hpanom", 
"lofanom"), class = "factor"), value = c(172.016104060554, 176.777480085691, 
184.018404140075, 175.561151940955, 182.52419107159, 175.216664665937, 
183.751597162088, 198.844153259955, 199.803173963254, 237.508030500042, 
285.079349749013, 188.377162014776, 452.238984323895, 304.084650686052, 
304.976941431231, 168.37194477982, 221.142072661718, 285.339264474312, 
243.126828978721, 526.165040140682, 583.26909929249, 549.145660841621, 
195.02748863608, 200.088289825199, 249.279407493724, 410.968041439378, 
368.949028046264, 361.528117646774, 394.092273548577, 439.027137154341, 
0.190453461153838, 0.738170304350057, 0.359277651161948, 0.383363598976019, 
0.357189854750563, 0.357189854750563, 0, 0.464407941461156, 0.842226206120729, 
0.928056670115148, 0.939184368487052, 0.174074829364281, 0.999333003990622, 
0.97052094947291, 0.957985395010343, 0.620128340774666, 0.971218262867733, 
0.918581736843709, 0.898790693128374, 0.992626480647862, 0.996099376857962, 
0.995219939905799, 0.864283999224187, 0.903098686478643, 0.929581519648184, 
0.98981186152571, 0.986686711459769, 0.989957071504958, 0.984688509451126, 
0.986320878558335, 0.02, 0.1, 0.03, 0.02, 0.07, 0.02, 0, 0.04, 
0.05, 0.45, 0.11, 0.01, 1, 0.1, 0.13, 0.03, 0.72, 0.13, 0.59, 
0.54, 0.72, 0.52, 0.08, 0.07, 0.14, 0.2, 0.15, 0.15, 0.17, 0.18
)), row.names = c(NA, -90L), .Names = c("timestamp", "variable", 
"value"), class = "data.frame")

I am using the following code for plotting:

library(ggplot2)

f <- ggplot(data=mel.ob,aes(x=timestamp,y=value,ymin=0,ymax=value))+facet_grid(variable~., scales = "free_y")+
  theme(axis.title.x=element_blank(),axis.title.y=element_blank())
# Each layer gets its own data subset; newer ggplot2 releases no longer accept subset=.()
f1 <- f + geom_linerange(data=subset(mel.ob, variable=="hpanom"))
f2 <- f1 + geom_linerange(data=subset(mel.ob, variable=="lofanom"))
f3 <- f2 + geom_line(data=subset(mel.ob, variable=="power"))
f3

This produces the following plot: [image]

My data set contains data from 1 August to 30 August, but the plot shows data from 2 August to 31 August. Why and how is my data set getting shifted by one day? What am I doing wrong?

1 Answer

The plot isn't actually shifting your dates. Your dates are in POSIXct format, which includes hours, minutes and seconds, and ggplot2 puts each point at the exact hour/minute/second of each timestamp. All of your times are at 23:00 (in the Asia/Kolkata time zone). So the lines that look like they're at, say, August 3 are actually at 23:00 on August 2. Likewise, the leftmost line is at 23:00 on August 1.
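To check this yourself, you can print the timestamps with their time of day. This is a minimal sketch that rebuilds only the first three values from the question's dput() output; the names `ts` and `local_time` are just for illustration.

```r
# First three timestamps from the question's dput() output,
# rebuilt with the same time zone attribute.
ts <- as.POSIXct(c(1438450200, 1438536600, 1438623000),
                 origin = "1970-01-01", tz = "Asia/Kolkata")

# Format with both date and time so the hour is visible
local_time <- format(ts, "%Y-%m-%d %H:%M")
print(local_time)
# "2015-08-01 23:00" "2015-08-02 23:00" "2015-08-03 23:00"
```

Running the same format() call on mel.ob$timestamp should show 23:00 for every row.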

If you widen the graph and look closely, you'll see that each vertical line near a major grid line actually sits slightly (i.e., 1 hour) to the left of it, because the major grid lines are at midnight of each day.

You can have points plotted by day (without regard to hour) by using as.Date(timestamp) in your code. Another option is to keep the date/time format, but place the major grid lines exactly where you want them. For example, here's how you would place grid lines each week starting on August 1 at 23:00, but set the labels to be just the date without the time:

# Weekly breaks starting at the first timestamp (August 1, 23:00)
wk <- seq(min(mel.ob$timestamp), max(mel.ob$timestamp), by="1 week")

# Label each break with its date only, as text
f3 + scale_x_datetime(breaks=wk, labels=format(wk, "%Y-%m-%d"))
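The as.Date() option could look roughly like the sketch below. It uses a small stand-in data frame (`df`, with three values rounded from the question's data) rather than the full mel.ob, and the `day` column name is an assumption for illustration. Note that as.Date() on a POSIXct value converts in UTC unless you pass `tz`; for this data both give the same dates, but passing the local zone is the safer habit.

```r
# Stand-in for mel.ob: three timestamps and rounded "power" values from the question
df <- data.frame(
  timestamp = as.POSIXct(c(1438450200, 1438536600, 1438623000),
                         origin = "1970-01-01", tz = "Asia/Kolkata"),
  value = c(172.0, 176.8, 184.0)
)

# Drop the time of day, keeping the local calendar date
df$day <- as.Date(df$timestamp, tz = "Asia/Kolkata")
print(df$day)

# Plot against the Date column if ggplot2 is installed
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(df, aes(x = day, y = value)) +
    geom_line() +
    scale_x_date(date_labels = "%b %d")
}
```

With a Date on the x axis, each point lands exactly on its day's grid line instead of an hour before the next one.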

Just as an additional note, date/time formats in R are really just numeric variables with a date/time class added. Class POSIXct is the number of seconds since 1970-01-01 00:00:00 UTC, while class Date is the number of days since 1970-01-01. ggplot2 plots these numerical values, but with breaks and labels appropriate for the corresponding date class. You can see the underlying numeric values with as.numeric(mel.ob$timestamp) and as.numeric(as.Date(mel.ob$timestamp)).
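As a small illustration of those underlying numbers, here is a sketch using the first timestamp from the question, written out as a local time; `secs` and `days` are illustrative names.

```r
# First timestamp from the question, as a local time in Asia/Kolkata
ts <- as.POSIXct("2015-08-01 23:00:00", tz = "Asia/Kolkata")

# POSIXct: seconds since 1970-01-01 00:00:00 UTC
secs <- as.numeric(ts)
print(secs)   # 1438450200

# Date: whole days since 1970-01-01
days <- as.numeric(as.Date(ts, tz = "Asia/Kolkata"))
print(days)   # 16648
```

The first value matches the first entry in the dput() output above.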