1
votes

R can be amazingly powerful and frustrating at the same time, which makes teaching R to non-statisticians (business students in my case) rather challenging. Let me illustrate this with a simple task.

Let's say you are working with a monthly time series dataset; business data are often plotted as monthly time series. We would like to plot the data so that the x-axis shows a combination of year and month. For instance, January 2017 could be shown as 2017-01. This should be straightforward with the plot command, but it is not.

Data Generation

Let's illustrate this with an example. I'll generate a random monthly time series of 120 observations, representing 10 years of information starting in January 2007 and ending in December 2016. Here's the code.

set.seed(1234)
x <- rnorm(120)
d <- 0.07
y <- -cumsum(x + d)

Since we have not declared the data as a time series, plotting it with the plot command does not return the intended labels for the x-axis. See the code and the chart below.

plot(y, type="l")

(Chart: y plotted as a line against the observation index, 1 to 120.)

Now there should be an option in the plot or the plot.ts command to display a time-series-specific x-axis. I couldn't find one, so here's the workaround.

  1. Declare the data set to be time series.
  2. Use tsp and seq to generate the required x-axis labels.
  3. Plot the chart but suppress x-axis.
  4. Use the axis command to add the custom x-axis labels.
  5. Add an extra step to draw a vertical line at 2012.

Here's the code.

# The 120 monthly observations run from January 2007 to December 2016, so only
# the start is given; adding end=c(2017, 12) would recycle y to 132 values.
my.ts <- ts(y, start=c(2007, 1), frequency=12)
tsp <- tsp(my.ts)
dates <- seq(as.Date("2007-01-01"), by = "month", along.with = my.ts)
plot(my.ts, xaxt = "n", main= "Plotting outcome over time",
     ylab="outcome", xlab="time")
axis(1, at = seq(tsp[1], tsp[2], along.with = my.ts), labels = format(dates, "%Y-%m"))
abline(v=2012, col="blue", lty=2, lwd=2)

The result is charted below.

(Chart: the series with year-month labels such as 2007-01 on the x-axis and a dashed blue vertical line at 2012.)

This is a workable solution for most data scientists. But if your audience comprises business students or professionals, there are too many lines of code to write.

Question: Is it possible to plot a time series variable (object) using the plot command with the format option controlling how the x-axis will be displayed?

--

3
I can sort of see your issue, but maybe step away from ts. You could use ggplot2 or plot.xts (better plotting than ts), but you will still need to write about the same amount of code. I have the feeling you are looking for something like Tableau or Spotfire for quick drag and drop plotting and have all the visual gimmicks available that management likes to see. But I might be mistaken in my assumptions here. - phiver
Have moved comment to an answer. - G. Grothendieck
This is basically a reasonable question (and you have two good answers - "find a package that does what you want" or "write a wrapper function that does what you want"). I would upvote this question if you could make your tone a little more descriptive ("how can I make it easier for my students?") and a little less editorializing (especially the title: "should be simpler") ... - Ben Bolker

3 Answers

4
votes

I think the question boils down to wanting a pre-written function for the custom axis you have in mind. Note that plot(my.ts) does give a plot with ticks every month and labels every year, which to me looks better than the plot shown in the question. But if you want a custom axis, R is a programming language, so you can certainly write a simple function for that, and from then on it's just a matter of calling that function.

For example, to get you started, here is a function that accepts a frequency 12 ts object. It draws an x-axis with a tick for each month, labelling the years and every every'th month, where the every argument can be a divisor of 12. The default is 3, so a label is shown for every third month (except Jan, which is shown as the year). len is the number of letters of the month name shown and can be 1, 2 or 3: 1 shows Jul as J, 2 as Ju and 3 as Jul. The default is 1.

xaxis12 <- function(ser, every = 3, len = 1) {
  # unlabelled tick at every observation (i.e. every month)
  tt <- time(ser)
  axis(side = 1, at = tt, labels = FALSE)

  # abbreviated month labels at every every'th month, skipping January
  is.every <- cycle(ser) %in% seq(1, 12, every)[-1]
  month.labs <- substr(month.abb[cycle(ser)][is.every], 1, len)
  axis(side = 1, at = tt[is.every], labels = month.labs,
    cex.axis = 0.7, tcl = -0.75)

  # two-digit year labels (e.g. '07) with longer ticks at each January
  is.jan <- cycle(ser) == 1
  year.labs <- sprintf("'%02d", as.integer(tt)[is.jan] %% 100)
  axis(side = 1, at = tt[is.jan], labels = year.labs,
    cex.axis = 0.7, tcl = -1)
}

# test
plot(my.ts, xaxt = "n")
xaxis12(my.ts)

(Screenshot of the resulting plot.)
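
Along the same lines, here is a minimal sketch of a wrapper that takes a format string, which is closer to what the question asks for. The name plot_ts_fmt and its fmt and every arguments are made up for this illustration; they are not part of base R or of any package. It assumes a monthly (frequency 12) ts object and uses only base R calls.

```r
# Hypothetical helper (not part of base R): plot a monthly ts object and
# label the x-axis with dates formatted according to `fmt`.
plot_ts_fmt <- function(ser, fmt = "%Y-%m", every = 12, ...) {
  stopifnot(is.ts(ser), frequency(ser) == 12)
  tt <- as.numeric(time(ser))

  # Rebuild calendar dates from the start year and month of the series
  first <- as.Date(sprintf("%d-%02d-01", start(ser)[1], start(ser)[2]))
  dates <- seq(first, by = "month", length.out = length(tt))

  # Label every `every`-th observation, beginning with the first one
  idx <- seq(1, length(tt), by = every)
  labs <- format(dates[idx], format = fmt)

  plot(ser, xaxt = "n", ...)
  axis(side = 1, at = tt[idx], labels = labs)

  # Return the tick positions and labels invisibly, for inspection
  invisible(data.frame(at = tt[idx], label = labs))
}

# Same data as in the question
set.seed(1234)
y <- -cumsum(rnorm(120) + 0.07)
my.ts <- ts(y, start = c(2007, 1), frequency = 12)

plot_ts_fmt(my.ts, fmt = "%Y-%m", main = "Plotting outcome over time",
            ylab = "outcome", xlab = "time")
abline(v = 2012, col = "blue", lty = 2, lwd = 2)
```

Once such a function is defined (or sourced from a course script), students only need the one plot_ts_fmt call; lowering every, for example to 6, gives denser labels at the cost of possible overlap.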

3
votes

Gabor is spot-on. It really just depends on what you want, and what you are willing to dig up or alter. Here is a simple alternative using a newer and less-well-known package that is excellent for plotting xts types:

## alternative
library(rtsplot)            # load the plotting package
library(xts)                # load the xts time-series container package
xx <- as.xts(my.ts)         # create an xts object
rtsplot(xx, main= "Plotting outcome over time")
rtsplot.x.highlight(xx, which(index(xx)=="Jan 2012"), 1)

As you can see, the plotting is then just two calls, since rtsplot has lots of nice defaults. Below is a screenshot, as I am lazy; the plot window does of course not have a title bar...

(Screenshot of the resulting rtsplot chart.)

3
votes

The ggplot2 package has the scale_x_date function for plotting time series with the desired scales, labels, breaks and limits (day, month, year formats). All you need is a Date class object and the values y. For example:

dates <- seq(as.Date("01-01-2007", format = "%d-%m-%Y"), length.out = 120, by = "month")
df <- data.frame(dates, y)

# use the format you need in your plot using scale_x_date
library(ggplot2)
ggplot(df, aes(dates, y)) +
  geom_line() +
  scale_x_date(date_labels = "%b-%Y") +
  geom_vline(xintercept = as.Date("01-01-2012", format = "%d-%m-%Y"),
             linetype = 'dotted', color = 'blue')

(Chart: y vs dates.)