1
votes

I have a dataset where the x-axis is a date, but it is only mm-dd (no year). I am using year as a group variable as I am trying to show a YOY change on the same plot. The x-axis labeling is too crowded and I'd like to limit the tick mark labels so that not every date is shown. This could be every other day, every third day, one day a week -- any of these would work.

I have tried a few solutions but cannot get them to work, presumably because my x-axis is a character, not a Date. (Before arriving at this mm-dd solution for the x-axis, I tried plotting the x-axis with a yyyy-mm-dd Date format, but could not figure out how to get ggplot2 to ignore the "yyyy" part.)

An example:

myDF <- data.frame(
        myDate = format(seq(as.Date("2014-02-01"), 
                 length=28, by="1 day"), "%m-%d"),
        myVar = sample(100,28),
        myGroup = sample(2,28,TRUE)
                   )
head(myDF)

  myDate myVar myGroup
  02-01    87       1
  02-02    34       1
  02-03    48       2
  02-04    59       1
  02-05    98       1
  02-06    18       2

ggplot(myDF, aes(myDate, myVar, group=myGroup, color=as.factor(myGroup))) + 
geom_line() 

Everything is correct here except the tick labels are too squished.

I have tried:

ggplot(myDF, aes(myDate, myVar, group=myGroup, color=as.factor(myGroup))) +   
geom_line() + scale_x_discrete(breaks = c(1,10,20))

This appears to confuse ggplot since the labels disappear completely. (Same result with a seq() attempt.)

I have also tried:

ggplot(myDF, aes(myDate, myVar, group=myGroup, color=as.factor(myGroup))) + 
geom_line() + scale_x_date(breaks = "1 week")

This throws an error re: myDate not being a Date.

I've already switched the format of the tick labels to be vertical, but it is still too crowded on the plot.

Any tips would be very much appreciated. Thanks!

1
Arrrgh. It's the old as.data.frame(cbind(.)) error yet again. cbind coerces everything to character. And if you want Dates, then don't coerce to character with format. – IRTFM
The as.data.frame(cbind(.)) is not in my original code, it is just for the quick and dirty example. I admit in my question that the mm-dd variable is no longer a Date, and I explained how I have arrived at that format. I do not claim it is the correct way to do things -- that's why I'm here. – Samantha
If you had taken the time to read the help text for ?scale_x_date, a function which you used, then you would have found several examples on how to "control the format of the labels, and the frequency of [...] tickmarks". – Henrik
I did read ?scale_x_date. It takes Dates. My mm-dd is not a Date. If you are saying that mm-dd can be a Date, please share your knowledge. All of my research has led me to believe that at a minimum, ymd is required for Dates. If you are instead saying that I can use scale_x_date() with a yyyy-mm-dd Date var so that I have two lines in the same horizontal space, effectively ignoring yyyy, please share your knowledge of that as well. It would be relevant in that scenario only with additional tweaking of some other aesthetic I do not understand. – Samantha

1 Answer

6
votes

If you want to keep the myDate variable without the year (as a character), one solution is to use scale_x_discrete() and pass the myDate values you want labelled as the breaks= argument. On a discrete scale the breaks have to match the axis values themselves (such as "02-01"), which is why breaks = c(1,10,20) left the axis without labels. In this example I selected every 7th value.

ggplot(myDF, aes(myDate, myVar, group=myGroup, color=as.factor(myGroup))) +   
      geom_line() + scale_x_discrete(breaks = unique(myDF$myDate)[seq(1,28,7)])
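If you would rather keep a real Date on the x-axis, a common workaround is to move every observation into the same placeholder year and hide that year in the labels. The following is only a sketch under assumptions: myDF2, realDate, plotDate and myYear are hypothetical names that are not in the question, the data are made up, the year 2000 is an arbitrary choice, and the date_breaks= and date_labels= arguments require ggplot2 2.0.0 or later.

```r
library(ggplot2)

# Hypothetical data: the same 28-day window in two different years.
set.seed(1)
realDate <- c(seq(as.Date("2013-02-01"), length.out = 28, by = "1 day"),
              seq(as.Date("2014-02-01"), length.out = 28, by = "1 day"))
myDF2 <- data.frame(
  realDate = realDate,
  myVar    = sample(100, 56, replace = TRUE),
  myYear   = format(realDate, "%Y")   # grouping variable, one line per year
)

# Re-date every row into one placeholder year so the two years overlap
# on the x-axis. 2000 is arbitrary (and a leap year, so 02-29 would parse).
myDF2$plotDate <- as.Date(paste0("2000-", format(myDF2$realDate, "%m-%d")))

# plotDate is a true Date, so scale_x_date() can thin the ticks;
# date_labels = "%m-%d" keeps the placeholder year off the axis.
p <- ggplot(myDF2, aes(plotDate, myVar, group = myYear, color = myYear)) +
  geom_line() +
  scale_x_date(date_breaks = "1 week", date_labels = "%m-%d")
print(p)
```

Compared with the character axis, this keeps the spacing between days proportional even if some dates are missing, at the cost of carrying an extra column.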