I have a dataframe, dates, which contains lists of dates. I'm trying to produce a plot of one of the lists, dates$t2, binned into weeks. (i.e. how many of the dates fall within each consecutive week.)

I'd like each binned week to have a label in a format such as 01-Nov, 08-Nov, ..., with the range limited to the minimum and maximum (earliest and latest) dates in the list.

So far I have created a dataframe with the list of dates I want to bin, dates$t2, and a series of columns which (I assume!) I'll need to create my x-axis labels:

library(lubridate)
library(ggplot2)

# Build the data frame from the raw date strings (day/month/year)
dates <- data.frame(t1 = c("24/07/2015", "12/08/2015", "10/08/2015", "05/09/2015", "20/09/2015", 
"23/09/2015", "07/09/2015", "04/11/2015", "03/11/2015", "14/10/2015", 
"08/10/2015", "14/09/2015", "02/10/2015", "28/09/2015", "23/10/2015", 
"02/11/2015", "28/11/2015", "06/12/2015", "10/12/2015", "08/12/2015", 
"07/12/2015", "03/12/2015", "21/11/2015", "02/12/2015", "12/12/2015", 
"28/12/2015", "13/01/2016", "14/01/2016", "03/01/2016", "24/01/2016"
), stringsAsFactors = FALSE)

dates$t1 <- dmy(dates$t1)
dates$t2 <- dates$t1 + years(1)
dates$day <- day(dates$t2)
dates$week <- isoweek(dates$t2)
dates$month <- month(dates$t2, label = TRUE)
dates$year <- year(dates$t2)

dates <- na.omit(dates)

So far, I think, so good. Dataframe looks like this:

> head(dates)
          t1         t2 day week month year
1 2015-07-24 2016-07-24  24   29   Jul 2016
2 2015-08-12 2016-08-12  12   32   Aug 2016
3 2015-08-10 2016-08-10  10   32   Aug 2016
4 2015-09-05 2016-09-05   5   36   Sep 2016
5 2015-09-20 2016-09-20  20   38   Sep 2016
6 2015-09-23 2016-09-23  23   38   Sep 2016

> str(dates)
'data.frame':   30 obs. of  6 variables:
 $ t1   : Date, format: "2015-07-24" "2015-08-12" "2015-08-10" "2015-09-05" ...
 $ t2   : Date, format: "2016-07-24" "2016-08-12" "2016-08-10" "2016-09-05" ...
 $ day  : int  24 12 10 5 20 23 7 4 3 14 ...
 $ week : int  29 32 32 36 38 38 36 44 44 41 ...
 $ month: Ord.factor w/ 12 levels "Jan"<"Feb"<"Mar"<..: 7 8 8 9 9 9 9 11 11 10 ...
 $ year : num  2016 2016 2016 2016 2016 ...
 - attr(*, "na.action")=Class 'omit'  Named int [1:18] 30 32 33 34 35 36 37 38 39 40 ...
  .. ..- attr(*, "names")= chr [1:18] "30" "32" "33" "34" ...

However, I'm stuck when it comes to binning and plotting. This is as far as I've got:

ggplot(dates, aes(x = week)) +
  geom_bar()

Is anyone able to advise how to:

  1. Replace the week numbers on the x-axis with a day-month format (e.g. 01-Nov)?
  2. Tell ggplot that the week numbers span two different years, e.g. that weeks 1-10 belong at the start of 2017 and not, as currently displayed, at the start of 2016?
  3. Set the x-axis limits to run from the earliest to the latest date in the list, rather than over a full year?

I'm still very new to R, any help is much appreciated, thanks!

1 Answer

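You can build a week calendar that holds the first date of each week of each year. It has to cover the years that appear in dates$t2 (2016 and 2017 here):

library(dplyr)
library(lubridate)

# One row per (year, week), with the earliest date that falls in that week
week_calendar <- data.frame(date = seq(as.Date("2016-01-01"), as.Date("2017-12-31"), by = "day")) %>%
  mutate(week = isoweek(date), year = year(date)) %>%
  group_by(year, week) %>%
  summarise(weekdate = min(date))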
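
One caveat when pairing isoweek() with year(): around New Year the ISO week can belong to a different year than the calendar date. The small sketch below only illustrates that with three dates chosen for the purpose; isoyear() is the lubridate function that returns the year the ISO week belongs to.

```r
library(lubridate)

# Three consecutive days around New Year (chosen for illustration)
d <- as.Date(c("2016-12-31", "2017-01-01", "2017-01-02"))

boundary <- data.frame(
  date    = d,
  week    = isoweek(d),  # ISO 8601 week number
  year    = year(d),     # calendar year of the date
  isoyear = isoyear(d)   # year that the ISO week belongs to
)

boundary
```

Here 2017-01-01 is a Sunday, so it still falls in ISO week 52 of 2016 even though its calendar year is 2017. If that matters for your data, grouping by isoyear instead of year is one option, but the same column would then be needed in dates as well so the merge keys match.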

Then merge it with your data frame. merge() joins on the columns the two tables share, here year and week:

dates <- merge(dates,week_calendar)
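
It is worth checking that the merge did not silently drop rows: by default merge() is an inner join, so any (year, week) pair missing from the calendar disappears from the result. A minimal sketch with made-up toy tables (obs and lookup are hypothetical names, not objects from the question):

```r
# Toy data (hypothetical): three observations keyed by year and ISO week
obs <- data.frame(year = c(2016, 2016, 2017), week = c(29, 32, 2))

# Toy lookup table that only covers 2016, so the 2017 row has no match
lookup <- data.frame(
  year     = c(2016, 2016),
  week     = c(29, 32),
  weekdate = as.Date(c("2016-07-18", "2016-08-08"))
)

# Default merge() is an inner join on the shared columns (year, week)
inner <- merge(obs, lookup)

# all.x = TRUE keeps every row of obs and fills weekdate with NA where unmatched
left <- merge(obs, lookup, all.x = TRUE)

nrow(inner)                # number of rows that found a match
sum(is.na(left$weekdate))  # number of rows an inner join would have dropped
```

Comparing nrow() before and after the merge on your real data gives the same check.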

After that, you can plot with:

library(ggplot2)
ggplot(dates, aes(x = weekdate)) +
  geom_bar()+
  scale_x_date(date_breaks = "1 week", date_labels = "%d-%b")+
  theme(axis.text.x = element_text(angle = 90))
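
As an alternative sketch that skips the calendar and the merge, lubridate's floor_date() can round each date down to the Monday of its week, and the axis limits can be set explicitly from the data. This is only an illustration on a handful of the dates from the question; the names t2, weekly and week_start, the bar width of 6 days and the 4-day padding are assumptions made for the example.

```r
library(lubridate)
library(ggplot2)

# A few of the dates from the question, already shifted by one year
t2 <- as.Date(c("2016-07-24", "2016-08-12", "2016-08-10",
                "2016-12-28", "2017-01-03", "2017-01-13"))

# Round each date down to the Monday that starts its week
# (week_start = 1 means Monday, matching the ISO week convention)
weekly <- data.frame(week_start = floor_date(t2, unit = "week", week_start = 1))

p <- ggplot(weekly, aes(x = week_start)) +
  geom_bar(width = 6) +  # bar width in days, slightly narrower than a week
  scale_x_date(
    date_breaks = "1 week",
    date_labels = "%d-%b",
    # Pad the limits by a few days so the first and last bars are not clipped
    limits = range(weekly$week_start) + c(-4, 4)
  ) +
  theme(axis.text.x = element_text(angle = 90))

p
```

Because the x values are real dates, weeks in 2017 sort after weeks in 2016 without any extra handling, and the limits follow the earliest and latest week in the data rather than a full year.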