I want to plot a week-view calendar/timetable/timesheet/waterfall with ggplot. Sample data look as follows (sampleData.csv):

date, start, end, duration, name, color 
2016-08-04, 00:00:00, 08:00:00, 8.00, idle, #00000000
2016-08-04, 08:00:00, 10:00:00, 2.00, Coding, red
2016-08-04, 10:00:00, 14:00:00, 4.00, idle, #00000000
2016-08-04, 14:00:00, 17:30:00, 3.50, Laundry, green
2016-08-04, 17:30:00, 20:00:00, 2.50, Cooking, blue
2016-08-04, 20:00:00, 24:00:00, 4.00, idle, #00000000
2016-08-05, 00:00:00, 06:00:00, 6.00, idle, #00000000
2016-08-05, 06:00:00, 09:00:00, 3.00, Cooking, blue
2016-08-05, 09:00:00, 10:00:00, 1.00, Laundry, green
2016-08-05, 10:00:00, 12:30:00, 2.50, idle, #00000000
2016-08-05, 12:30:00, 16:00:00, 3.50, Coding, red
2016-08-05, 16:00:00, 22:00:00, 6.00, Basketball, brown
2016-08-05, 22:00:00, 24:00:00, 2.00, idle, #00000000

Currently, I am able to plot them like this:

[Figure: correctly ordered, but no legend and ugly colors.]

However, there are 2 disadvantages:

  1. I cannot get a legend, because nothing is mapped to fill, so ggplot has no grouping to build a legend from.
  2. Colors have to be given row by row. Hard-coding the style into the data is awkward.

To enable the legend and to leave the coloring to ggplot, I used the aes(fill = name) mapping. However, ggplot then sorted the bars at each date by their "name" values, which ruined my timetable:

[Figure: with legend, but ill-ordered.]

Please note in the sample data that:

  • I want to get rid of the "color" column, and let ggplot automatically assign colors to each name.
  • At each date, the sum of "duration" is 24 (hours). This is how I position bars at specific Ys for now. I am open to suggestions about how to "float" bars above the X-axis.
  • At each date, multiple entries with identical "name" field can exist. For example, the "idle" entries starting at 00:00, 10:00 and 20:00. This is one reason why I don't want the bars automatically sorted by the field "name".
  • Between different dates, the order of entries with different names can differ, which is another reason NOT to sort automatically.
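
As a minimal sketch of the positioning described above, the top and bottom edge of each bar can be derived from duration with a per-date cumulative sum. It assumes the rows are already in chronological order within each date; the data frame name timetable and the columns ymin and ymax are hypothetical additions, not part of sampleData.csv.

```r
# The two days of sample data, reduced to the columns needed here.
timetable <- data.frame(
  date     = as.Date(c(rep("2016-08-04", 6), rep("2016-08-05", 7))),
  duration = c(8, 2, 4, 3.5, 2.5, 4,
               6, 3, 1, 2.5, 3.5, 6, 2),
  name     = c("idle", "Coding", "idle", "Laundry", "Cooking", "idle",
               "idle", "Cooking", "Laundry", "idle", "Coding",
               "Basketball", "idle")
)

# The running total of hours within each date is the later edge of each bar;
# subtracting the bar's own duration gives its earlier edge
# (both in hours since midnight).
timetable$ymax <- ave(timetable$duration, timetable$date, FUN = cumsum)
timetable$ymin <- timetable$ymax - timetable$duration

# Sanity check: hours accounted for on each date.
day_totals <- tapply(timetable$duration, timetable$date, sum)
print(day_totals)
```

With explicit edges like these, the bars no longer depend on stacking order, so they could be drawn from ymin and ymax directly instead of from duration.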

Here is the code generating the two plots above:

library(readr)
data <- read_csv("sampleData.csv", 
                 col_types = cols(date = col_date(format = "%Y-%m-%d"), 
                                  end = col_time(format = "%H:%M:%S"), 
                                  start = col_time(format = "%H:%M:%S")))
library(ggplot2)
# The first way to plot it:
ggplot(data, aes(x = date, y = duration, fill=name)) + 
  geom_bar(stat = "identity") + 
  scale_y_reverse(breaks=0:24)+#function(x) seconds_to_period(x))#strftime(chron(times=c(x/86400)), "%H:%M"))#+coord_flip()
  coord_cartesian(ylim = c(0, 24), expand = FALSE)+
  labs( x = "Date", y = "Time (Hour)",
        title = "Timetable",
        subtitle = "using aes(fill=name)",
        caption = "Legend is plotted and colors are well chosen, but bars at each date are sorted by \"name\" (unwanted).")+
  scale_x_date(date_breaks = "2 month", date_labels = "%b %Y")
# The second way to plot it:
ggplot(data, aes(x = date, y = duration)) + 
  geom_bar(stat = "identity", fill = data$color) + 
  scale_y_reverse(breaks=0:24)+#function(x) seconds_to_period(x))#strftime(chron(times=c(x/86400)), "%H:%M"))#+coord_flip()
  coord_cartesian(ylim = c(0, 24), expand = FALSE)+
  labs( x = "Date", y = "Time (Hour)",
        title = "Timetable",
        subtitle = "using geom_bar(fill=data$color)",
        caption = "Bars at each date are correctly positioned, but legend is not available.")+
  scale_x_date(date_breaks = "2 month", date_labels = "%b %Y")

To state my question in a different way: how can I make a timetable with legend?

1 Answer

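geom_rect() is a better choice in this case than geom_bar(). Each rectangle is placed by its own start and end times, so nothing is stacked or re-sorted, and mapping fill = name gives you both the legend and automatic colors: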

library(ggplot2)
ggplot(df) +
    geom_rect(aes(xmin = date, xmax = date + .8,
                  ymin = start, ymax = end,
                  fill = name), 
              color = 'black') + 
    scale_y_datetime(date_labels = "%H:%M") + 
    scale_x_date(date_breaks = "2 months", date_labels = "%b %Y") +
    labs(x = "Date", 
         y = "Time (Hour)",
          title = "Timetable"
          )

Data (run this before the plotting code, which expects df to exist):

df <- read.table(text = 'date, start, end, duration, name, color 
                2016-08-04, 00:00:00, 08:00:00, 8.00, idle, #00000000
                 2016-08-04, 08:00:00, 10:00:00, 2.00, Coding, red
                 2016-08-04, 10:00:00, 14:00:00, 4.00, idle, #00000000
                 2016-08-04, 14:00:00, 17:30:00, 3.50, Laundry, green
                 2016-08-04, 17:30:00, 20:00:00, 2.50, Cooking, blue
                 2016-08-04, 20:00:00, 24:00:00, 4.00, idle, #00000000
                 2016-08-05, 00:00:00, 06:00:00, 6.00, idle, #00000000
                 2016-08-05, 06:00:00, 09:00:00, 3.00, Cooking, blue
                 2016-08-05, 09:00:00, 10:00:00, 1.00, Laundry, green
                 2016-08-05, 10:00:00, 12:30:00, 2.50, idle, #00000000
                 2016-08-05, 12:30:00, 16:00:00, 3.50, Coding, red
                 2016-08-05, 16:00:00, 22:00:00, 6.00, Basketball, brown
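                 2016-08-05, 22:00:00, 24:00:00, 2.00, idle, #00000000',
                 header = TRUE, sep = ',',
                 strip.white = TRUE,  # drop the spaces after each comma
                 comment.char = '')   # keep "#00000000" instead of treating it as a comment
# Parse the columns. The times get today's date attached, which is harmless
# because only the time of day is shown on the axis. R accepts "24:00:00"
# on input and reads it as midnight of the following day.
df$date <- as.Date(df$date)
# A fixed time zone keeps the axis labels independent of the local zone.
df$start <- as.POSIXct(df$start, format = "%H:%M:%S", tz = "UTC")
df$end <- as.POSIXct(df$end, format = "%H:%M:%S", tz = "UTC")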
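
If you want midnight at the top, as in the question's plots, one possible variant is to put plain decimal hours on the y-axis and reverse it with scale_y_reverse(). The following is only a sketch under a few assumptions: to_hours(), sched, start_h and end_h are hypothetical names introduced for illustration, the idle rows are left out because the rectangles float on their own, and the plot is only drawn if ggplot2 is installed.

```r
# Convert "HH:MM:SS" text to decimal hours since midnight ("24:00:00" -> 24).
to_hours <- function(x) {
  parts <- strsplit(trimws(as.character(x)), ":", fixed = TRUE)
  vapply(parts, function(p) sum(as.numeric(p) / c(1, 60, 3600)), numeric(1))
}

# Only the non-idle activities of the first day; no color column is needed.
sched <- data.frame(
  date  = as.Date(c("2016-08-04", "2016-08-04", "2016-08-04")),
  start = c("08:00:00", "14:00:00", "17:30:00"),
  end   = c("10:00:00", "17:30:00", "20:00:00"),
  name  = c("Coding", "Laundry", "Cooking")
)
sched$start_h <- to_hours(sched$start)
sched$end_h   <- to_hours(sched$end)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(sched) +
    geom_rect(aes(xmin = date, xmax = date + 0.8,
                  ymin = start_h, ymax = end_h, fill = name),
              color = "black") +
    # Reversed axis: 0 (midnight) at the top, 24 at the bottom.
    scale_y_reverse(breaks = seq(0, 24, by = 2), limits = c(24, 0),
                    expand = c(0, 0)) +
    labs(x = "Date", y = "Time (Hour)", title = "Timetable")
  print(p)
}
```

The trade-off is that the axis shows hours as numbers rather than formatted clock times, but the legend and the automatic colors still come from fill = name.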