Problem: Why does ggplot2 remove 2 rows when plotting the data below? In particular, it removes the first and the last row, although the specified x-axis date range (2020-01-01 to 2020-01-03) should contain all values.

Warning Message:

Warning message:
Removed 2 rows containing missing values
(geom_bar). 

Code:

library(ggplot2)
library(scales)
library(data.table)  # needed for data.table() below

dt_object <- data.table(
  loc = c(rep("A", 3), rep("B", 3)),
  dt = rep(seq.Date(as.Date("2020-01-01"), as.Date("2020-01-03"), length.out = 3), 2),
  vals = c(500, 200, 100, 1000, 400, 300)
)

ggplot(dt_object, aes(x = dt, y = vals, fill = loc))+
  geom_bar(position = "dodge", stat = "identity")+
  scale_x_date(date_breaks = "1 month", 
               labels=date_format("%b %Y"),
               limits = as.Date(c("2020-01-01", "2020-01-03")))

Edit: I know that I can widen the x-axis limits (e.g. 2019-12-31 to 2020-01-04), but I would like to pass the exact date range from my question to ggplot.

That's pretty unpleasant behaviour. You should open an issue on github.com/tidyverse/ggplot2/issues - Edo
Probably the issue arises because a date class is not intended to be used as the x-axis in a bar plot. Usually you would use a line chart, which is the appropriate way to plot a time series. Anyway, I upvoted because I believe you're raising a good point. - Edo

1 Answer


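The bars get dropped because they are parameterised as rectangles and some of the rectangle corners fall outside the date limits. Scale limits replace out-of-bounds values with NA by default, which is what triggers the warning. To circumvent this, you can set the limits on the coordinate system with coord_cartesian() instead, which zooms in without discarding data. In non-date x-scales, you could also set oob = scales::oob_keep to do the same (but for some reason oob is not an argument to date scales).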

library(ggplot2)
library(scales)
library(data.table)
#> 
#> Attaching package: 'data.table'
#> The following object is masked from 'package:ggplot2':
#> 
#>     :=

dt_object <- data.table(
  loc = c(rep("A", 3), rep("B", 3)),
  dt = rep(seq.Date(as.Date("2020-01-01"), as.Date("2020-01-03"), length.out = 3), 2),
  vals = c(500, 200, 100, 1000, 400, 300)
)

ggplot(dt_object, aes(x = dt, y = vals, fill = loc))+
  geom_bar(position = "dodge", stat = "identity")+
  scale_x_date(date_breaks = "1 month", 
               labels=date_format("%b %Y")) +
  coord_cartesian(xlim = as.Date(c("2020-01-01", "2020-01-03")))

Created on 2020-09-11 by the reprex package (v0.3.0)
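
The rectangle explanation can be checked by inspecting the computed bar edges. The following is a minimal sketch, not part of the original reprex: it assumes a plain data.frame (named df here purely for illustration) instead of the data.table, and uses layer_data() to compare each bar's xmin and xmax with the requested limits.

```r
library(ggplot2)

# Same values as dt_object, built with base R (an assumption made to
# keep the sketch free of extra packages)
df <- data.frame(
  loc  = rep(c("A", "B"), each = 3),
  dt   = rep(seq(as.Date("2020-01-01"), as.Date("2020-01-03"), by = "day"), 2),
  vals = c(500, 200, 100, 1000, 400, 300)
)

# No limits are set here, so no bar is dropped
p <- ggplot(df, aes(x = dt, y = vals, fill = loc)) +
  geom_bar(position = "dodge", stat = "identity")

# layer_data() returns the rectangles after dodging;
# dates appear as numbers (days since 1970-01-01)
rects <- layer_data(p)
lims  <- as.numeric(as.Date(c("2020-01-01", "2020-01-03")))

# TRUE where a rectangle extends past either limit
rects$outside <- rects$xmin < lims[1] | rects$xmax > lims[2]
rects[, c("x", "xmin", "xmax", "outside")]
```

Bars flagged in the outside column are the ones a scale limit of 2020-01-01 to 2020-01-03 would be expected to turn into missing values.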

Note that the dates created span 3 days instead of 3 months, so that is probably why the axis labelling looks weird.
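
If daily labels are what is wanted, one possible adjustment is to switch the breaks to one per day. This is only a sketch: the "1 day" break width, the "%d %b" label format, and the data.frame named df are illustrative choices, not something the question specifies.

```r
library(ggplot2)

# Same values as dt_object, as a plain data.frame (illustrative choice)
df <- data.frame(
  loc  = rep(c("A", "B"), each = 3),
  dt   = rep(seq(as.Date("2020-01-01"), as.Date("2020-01-03"), by = "day"), 2),
  vals = c(500, 200, 100, 1000, 400, 300)
)

p_days <- ggplot(df, aes(x = dt, y = vals, fill = loc)) +
  geom_bar(position = "dodge", stat = "identity") +
  # One break per day, labelled like "01 Jan"
  scale_x_date(date_breaks = "1 day", date_labels = "%d %b") +
  # Zoom to the exact range without dropping any bars
  coord_cartesian(xlim = as.Date(c("2020-01-01", "2020-01-03")))

p_days
```

Because coord_cartesian() only zooms, the outermost bars are kept in the data but are partly cut off at the panel edges.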

EDIT: The same issue occurs when using numeric scales, so this is just an out of bounds (oob) issue.

ggplot(dt_object, aes(x = as.numeric(dt), y = vals, fill = loc))+
  geom_bar(position = "dodge", stat = "identity")+
  scale_x_continuous(limits = as.numeric(as.Date(c("2020-01-01", "2020-01-03"))))
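
For the numeric version, the oob = scales::oob_keep option mentioned earlier can be sketched as follows. This assumes a version of the scales package that exports oob_keep, and again uses a plain data.frame named df for illustration.

```r
library(ggplot2)

# Same values as dt_object, as a plain data.frame (illustrative choice)
df <- data.frame(
  loc  = rep(c("A", "B"), each = 3),
  dt   = rep(seq(as.Date("2020-01-01"), as.Date("2020-01-03"), by = "day"), 2),
  vals = c(500, 200, 100, 1000, 400, 300)
)

p_keep <- ggplot(df, aes(x = as.numeric(dt), y = vals, fill = loc)) +
  geom_bar(position = "dodge", stat = "identity") +
  scale_x_continuous(
    limits = as.numeric(as.Date(c("2020-01-01", "2020-01-03"))),
    # Keep out-of-bounds values instead of replacing them with NA
    oob = scales::oob_keep
  )

p_keep
```

With the out-of-bounds values kept, all six bars should remain in the layer data and the warning about removed rows should not appear.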