
When you plot something using ggplot2, it warns you if it automatically removes missing values.
I would love to be able to disable that specific warning, or to set the default of na.rm to TRUE system-wide, but that's not possible AFAIK.

I know I can disable it by specifying na.rm = TRUE for each geom that I use, but this fails when ggplot2 generates further geoms that I don't explicitly specify. In the example below I would get three warnings per plot using my original data (10 when I facet it, so you can see how annoying this gets in a knitr report). I can suppress two of the warnings with na.rm = TRUE, but not the third one, about geom_segment. Incidentally, it also occurs with mtcars, so I used that as an example.

Warning message: Removed 23 rows containing missing values (geom_segment).

library(ggplot2)  # mean_cl_boot and median_hilow also need the Hmisc package installed
ggplot(data = mtcars, aes(x = disp, y = wt)) + 
    geom_linerange(stat = "summary", fun.data = "median_hilow", colour = "#aec05d", na.rm = TRUE) + 
    geom_pointrange(stat = "summary", fun.data = "mean_cl_boot", colour = "#6c92b2", na.rm = TRUE)

Until I figure this out I can set the chunk option warning=FALSE for the offending chunks, but I don't really like that, since it might suppress warnings that I do care about. I could also use na.omit on the dataset, but that means extra work and extra code to figure out which variables I'll use in each plot.

You can ignore the warning with suppressWarnings(expr). If you don't want to receive any warnings at all, you can set options(warn = -1). - user1267127
E.g., define print.ggplot <- function(x, newpage = is.null(vp), vp = NULL, ...) suppressWarnings(ggplot2:::print.ggplot(x, newpage, vp, ...)). However, that might still suppress warnings you care about. AFAIK there is no reliable way to suppress specific warnings, since localization would have to be considered. In fact, with this specific warning, you should consider carefully whether your plot is sensible: the reader won't know whether the interval wasn't drawn or is so small that it isn't visible. - Roland
@Roland & Nemo: I'm aware I can suppress all warnings, but I'd prefer to omit missing values programmatically. My real plot is sensible and interpretable; this is just a toy example. - Ruben

1 Answer


I guess the only way to avoid this is not to use stat_summary, but calculate the summary statistics yourself. For your example that's no problem, but I'll admit that this is not a very satisfactory solution in general.

# load ggplot2 for plotting and dplyr to calculate the summary
# (mean_cl_boot also needs the Hmisc package installed)
library(ggplot2)
library(dplyr)
# calculate summary statistics
df <- mtcars %>% group_by(disp) %>% do(mean_cl_boot(.$wt))
# use geom_point and geom_segment with na.rm=TRUE
ggplot(data=mtcars, aes(x = disp, y = wt)) + 
  geom_linerange(stat = "summary", fun.data = "median_hilow", colour = "#aec05d") + 
  geom_point(data = df, aes(x = disp, y = y), colour = "#6c92b2") +
  geom_segment(data = df, aes(x = disp, xend = disp, y = ymin, yend = ymax), colour = "#6c92b2", na.rm=TRUE) 

Alternatively, you can write your own version of mean_cl_boot: if ymin or ymax is NA, just set it to the value of y.

# your summary function 
my_mean_cl_boot <- function(x, ...){
  res <- mean_cl_boot(x, ...)
  res[is.na(res$ymin), "ymin"] <- res[is.na(res$ymin), "y"]
  res[is.na(res$ymax), "ymax"] <- res[is.na(res$ymax), "y"]
  na.omit(res)
}
# plotting command
ggplot(data=mtcars, aes(x = disp, y = wt)) + 
  geom_linerange(stat = "summary", fun.data = "median_hilow", colour = "#aec05d", na.rm = TRUE) + 
  geom_pointrange(stat = "summary", fun.data = "my_mean_cl_boot", colour = "#6c92b2", na.rm = TRUE)
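
A further option, touching on the point raised in the comments, is to muffle only the one warning instead of all of them. Below is a minimal sketch in base R, assuming the warning text starts with "Removed N rows containing missing values" as in the question. The helper name muffle_warnings_matching is hypothetical (it is not part of ggplot2), and matching on the message text may break if the wording changes between ggplot2 versions or is localized.

```r
# Hypothetical helper (not part of ggplot2): evaluate `expr` and muffle only
# those warnings whose message matches the regular expression `pattern`.
# All other warnings are passed on unchanged.
muffle_warnings_matching <- function(expr, pattern) {
  withCallingHandlers(
    expr,
    warning = function(w) {
      if (grepl(pattern, conditionMessage(w))) {
        invokeRestart("muffleWarning")
      }
    }
  )
}

# Toy demonstration without ggplot2: the matching warning is muffled
# and the value of the expression is still returned.
result <- muffle_warnings_matching(
  {
    warning("Removed 23 rows containing missing values (geom_segment).")
    "done"
  },
  pattern = "^Removed [0-9]+ rows containing missing values"
)
```

Since ggplot2 raises this warning when the plot is drawn rather than when it is assembled with +, the expression to wrap would be the explicit print call, e.g. muffle_warnings_matching(print(p), "^Removed [0-9]+ rows containing missing values") for a plot object p.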