4
votes

I would like to define my own functions that behave like the ones in the 'broom' package. For example:

library(dplyr)
library(broom)

mtcars %>% 
  group_by(am) %>% 
  do(model = lm(mpg ~ wt, .)) %>% 
  glance(model)

This works fine. But how do I define custom functions like the following?

myglance <- function(x, ...) {
  s <- summary(x)
  ret <- with(s, data.frame(r2=adj.r.squared, a=coefficients[1], b=coefficients[2]))
  ret
}


mtcars %>% 
  group_by(am) %>% 
  do(model = lm(mpg ~ wt, .)) %>% 
  myglance(model)

Error in eval(substitute(expr), data, enclos = parent.frame()) : invalid 'envir' argument of type 'character'


2 Answers

3
votes

glance works this way because the broom package defines a glance method for rowwise data frames, which is the kind of data frame that do returns when it is given a named argument. If you are willing to bring in that whole .R file from broom (along with its col_name utility), you can use my code to do the same thing:

myglance_df <- wrap_rowwise_df(wrap_rowwise_df_(myglance))

mtcars %>% 
  group_by(am) %>% 
  do(model = lm(mpg ~ wt, .)) %>% 
  myglance_df(model)

There's also a workaround that doesn't require adding so much code from broom: change the class of each of your models, and define your own glance function on that class.

glance.mylm <- function(x, ...) {
  s <- summary(x)
  ret <- with(s, data.frame(r2=adj.r.squared, a=coefficients[1], b=coefficients[2]))
  ret
}

mtcars %>% 
  group_by(am) %>% 
  do(model = lm(mpg ~ wt, .)) %>% 
  mutate(model = list(structure(model, class = c("mylm", class(model))))) %>%
  glance(model)
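
The class trick relies on ordinary S3 dispatch: a generic tries the entries of class(x) in order and uses the first one that has a method. Here is a minimal base-R sketch of that mechanism. It uses a hypothetical generic called describe (not part of broom) as a stand-in for glance, so that it runs without loading any packages:

```r
# Hypothetical generic, standing in for broom's glance()
describe <- function(x, ...) UseMethod("describe")

# Method for the custom class "mylm"
describe.mylm <- function(x, ...) {
  s <- summary(x)  # no summary method for "mylm", so this falls through to summary.lm
  data.frame(r2 = s$adj.r.squared,
             a = s$coefficients[1],  # intercept estimate
             b = s$coefficients[2])  # slope estimate
}

fit <- lm(mpg ~ wt, data = mtcars)
class(fit) <- c("mylm", class(fit))  # prepend, so "mylm" is tried before "lm"

res <- describe(fit)
res
```

Because "lm" stays in the class vector, everything that already works on lm objects (summary, coef, predict) should keep working on the re-classed model.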

Finally, you also have the option of calling myglance on each model right away, inside do:

mtcars %>% 
  group_by(am) %>% 
  do(myglance(lm(mpg ~ wt, .)))
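
This last form works because do, when called with an unnamed argument, expects each group's result to be a data frame, and myglance already returns a one-row data frame. A quick way to check that, sketched in base R on a single group (the subset below is only an illustration of what do would pass in for am == 1):

```r
# Same myglance as in the question
myglance <- function(x, ...) {
  s <- summary(x)
  with(s, data.frame(r2 = adj.r.squared,
                     a = coefficients[1],   # intercept estimate
                     b = coefficients[2]))  # slope estimate
}

# One group's worth of data, as do() would supply it
one_group <- subset(mtcars, am == 1)
one_row <- myglance(lm(mpg ~ wt, data = one_group))

is.data.frame(one_row)  # should be TRUE
nrow(one_row)           # should be 1
```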
1
votes

Here is my take on how it could work. The basic approach is:

  1. Extract the appropriate column from the data frame. (My solution is based on this answer; there must be a better way, and I hope someone will correct me!)

  2. Run lapply on the result and construct the variables that you wanted in the myglance function you have above.

  3. Run do.call with rbind to return a data.frame.


myglance <- function(df, ...) {
  # step 1
  s <- collect(select(df, ...))[[1]] # based on this answer: https://stackoverflow.com/a/21629102/1992167

  # step 2
  lapply(s, function(x) {
    sm <- summary(x)  # compute the summary once per model
    data.frame(r2 = sm$adj.r.squared,
               a = sm$coefficients[1],  # intercept estimate
               b = sm$coefficients[2])  # slope estimate
  }) %>% do.call(rbind, .) # step 3
}

Output:

> mtcars %>% 
+   group_by(am) %>% 
+   do(model = lm(mpg ~ wt, .)) %>%
+   myglance(model)
         r2        a         b
1 0.5651357 31.41606 -3.785908
2 0.8103194 46.29448 -9.084268
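
Note that the output above no longer shows which row belongs to which value of am. As a rough sketch of the same three steps in base R (no dplyr), with the group key kept as a column; the names models, rows and result are just illustrative:

```r
# Fit one model per level of am
models <- lapply(split(mtcars, mtcars$am),
                 function(d) lm(mpg ~ wt, data = d))

# Step 2: build a one-row data frame per model
rows <- lapply(models, function(m) {
  sm <- summary(m)
  data.frame(r2 = sm$adj.r.squared,
             a = sm$coefficients[1],  # intercept estimate
             b = sm$coefficients[2])  # slope estimate
})

# Step 3: bind the rows, then add the group key taken from the list names
result <- do.call(rbind, rows)
result <- cbind(am = names(models), result)
result
```

The r2, a and b columns should match the output shown above, with am as an extra first column.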