0
votes

I have a dataframe with 20 columns. The last column is a text field. I'd like to group by the first 19 columns, and concatenate the values for the last text column. I have been trying to achieve this by using dplyr as follows:

mydf %>%
group_by(col1, col2, col3, ... col19) %>%
summarise(alltasks = c(col20))

But this does not work and returns with this error message:

Error in eval(substitute(expr), envir, enclos) : expecting a single value

What am I doing wrong?

1
c will just return all the values in each group, which isn't a single summary value like summarise requires. Without knowing what it is you want exactly, maybe summarise(alltasks=toString(col20))thelatemail
Try paste or toString.Michael Griffiths

1 Answers

5
votes

As @thelatemail pointed out, you need to return a single value, rather than a vector of all of the values. Here's an example of how to do that:

dots = lapply(paste0("col", 1:19), as.symbol)

mydf %>%
  group_by_(.dots=dots) %>%
  summarise(alltasks = paste(col20, collapse=", "))

The change to the summarise line is how you paste together all the values from col20 into a single string. I used a comma-space separator, but you can of course change that to whatever you prefer. The dots business is just an easier way to combine all the grouping variables without typing out each one.

If the names of mydf aren't so regular that you can use a simple paste function to generate them, other options would be, for example:

dots = lapply(names(mydf)[1:19], as.symbol)

or

dots = lapply(names(mydf)[-grep("col20$", names(mydf))], as.symbol)

Here's a working example of the dplyr code using the built-in mtcars data frame:

dots = lapply(c("am","vs","cyl","carb"), as.symbol)

mtcars %>%
  group_by_(.dots=dots) %>%
  summarise(all_mpg = paste(mpg, collapse=", "))
      am    vs   cyl  carb                      all_mpg
1      0     0     8     2       18.7, 15.5, 15.2, 19.2
2      0     0     8     3             16.4, 17.3, 15.2
3      0     0     8     4 14.3, 10.4, 10.4, 14.7, 13.3
4      0     1     4     1                         21.5
5      0     1     4     2                   24.4, 22.8
6      0     1     6     1                   21.4, 18.1
7      0     1     6     4                   19.2, 17.8
8      1     0     4     2                           26
9      1     0     6     4                       21, 21
10     1     0     6     6                         19.7
11     1     0     8     4                         15.8
12     1     0     8     8                           15
13     1     1     4     1       22.8, 32.4, 33.9, 27.3
14     1     1     4     2             30.4, 30.4, 21.4