I have been trying to use dplyr for some rather involved data manipulation and have come across an issue for which I cannot find an answer. This is the first question I've asked on StackOverflow. It may be more of a grouped data.table issue than a dplyr one, but here goes.
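library(dplyr)
data(iris)
df <- iris %>% group_by(Species) %>%
  do((function(x) {
    print(names(x))   # Species is still among the columns
    print(class(x))
    # every row of x shares one Species value, so the first row identifies the group
    if (x$Species[1] == 'setosa') x$Petal.Length <- x$Petal.Length + 1
    return(x)
  })(.))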
Here I can still access the grouping variable because the data is left as a data.frame and the subgroup inside the do is also a data.frame. The whole x$Species column holds the same value within the do, so checking its first element works as a workaround. It seems to me that this 'current group' would be a useful value to be able to access directly. When converting to a data.table, however:
dt <- iris %>% tbl_dt() %>%
  mutate(Species2 = Species) %>%   # copy of the grouping column, used inside do()
  group_by(Species) %>%
  do((function(x) {
    print(names(x))
    print(class(x))
    print(attr(x, 'vars'))
    print(groups(x))
    if (x$Species2[1] == 'setosa') x$Petal.Length <- x$Petal.Length + 1
    return(x)
  })(.))
the subgroup x is a grouped data.table, and the grouping variable is dropped from it. My workaround is to add a copy of the grouping column (Species2) and reference that copy from within the do, but I feel like there should/could be a more elegant way to refer to the specific group that the do is currently working with.
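For comparison, this is roughly the behaviour I am after. It is only a minimal sketch, and it assumes plain data.table syntax rather than the dplyr pipeline above: inside a grouped `by =` call, data.table exposes the current group's values through the special symbol `.BY`. The name `dt_plain` is made up for illustration.

```r
library(data.table)

# Plain data.table copy of iris (no dplyr involved; dt_plain is a hypothetical name).
dt_plain <- as.data.table(iris)

# .BY is a named list holding the current group's value for each `by` column,
# so the group can be tested directly without copying the grouping column.
dt_plain[, Petal.Length := if (.BY$Species == "setosa") Petal.Length + 1 else Petal.Length,
         by = Species]
```

Here the grouping column never has to be duplicated; what I am looking for is an equivalent of this from inside do on a grouped tbl_dt.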