3
votes

Working with grouped data, I want to change the last entry in one column to match the corresponding value for that group in another column. So for my data below, for each 'nest' (group), the last 'Status' entry will equal the 'fate' for that nest.

Data like this:

 nest   Status   fate
   1      1       2
   1      1       2
   2      1       3
   2      1       3
   2      1       3

Desired result:

 nest   Status   fate
   1      1       2
   1      2       2
   2      1       3
   2      1       3
   2      3       3

It should be so simple. I tried the following from dplyr and tail to change last value in a group_by in r; it works properly for some groups, but in others it substitutes the wrong 'fate' value:

 library(data.table)
 indx <- setDT(df)[, .I[.N], by = .(nest)]$V1
 df[indx, Status := df$fate]

I get various errors trying this approach dplyr mutate/replace on a subset of rows:

 mutate_last <- function(.data, ...) {
   n <- n_groups(.data)
   indices <- attr(.data, "indices")[[n]] + 1
   .data[indices, ] <- .data[indices, ] %>% mutate(...)
   .data
 }

 df <- df %>%
  group_by(nest) %>%
  mutate_last(df, Status == fate)

I must be missing something simple from the resources mentioned above?

2
Guess that the last line in your data.table attempt should be df[indx, Status := fate]. Try also df[,Status:=c(Status[-.N],fate[.N]),by=nest].nicola

2 Answers

1
votes

Something like

library(tidyverse)

df <- data.frame(nest = c(1,1,2,2,2),
                 status = rep(1, 5),
                 fate = c(2,2,3,3,3))
df %>% 
   group_by(nest) %>% 
   mutate(status = c(status[-n()], tail(fate,1)))
1
votes

Not sure if this is definitely the best way to do it but here's a very simple solution:

library(dplyr)
dat <- data.frame(nest = c(1,1,2,2,2),
                  Status = c(1,1,1,1,1),
                  fate = c(2,2,3,3,3))

dat %>%
  arrange(nest, Status, fate) %>% #enforce order
  group_by(nest) %>%
  mutate(Status = ifelse(is.na(lead(nest)), fate, Status))

E: Made a quick change.