1
votes

Hi I need to achieve something like this :

grp value   diff
1   10       NA  # diff[1] = value[2]-value[0] of grp = 1
1   15       10  # diff[2] = value[3]-value[1] of grp = 1
1   20       -5  # diff[3] = value[4]-value[2] of grp = 1
1   10       NA  # diff[4] = value[5]-value[3] of grp = 1
2   25       NA  # diff[5] = value[6]-value[4] of grp = 2
2   30       10  # diff[6] = value[7]-value[5] of grp = 2
2   35       NA  # diff[7] = value[8]-value[6] of grp = 2

I have tried using functions like shift and lag but cannot get this type of solution where I take the difference of preceding values and subtract them, where it is as diff[i] = value[i+1] - value[i-1]

Using for loop is running into errors, so is there a better way to do this?

2
Also, # diff[3] = value[4]-value[3] of grp = 1, I think you are looking for # diff[3] = value[4]-value[2] of grp = 1, am I right ?dc37
@akrun yes sorry for the little typo, it's diff[3] = value[4]-value[2]heisenbug29

2 Answers

3
votes

I think you have a typo in the description of your differential values. However, if you want that diff[i] to be the value[i+1] - value[i-1], you can do it by using both lead and lag in dplyr

library(dplyr)
df %>% group_by(grp) %>% mutate(diff = lead(value) -lag(value))

# A tibble: 7 x 3
# Groups:   grp [2]
    grp value  diff
  <dbl> <dbl> <dbl>
1     1    10    NA
2     1    15    10
3     1    20    -5
4     1    10    NA
5     2    25    NA
6     2    30    10
7     2    35    NA

EDIT: Absolute difference

If you need the absolute difference, you can do:

df %>% group_by(grp) %>% mutate(diff = abs(lead(value) -lag(value)))

# A tibble: 7 x 3
# Groups:   grp [2]
    grp value  diff
  <dbl> <dbl> <dbl>
1     1    10    NA
2     1    15    10
3     1    20     5
4     1    10    NA
5     2    25    NA
6     2    30    10
7     2    35    NA

Does it look what you are looking for ?

Data

df = data.frame(grp = c(rep(1,4),rep(2,3)),
                value = c(10,15,20,10,25,30,35))
2
votes

After grouping by 'grp', we can get the difference of lead with 'value' and take the lag of it

library(dplyr)
df1 %>%
  group_by(grp)
  mutate(diff = lag(abs(lead(value, 2) - value)))
# A tibble: 7 x 3
# Groups:   grp [2]
#    grp value  diff
#  <int> <int> <int>
#1     1    10    NA
#2     1    15    10
#3     1    20     5
#4     1    10    NA
#5     2    25    NA
#6     2    30    10
#7     2    35    NA

data

df1 <- structure(list(grp = c(1L, 1L, 1L, 1L, 2L, 2L, 2L), value = c(10L, 
15L, 20L, 10L, 25L, 30L, 35L)), row.names = c(NA, -7L), class = "data.frame")