I have a data.frame with two variables. I need to group the rows by var1 and replace every "x" in var2 with the one other value that occurs in the same group.

For example:

   var1 var2
1     1    a
2     2    a
3     2    x
4     3    b
5     4    c
6     5    a
7     6    c
8     6    x
9     7    c
10    8    x
11    8    b
12    8    b
13    9    a

Outcome should be:

   var1 var2
1     1    a
2     2    a
3     2    a <-
4     3    b
5     4    c
6     5    a
7     6    c
8     6    c <-
9     7    c
10    8    b <-
11    8    b
12    8    b
13    9    a

I did manage to solve this example:

library(dplyr)

dat <- data.frame(var1=c(1,2,2,3,4,5,6,6,7,8,8,8,9), var2=c("a","a","x","b","c","a","c","x","c","x","b","b","a"))

dat %>% group_by(var1) %>% mutate(
  var2 = as.character(var2),
  var2 = ifelse(var2 == 'x', var2[order(var2)][1], var2))

But this does not work for my real data, because it relies on the ordering: it only gives the right result when "x" sorts after every other value in the group.

I need another approach. I was thinking of something like explicitly checking for "not x", but I did not come to a solution.

Any help appreciated!

What if you have two or more unique non-x values within the same group, e.g. var1 = c(10,10,10); var2 = c("a","b","x")? And what if x is the only value in a group? - Pav El

1 Answer

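We can use data.table. Convert the data.frame to a data.table with setDT(df1), group by var1, take the values of var2 that are not 'x', select the first one, and assign it (:=) back to var2. Here df1 holds the example data. Note that this overwrites var2 for every row in the group, so it assumes each group has a single distinct non-x value, as in the example.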

library(data.table)
setDT(df1)[, var2 := var2[var2 != 'x'][1], by = var1]
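
For a self-contained run, df1 can be built from the example table in the question (an assumption here, since the answer does not show how df1 was created). The sketch below is a small variation rather than the exact call above: it uses base R's replace() so that only the rows where var2 is "x" are overwritten, which leaves the other rows untouched if a group ever holds more than one non-x value.

```r
library(data.table)

# Assumption: df1 mirrors the example table from the question.
df1 <- data.frame(
  var1 = c(1, 2, 2, 3, 4, 5, 6, 6, 7, 8, 8, 8, 9),
  var2 = c("a", "a", "x", "b", "c", "a", "c", "x", "c", "x", "b", "b", "a"),
  stringsAsFactors = FALSE
)

# Per group, take the first non-"x" value and write it only into the "x" rows.
setDT(df1)[, var2 := replace(var2, var2 == "x", var2[var2 != "x"][1]), by = var1]

df1$var2
# "a" "a" "a" "b" "c" "a" "c" "c" "c" "b" "b" "b" "a"
```

If a group contains nothing but "x", var2[var2 != "x"][1] is NA, so those rows would become NA in this sketch.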

Or with dplyr

library(dplyr)
df1 %>%
  group_by(var1) %>%
  mutate(var2 = var2[var2!="x"][1])
#    var1  var2
#   <int> <chr>
#1      1     a
#2      2     a
#3      2     a
#4      3     b
#5      4     c
#6      5     a
#7      6     c
#8      6     c
#9      7     c
#10     8     b
#11     8     b
#12     8     b
#13     9     a
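
The comment on the question raises two edge cases that the code above does not address: a group with several distinct non-x values, and a group that contains only x. One possible way to handle them, sketched under the assumption that such groups should simply be left unchanged, is a small helper. The name fill_x and the data frame edge are hypothetical and exist only for this illustration; the sketch also assumes var2 has no NA values.

```r
library(dplyr)

# Hypothetical helper (not part of dplyr): fill the placeholder only when the
# group has exactly one distinct non-placeholder value; otherwise return the
# values unchanged.
fill_x <- function(v, placeholder = "x") {
  candidates <- unique(v[v != placeholder])
  if (length(candidates) == 1) replace(v, v == placeholder, candidates) else v
}

# Hypothetical data: group 2 is the normal case, group 10 has two candidate
# values, and group 11 contains only "x".
edge <- data.frame(
  var1 = c(2, 2, 10, 10, 10, 11),
  var2 = c("a", "x", "a", "b", "x", "x"),
  stringsAsFactors = FALSE
)

result <- edge %>%
  group_by(var1) %>%
  mutate(var2 = fill_x(var2)) %>%
  ungroup()

result$var2
# "a" "a" "a" "b" "x" "x"
```

Leaving ambiguous groups untouched is only one choice; depending on the real data, raising an error or returning NA for them may be more appropriate.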