I have data with one value of interest (111) in column A. My data looks something like this:
pcp2 <- data.frame(A = c(rep(111, 4), rep(222, 5), rep(111, 5), rep(222, 5)),
                   B = c(rep(1, 9), rep(2, 10)))
A B
1 111 1
2 111 1
3 111 1
4 111 1
5 222 1
6 222 1
7 222 1
8 222 1
9 222 1
10 111 2
11 111 2
12 111 2
13 111 2
14 111 2
15 222 2
16 222 2
17 222 2
18 222 2
19 222 2
I want to collapse only the repeated 222 rows in column A into a single row per ID in column B, leaving every 111 row untouched, like so:
A B
1 111 1
2 111 1
3 111 1
4 111 1
5 222 1
6 111 2
7 111 2
8 111 2
9 111 2
10 111 2
11 222 2
The closest approach I have been able to find collapses both values (111 and 222):
library(data.table)
dat <- as.data.table(pcp2, key = "B")
data <- dat[, by = key(dat)][!duplicated(A == "222")]
which gives:
A B
1 111 1
2 222 2
I've tried several variations of this code and others, but everything I've tried either reduces my data to two rows or collapses both 111 and 222. For example, this would not be sufficient:
A B
1 111 1
2 222 2
3 111 2
4 222 2
Does anyone have guidance on how to keep the 111 rows and collapse the 222 rows within one column, based on another column (B in this example)? I know other questions are similar, but none seem to offer a way to leave one particular value in a column uncollapsed while collapsing the other(s).
If you set pcp2[7, 2] = 2, do you expect to get 3 rows with A = 222 at the end, or just 2? – eddi
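The comment above points at two possible readings of the question, and a minimal base R sketch may help tell them apart. The helper names `collapse_by_id` and `collapse_by_run` and the `target` argument are hypothetical, introduced only for illustration. The first keeps one 222 row per value of B. The second keeps one 222 row per consecutive run of 222 within each value of B. Both give the desired 11 rows on the example data and differ only once `pcp2[7, 2] = 2` is applied.

```r
# Example data from the question.
pcp2 <- data.frame(A = c(rep(111, 4), rep(222, 5), rep(111, 5), rep(222, 5)),
                   B = c(rep(1, 9), rep(2, 10)))

# Hypothetical helper (reading 1): keep every non-target row, plus the
# first target row for each value of B.
collapse_by_id <- function(d, target = 222) {
  keep <- d$A != target | !duplicated(d[, c("A", "B")])
  out <- d[keep, ]
  rownames(out) <- NULL
  out
}

# Hypothetical helper (reading 2): keep every non-target row, plus the
# first row of each consecutive run of the target within each value of B.
collapse_by_run <- function(d, target = 222) {
  # ave() applies FUN within each B group; 1 marks the start of a run of A.
  run_start <- ave(d$A, d$B,
                   FUN = function(x) c(TRUE, x[-1] != x[-length(x)])) == 1
  out <- d[d$A != target | run_start, ]
  rownames(out) <- NULL
  out
}

by_id  <- collapse_by_id(pcp2)   # 11 rows, matching the desired output
by_run <- collapse_by_run(pcp2)  # same 11 rows on this data

# The variant from the comment, where the two readings diverge.
pcp2_alt <- pcp2
pcp2_alt[7, 2] <- 2
by_id_alt  <- collapse_by_id(pcp2_alt)   # 2 rows with A == 222
by_run_alt <- collapse_by_run(pcp2_alt)  # 3 rows with A == 222
```

Since the attempt in the question already uses data.table, the same logic could presumably be expressed there as well; base R is used here only to keep the sketch free of dependencies.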