I am trying to identify which trials in a long-format dataset are repeated, but only within certain blocks for each participant. My data is structured something like this:

sub  block  trial  item
1    1      1      A
1    1      2      B
1    2      1      A
1    2      2      B
1    3      1      B
1    3      2      C
2    1      1      A
2    1      2      B
2    2      1      A
2    2      2      B
2    3      1      B
2    3      2      C

I would like to create a new column that indicates, for each participant, which items are repeated, and a second new column with a new trial code, but only for items that are repeated in blocks 2 and 3. It would look something like this:

sub  block  trial  item   dup      newtrial
1    1      1      A      FALSE    1
1    1      2      B      FALSE    2
1    2      1      A      FALSE    1
1    2      2      B      FALSE    2
1    3      1      C      FALSE    1
1    3      2      B      TRUE     102
2    1      1      A      FALSE    1
2    1      2      B      FALSE    2
2    2      1      A      FALSE    1
2    2      2      B      FALSE    2
2    3      1      C      FALSE    1
2    3      2      B      TRUE     102

I have been able to identify duplicates across the whole dataset and add 100 to each trial number using the following code:

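# Flag items that have already appeared anywhere in the dataset
data$dup <- duplicated(data$item)
data$newtrial <- NA

# Make repeated item labels unique and add 100 to the trial number of each repeat
data <- transform(data,
                  item = make.unique(as.character(item)),
                  newtrial = ifelse(duplicated(item), trial + 100, trial))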

What I have not been able to figure out is how to constrain the function to each individual subject and only certain blocks within each subject number.

Thanks!

Your desired output does not seem to match your input. Why are the rows labelled dup=TRUE duplicates within their sub and block? - aichao

2 Answers

Another option, using data.table:

library(data.table)
xt <- fread("sub  block  trial  item
1    1      1      A
1    1      2      B
1    2      1      A
1    2      2      B
1    3      1      B
1    3      2      B
2    1      1      A
2    1      2      B
2    2      1      A
2    2      2      B
2    3      1      B
2    3      2      B")

# Within each sub/block group, flag repeated items and add 100 to their trial number
xt[, c("dup", "ntrial") := {
  dup <- duplicated(item)
  tt <- ifelse(dup, trial + 100L, trial)
  list(dup, tt)
}, by = .(sub, block)]
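
For readers without data.table, the same per-group check can be sketched in base R. This is only an illustration, not part of the original answer: it rebuilds the modified example data as a plain data frame and relies on duplicated() accepting a data frame, so that a row counts as a repeat when its sub/block/item combination has already appeared.

```r
# Same modified data as above (block 3 holds item B twice), as a plain data frame
xt <- data.frame(
  sub   = rep(1:2, each = 6),
  block = rep(rep(1:3, each = 2), times = 2),
  trial = rep(1:2, times = 6),
  item  = rep(c("A", "B", "A", "B", "B", "B"), times = 2),
  stringsAsFactors = FALSE
)

# A row is a duplicate if the same sub/block/item combination appeared earlier
xt$dup <- duplicated(xt[, c("sub", "block", "item")])

# Add 100 to the trial number of each repeat, leave the others unchanged
xt$ntrial <- ifelse(xt$dup, xt$trial + 100L, xt$trial)
```

With this data, only the second trial of block 3 is flagged for each subject.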

You can do this with dplyr by grouping the observations by sub and block:

library(dplyr)
res <- data %>%
  group_by(sub, block) %>%
  mutate(dup = duplicated(item)) %>%
  ungroup() %>%
  mutate(newtrial = ifelse(dup, trial + 100, trial))

We use mutate to create the new columns: dup is computed within each sub/block group, and newtrial is computed after ungrouping.

Data: I modified your data slightly to introduce a duplicate item for sub=1, block=3 and sub=2, block=3:

data <- structure(list(sub = c(1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 
2L, 2L), block = c(1L, 1L, 2L, 2L, 3L, 3L, 1L, 1L, 2L, 2L, 3L, 
3L), trial = c(1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L
), item = structure(c(1L, 2L, 1L, 2L, 2L, 2L, 1L, 2L, 1L, 2L, 
2L, 2L), .Label = c("A", "B"), class = "factor")), .Names = c("sub", 
"block", "trial", "item"), class = "data.frame", row.names = c(NA, 
-12L))
##   sub block trial item
##1    1     1     1    A
##2    1     1     2    B
##3    1     2     1    A
##4    1     2     2    B
##5    1     3     1    B
##6    1     3     2    B
##7    2     1     1    A
##8    2     1     2    B
##9    2     2     1    A
##10   2     2     2    B
##11   2     3     1    B
##12   2     3     2    B

Using this data:

print(res)
### A tibble: 12 x 6
##     sub block trial   item   dup newtrial
##   <int> <int> <int> <fctr> <lgl>    <dbl>
##1      1     1     1      A FALSE        1
##2      1     1     2      B FALSE        2
##3      1     2     1      A FALSE        1
##4      1     2     2      B FALSE        2
##5      1     3     1      B FALSE        1
##6      1     3     2      B  TRUE      102
##7      2     1     1      A FALSE        1
##8      2     1     2      B FALSE        2
##9      2     2     1      A FALSE        1
##10     2     2     2      B FALSE        2
##11     2     3     1      B FALSE        1
##12     2     3     2      B  TRUE      102
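
The result above flags repeats within each sub and block. If the goal is instead to flag items that recur within a participant across blocks 2 and 3 only, as the wording of the question suggests, one possible base R sketch follows. It uses the question's original data (block 3 contains B and C). The name target_blocks and the choice of c(2, 3) are assumptions made for illustration.

```r
# Original example data from the question (block 3 contains B and C)
data <- data.frame(
  sub   = rep(1:2, each = 6),
  block = rep(rep(1:3, each = 2), times = 2),
  trial = rep(1:2, times = 6),
  item  = rep(c("A", "B", "A", "B", "B", "C"), times = 2),
  stringsAsFactors = FALSE
)

# Assumption: only blocks 2 and 3 are checked for repeats
target_blocks <- c(2, 3)
in_target <- data$block %in% target_blocks

# Start with no duplicates, then flag sub/item combinations that recur
# among the rows belonging to the target blocks
data$dup <- FALSE
data$dup[in_target] <- duplicated(data[in_target, c("sub", "item")])

# Add 100 to the trial number of each repeat, leave the others unchanged
data$newtrial <- ifelse(data$dup, data$trial + 100, data$trial)
```

Here block 1 is ignored entirely, so item B in block 3 is flagged for each subject because it already appeared in block 2, and its new trial code is its trial number plus 100.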