4
votes

I have a set of dyad data. Each person in the dyad was free to switch between 2 tasks as many times as they liked during 5 minutes (300 seconds), and I recorded the time at which each person switched tasks.

Participant A   Participant B   
Time    Task    Time    Task 
0       1       0       0
21.43   0      23.08    1
42.86   1      46.16    0
64.29   0      69.24    1
85.72   1      92.32    0
107.15  0      115.4    1
128.58  1     138.48    0
150.01  0     161.56    1
171.44  1     184.64    0
192.87  0     207.72    1
214.3   1     230.8     1
235.73  0     253.88    0
257.16  1     276.96    0
278.59  0       

I would like to transform the data onto a common timeline for the two individuals: specifically, sixty 5-second intervals (making up 300 seconds), showing which task each person is doing at each 5-second interval.

This is an example of what the result should look like (in this example it's in 10-second intervals).


Time PartA PartB
0      1    0
10     1    0
20     1    0
30     0    1
40     0    1
50     1    0
60     1    0
70     0    1
80     0    1
90     1    1
100    1    0
110    0    0
120    0    1
130    1    1
140    1    0
150    1    0
160    0    0
170    0    1
180    1    1
190    1    0
200    0    0
210    0    1
220    1    1
230    1    1
240    0    1
250    0    1
260    1    0
270    1    0
280    0    0
290    0    0

How can I do this?

2
First, tell us what the values of Task are for A and B in the first 5 seconds. - Randy Lai
Hi, I am hoping to create something similar to the lookup function in Excel. I will have a new column with 5-second intervals (0, 5, 10, 15, etc.). In the first 5-second interval, it will be 1 for A and 0 for B. Each subsequent 5-second value will return the corresponding Task for A and B based on the original timestamps. - user3334446
@user3334446 You can see my edit after your clarification. - agstudy

2 Answers

2
votes

For example, you can do this:

## read the data as it is shown in the question
dat <- read.table(text='Participant A   Participant B   
Time    Task    Time    Task 
0       1       0       0
21.43   0      23.08    1
42.86   1      46.16    0
64.29   0      69.24    1
85.72   1      92.32    0
107.15  0      115.4    1
128.58  1     138.48    0
150.01  0     161.56    1
171.44  1     184.64    0
192.87  0     207.72    1
214.3   1     230.8     1
235.73  0     253.88    0
257.16  1     276.96    0
278.59  0',header=TRUE,skip=1,fill=TRUE)    
## create data for each participant
partA = data.frame(dat[,1:2],part='A')
partB = setNames(data.frame(dat[,3:4],part='B'),names(partA))
## merge the 2 frames  and order vs Time
dat.all = rbind.data.frame(partA,partB)
dat.all = dat.all[complete.cases(dat.all),]
dat.all = dat.all[order(dat.all$Time),]

You can check the result:

head(dat.all)
    Time Task part
1   0.00    1    A
15  0.00    0    B
2  21.43    0    A
16 23.08    1    B
3  42.86    1    A
17 46.16    0    B

Edit: continued after the OP's clarification.

Basically, I am:

  1. creating 2 time series using the xts package
  2. aligning them every k seconds
  3. merging them
  4. replacing each NA with the most recent non-NA prior to it.

I hope this is clear. The solution is a little long because the OP did not provide the data in a handy form.
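
Step 4 is the "last observation carried forward" idea, which the answer gets from `na.locf` (provided by zoo, which xts loads). As a minimal base-R sketch of what that step does, assuming a plain numeric vector; the helper name `locf` is hypothetical and is not part of the answer or of any package:

```r
# Hypothetical helper (illustration only): carry the last non-NA value forward.
locf <- function(x) {
  # For each position, the index of the most recent non-NA value (0 if none yet).
  idx <- cummax(seq_along(x) * !is.na(x))
  # Positions before the first non-NA value have nothing to carry, so keep them NA.
  idx[idx == 0] <- NA
  x[idx]
}

locf(c(1, NA, NA, 0, NA, 1))
```

Leading NAs are left as NA here, since there is no earlier value to copy.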

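library(reshape2)
## one row per switch time, one column per participant
## note: fill=0 puts 0 wherever a participant has no switch at that time;
## leave fill unset to get NA there, which na.locf can then carry forward
dat.all <- 
dcast(Time~part,data=dat.all,value.var="Task",fill=0)
library(xts)
k <- 10   ## interval length in seconds

## xts needs a time index, so use an arbitrary origin
origin <- Sys.time()
dat_xts <- 
  xts(dat.all[,c('A','B')], origin+dat.all$Time)
## regular grid: 0, k, 2k, ..., 300
dat_target= xts( seq(0,300,k),index(dat_xts)[1]+ seq(0,300,k))

## move both indexes to the start of the next k-second period
dat_xts = align.time(dat_xts,n=k)
dat_target = align.time(dat_target,n=k)

## merge, then replace each NA with the most recent non-NA
head(na.locf(merge(dat_xts,dat_target)))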
 # A B dat_target
# 2014-03-08 13:48:40 1 0          0
# 2014-03-08 13:48:50 1 0         10
# 2014-03-08 13:49:00 0 0         20
# 2014-03-08 13:49:00 0 1         20
# 2014-03-08 13:49:10 0 1         30
# 2014-03-08 13:49:20 1 0         40
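
For comparison, the Excel-style lookup mentioned in the comments can also be sketched in base R with `findInterval`, which returns, for each grid time, the position of the last switch at or before it. This is an alternative sketch rather than the xts method above; the vector names (`a_time`, `a_task`, `b_time`, `b_task`) are chosen here for illustration, and the values are copied from the question:

```r
# Switch times and the task each participant switched to (from the question).
a_time <- c(0, 21.43, 42.86, 64.29, 85.72, 107.15, 128.58, 150.01,
            171.44, 192.87, 214.3, 235.73, 257.16, 278.59)
a_task <- c(1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0)
b_time <- c(0, 23.08, 46.16, 69.24, 92.32, 115.4, 138.48, 161.56,
            184.64, 207.72, 230.8, 253.88, 276.96)
b_task <- c(0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 1, 0, 0)

k <- 10                           # interval length in seconds
grid <- seq(0, 300 - k, by = k)   # 0, 10, ..., 290

# findInterval() gives the index of the last switch time <= each grid time.
# Both participants have a record at time 0, so the index is never 0.
timeline <- data.frame(
  Time = grid,
  A = a_task[findInterval(grid, a_time)],
  B = b_task[findInterval(grid, b_time)]
)
head(timeline)
```

Setting `k <- 5` gives the sixty 5-second intervals asked for in the question.
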
0
votes

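I assume that df has two columns: the first is the time and the second is the task.

# generate some dummy data
df = data.frame(Time=sort(runif(100,0,300)),Task=rbinom(100,1,0.5))
# common timeline: every 5 seconds from 0 to 300
xout = seq(0,300,5)
# step interpolation: each grid point takes the Task of the last switch at or before it;
# rule=2 uses the first/last Task for grid points outside the observed times
result = data.frame(approx(df$Time,df$Task,xout,method="constant",rule=2))
head(df)
head(result)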

The result will look like this:

> head(df)
       Time Task
1  5.158972    0
2  9.799133    1
3 14.676851    0
4 14.938065    0
5 16.774653    0
6 18.433240    1
> head(result)
   x y
1  0 0
2  5 0
3 10 1
4 15 0
5 20 1
6 25 1
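
The same `approx` call can be applied to each participant in the question's data to build the two-column timeline. A minimal sketch, assuming the switch times and tasks are held in separate vectors; the vector names and the wrapper `step_at` are hypothetical and introduced only for this example:

```r
# Switch times and the task each participant switched to (from the question).
a_time <- c(0, 21.43, 42.86, 64.29, 85.72, 107.15, 128.58, 150.01,
            171.44, 192.87, 214.3, 235.73, 257.16, 278.59)
a_task <- c(1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0)
b_time <- c(0, 23.08, 46.16, 69.24, 92.32, 115.4, 138.48, 161.56,
            184.64, 207.72, 230.8, 253.88, 276.96)
b_task <- c(0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 1, 0, 0)

# Sixty 5-second grid points: 0, 5, ..., 295.
xout <- seq(0, 295, by = 5)

# Hypothetical wrapper: step-function lookup of the task in effect at each grid time.
step_at <- function(time, task, xout) {
  approx(time, task, xout = xout, method = "constant", rule = 2)$y
}

result <- data.frame(
  Time = xout,
  A = step_at(a_time, a_task, xout),
  B = step_at(b_time, b_task, xout)
)
head(result)
```

With `method="constant"` and the default `f=0`, each grid point takes the value at the left end of its interval, which is the task the participant most recently switched to.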