I’m analyzing response time (RT) data from experiments. In these experiments, each person completes a certain number of trials of various trial types. Only RTs from correct trials are used, so the number of RTs to be analyzed differs across subjects and trial types. I'm trying to create an outlier function that applies a standard deviation cutoff that depends on the number of trials to be analyzed (Van Selst & Jolicoeur, 1994). For example, if the first subject has 100 trials of trial type A, I want to calculate the mean and standard deviation of that subject’s A trials and then apply a standard deviation cutoff (e.g., trials whose RT falls more than the indicated number of standard deviations above or below the mean are scored 0).
The standard deviation cutoffs I'd like to use are listed below:
# n = number of trials
if n < 4 then SDout = 3
if n == 4 then SDout = 1.458
if n == 5 then SDout = 1.68
if n == 6 then SDout = 1.841
if n == 7 then SDout = 1.961
if n == 8 then SDout = 2.050
if n == 9 then SDout = 2.12
if n == 10 then SDout = 2.173
if n == 11 then SDout = 2.22
if n == 12 then SDout = 2.246
if n == 13 then SDout = 2.274
if n == 14 then SDout = 2.31
if n >= 15 & n < 20 then SDout = 2.326
if n >= 20 & n < 25 then SDout = 2.391
if n >= 25 & n < 30 then SDout = 2.41
if n >= 30 & n < 35 then SDout = 2.4305
if n >= 35 & n < 50 then SDout = 2.45
if n >= 50 & n < 100 then SDout = 2.48
if n >= 100 then SDout = 2.5
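To make the table concrete, here is a minimal sketch of it as a lookup function. It is written in Python purely for illustration; the name `sd_cutoff` and the idea of storing the lower bound of each range and searching with `bisect` are my own assumptions, not part of the published procedure.

```python
from bisect import bisect_right

# Lower bound of every range after the first (n < 4), copied from the list above.
N_BOUNDS = [4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 20, 25, 30, 35, 50, 100]

# One cutoff per range: the first entry covers n < 4, the last covers n >= 100.
SD_CUTOFFS = [3.0, 1.458, 1.68, 1.841, 1.961, 2.050, 2.12, 2.173, 2.22, 2.246,
              2.274, 2.31, 2.326, 2.391, 2.41, 2.4305, 2.45, 2.48, 2.5]


def sd_cutoff(n):
    """Return SDout for a cell with n trials (hypothetical helper name)."""
    # bisect_right counts how many lower bounds are <= n,
    # which is exactly the index of the range that n falls into.
    return SD_CUTOFFS[bisect_right(N_BOUNDS, n)]
```

Keeping the bounds and cutoffs in two parallel lists means the table only has to be typed once, and any n between the listed values (say 17 or 42) falls into the right range without a chain of if statements.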
My data has 3 columns: id (subject identifier), ttype (trial type), and RT.
In essence, what I need this function to do is: get the RT mean, SD, and number of trials for each subject for each trial type, then test each RT against the bounds given by the mean RT plus or minus SDout multiplied by the SD. Finally, I’d like the function to create a new column where outlying trials are scored 0 and “good” trials are scored 1.
One way I can think of to implement this is to use nested loops, with trial types nested within subjects. However, writing this function is beyond my skill level, so I’m asking for help with creating it. If anyone has suggestions, tips, or non-loopish ways to accomplish this, I’d be very appreciative.
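To pin down the behaviour I'm after, here is a rough sketch of the grouping idea without nested loops over subjects and trial types. It is plain Python for illustration only, and several things in it are assumptions on my part: the function names `sd_cutoff` and `score_outliers`, rows given as `(id, ttype, RT)` tuples, the sample SD (n - 1 denominator), cells with a single trial getting an SD of 0, and an RT that lands exactly on a bound counting as "good".

```python
from bisect import bisect_right
from collections import defaultdict
from statistics import mean, stdev

# Cutoff table from the question: lower bounds of each n range and matching SDout.
N_BOUNDS = [4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 20, 25, 30, 35, 50, 100]
SD_CUTOFFS = [3.0, 1.458, 1.68, 1.841, 1.961, 2.050, 2.12, 2.173, 2.22, 2.246,
              2.274, 2.31, 2.326, 2.391, 2.41, 2.4305, 2.45, 2.48, 2.5]


def sd_cutoff(n):
    """Return SDout for a cell with n trials (hypothetical helper name)."""
    return SD_CUTOFFS[bisect_right(N_BOUNDS, n)]


def score_outliers(rows):
    """Score each (id, ttype, rt) row as 1 (good) or 0 (outlier).

    Returns a new list of (id, ttype, rt, score) tuples in the input order.
    """
    # Collect the RTs of every subject-by-trial-type cell.
    cells = defaultdict(list)
    for subj, ttype, rt in rows:
        cells[(subj, ttype)].append(rt)

    # For each cell, store its mean and the allowed distance from the mean.
    limits = {}
    for key, rts in cells.items():
        n = len(rts)
        # Assumption: sample SD; a cell with one trial gets SD 0.
        sd = stdev(rts) if n > 1 else 0.0
        limits[key] = (mean(rts), sd_cutoff(n) * sd)

    # Score every trial against the limits of its own cell.
    scored = []
    for subj, ttype, rt in rows:
        cell_mean, max_dist = limits[(subj, ttype)]
        # Assumption: an RT exactly on the bound is kept.
        score = 1 if abs(rt - cell_mean) <= max_dist else 0
        scored.append((subj, ttype, rt, score))
    return scored
```

The function makes two passes over the rows, one to build the per-cell statistics and one to score, so the cutoff for each trial always comes from the number of trials in its own subject and trial type.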
Thanks
Is SDout calculated from a formula? If so, could you specify it or link to it? This would be easier to write than using a sort of 'table lookup' approach each time... - dardisco