3
votes

I have some strings of data separated by " " that need to be split into columns. Is there an easy way to split the data at every nth separator? For example, the first value in x tells you that the first 4 values in y correspond to the first trial. The second value in x tells you that the next 3 values in y correspond to the second trial, and so on.

x <- c("4 3 3", "3 3 3 2 3")
y <- c("110 88 77 66 55 44 33 22 33 44 11 22 11", "44 55 66 33 22 11 22 33 44 55 66 77 88 66 77 88")

The goal is something like this:

structure(list(session = 1:2, trial.1 = structure(1:2, .Label = c("110 88 77", 
"44 55 66"), class = "factor"), trial.2 = structure(c(2L, 1L), .Label = c("33 22 11", 
"66 55 44"), class = "factor"), trial.3 = structure(1:2, .Label = c("22 33 44", 
"23 33 44"), class = "factor"), trial.4 = structure(c(NA, 1L), .Label = "55 66", class = "factor"), 
    trial.5 = structure(c(NA, 1L), .Label = "77 88 66", class = "factor")), .Names = c("session", 
"trial.1", "trial.2", "trial.3", "trial.4", "trial.5"), class = "data.frame", row.names = c(NA, 
-2L))

Ideally, any extra values from y should be dropped from the resulting data frame, and the uneven row lengths should be filled with NAs.

strsplit(y, " "), then use x as a selector of elements in the resultant structure, then add your own spaces back in. - Ari B. Friedman
Do you mean do it manually? I should have mentioned this is a simplified version of my real data. - Jose
Your structure statement seems odd to me. Do you have a particular R routine that you want to run after this cleaning? If you don't, I'd suggest a vastly different structure than you are suggesting. - Seth
@Seth I'm planning on reshaping the data using the reshape package after and THEN splitting individual cases by " ". Any suggestions or approaches are welcome. - Jose

1 Answer

3
votes

This may be useful:

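# Split each session's trial lengths (x) and values (y) on spaces
dumx <- strsplit(x, ' ')
dumy <- strsplit(y, ' ')
# Cumulative sums give the end position of each trial within y
dumx <- lapply(dumx, function(x) cumsum(as.numeric(x)))
# Build the index range start:end for each trial
dumx <- lapply(dumx, function(x) { mapply(seq, c(1, x + 1)[-(length(x) + 1)], x, SIMPLIFY = FALSE) })
# Extract each trial's values; SIMPLIFY = FALSE keeps the result a list of lists
ans <- mapply(function(x, y) { lapply(x, function(w, z) { z[w] }, z = y) }, dumx, dumy, SIMPLIFY = FALSE)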

I will leave you to convert the resulting list to a data frame :)
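For the conversion step, here is one possible sketch rather than a definitive solution. It redoes the split in a self-contained way so it can be run on its own. The helper name split_trials is hypothetical (it is not from the answer above), and the sketch assumes every string in y holds at least as many values as the matching string in x adds up to.

```r
# Example inputs copied from the question
x <- c("4 3 3", "3 3 3 2 3")
y <- c("110 88 77 66 55 44 33 22 33 44 11 22 11",
       "44 55 66 33 22 11 22 33 44 55 66 77 88 66 77 88")

# split_trials() is a hypothetical helper: it returns one string per trial
# for a single session. Assumes y has at least sum(x) values.
split_trials <- function(lengths_str, values_str) {
  n <- as.integer(strsplit(lengths_str, " ")[[1]])
  v <- strsplit(values_str, " ")[[1]]
  v <- v[seq_len(sum(n))]                  # drop extra values not covered by x
  grp <- rep(seq_along(n), times = n)      # trial number for each value
  vapply(split(v, grp), paste, character(1), collapse = " ", USE.NAMES = FALSE)
}

# One character vector of trials per session
trials <- mapply(split_trials, x, y, SIMPLIFY = FALSE, USE.NAMES = FALSE)

# Pad shorter sessions with NA so every row has the same number of trials
max_trials <- max(lengths(trials))
padded <- lapply(trials, function(tr) {
  length(tr) <- max_trials                 # extending a vector fills with NA
  tr
})

# Bind the rows and label the columns trial.1, trial.2, ...
result <- data.frame(session = seq_along(x),
                     do.call(rbind, padded),
                     stringsAsFactors = FALSE)
names(result)[-1] <- paste0("trial.", seq_len(max_trials))
```

With the example data, result has two rows and six columns. The split follows x, so trial.1 for session 1 holds four values ("110 88 77 66"), and trial.4 and trial.5 are NA for session 1 because that session has only three trials. The columns are kept as character strings rather than factors, which should make the later split on " " simpler.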