I want to extract the first non-missing value in each row of my data.table. I wrote this helper:

function_non_missing <- function(x) {
  x <- x[!is.na(x)]
  # Then apply some other transformations such as
  # x <- x[x != ""]
  # x <- x[x != "some random thing"]
  if (length(x) > 0) {
    x[1]
  } else {
    NA
  }
}

Now I just want to apply this function row by row. I searched for previous answers and then tried things like:

data<-data[,non_missing_var:=function_non_missing(.SD),by=1:nrow(data)]

I also tried other permutations of the same idea, but nothing seems to work. More generally, can somebody point me towards a tutorial on the most efficient ways to apply data.table ideas row by row (in particular how to use Map and Reduce), using as arguments the columns specified in .SDcols? In practice, what I often want to do is something like:

data<-data[,my_new_var:=random_function(.SD),.SDcols=c("var_1","var_2","var_3"),by=1:nrow(data)]

where random_function operates on a vector.

2 Answers

0
votes

Apparently this will work:

data<-data[,non_missing_var:=function_non_missing(unlist(.SD)),by=1:nrow(data)]

Could somebody more familiar with data.table explain why this works and why unlist is needed?
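A likely explanation, sketched on made-up data: inside j, .SD is itself a data.table, that is, a list of columns. With by set to a row index, function_non_missing therefore receives a one-row table rather than a vector, and unlist flattens that table into an ordinary (named) atomic vector. The table dt and its values below are hypothetical and chosen only for illustration.

```r
library(data.table)

function_non_missing <- function(x) {
  x <- x[!is.na(x)]
  if (length(x) > 0) x[1] else NA
}

# Hypothetical example data (not from the original question)
dt <- data.table(
  var_1 = c(NA, 2, NA),
  var_2 = c(5, NA, NA),
  var_3 = c(7, 8, 9)
)

# For a single-row group, .SD has the same shape as dt[1]:
one_row <- dt[1]
is.list(one_row)         # TRUE: a data.table is a list of columns

# unlist() turns that one-row table into a named atomic vector
flat <- unlist(one_row)
is.atomic(flat)          # TRUE

# seq_len(nrow(dt)) behaves like 1:nrow(dt) but is also safe for empty tables
dt[, non_missing_var := function_non_missing(unlist(.SD)),
   by = seq_len(nrow(dt))]
```

Note that unlist coerces all columns to a common type, so if the selected columns mix numbers and strings, the function will see a character vector.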

0
votes

I suggest using the apply function instead. Try

apply(data, 1, function_non_missing)

The 1 refers to applying the function row-wise.
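One caveat worth illustrating: apply converts its input to a matrix first, so a table with mixed column types reaches the function as character data. The sketch below uses a hypothetical table dt with an extra id column; it shows the effect, then restricts apply to chosen columns through .SDcols, and finally gives a column-wise alternative using Reduce with data.table's fcoalesce. The last variant assumes that dropping NA is the only transformation needed and that the installed data.table version provides fcoalesce.

```r
library(data.table)

function_non_missing <- function(x) {
  x <- x[!is.na(x)]
  if (length(x) > 0) x[1] else NA
}

# Hypothetical example data: one character column plus three numeric ones
dt <- data.table(
  id    = c("a", "b", "c"),
  var_1 = c(NA, 2, NA),
  var_2 = c(5, NA, NA),
  var_3 = c(7, 8, 9)
)
cols <- c("var_1", "var_2", "var_3")

# apply() on the whole table goes through a character matrix here,
# so the result is character and picks up the id column
all_cols <- apply(dt, 1, function_non_missing)

# Restricting to the numeric columns keeps the result numeric
by_row <- dt[, apply(.SD, 1, function_non_missing), .SDcols = cols]

# Column-wise alternative: fold the columns pairwise with fcoalesce(),
# which keeps the first non-NA value at each position
by_col <- dt[, Reduce(fcoalesce, .SD), .SDcols = cols]
```

The Reduce version works on whole columns at once instead of looping over rows, which tends to matter on large tables, but it cannot express arbitrary per-row logic the way function_non_missing can.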