
For example,

library(data.table)
set.seed(1984)
d <- data.table(name=letters[1:26],a=rbinom(26,1,0.5),b=rbinom(26,1,0.5),c=rbinom(26,1,0.5))

I can remove the rows in which columns a, b, and c are all 0 with:

d[,if(sum(a,b,c) != 0) .SD,by=.(a,b,c)]

The result is:

   a b c name
 1: 1 1 1    a
 2: 1 1 1    u
 3: 1 1 1    x
 4: 0 1 0    b
 5: 0 1 0    d
 6: 0 1 0    h
 7: 0 1 1    c
 8: 0 1 1    g
 9: 0 1 1    o
10: 0 1 1    q
11: 0 1 1    t
12: 1 1 0    e
13: 1 1 0    k
14: 1 1 0    y
15: 1 0 0    f
16: 1 0 0    i
17: 1 0 0    r
18: 1 0 0    s
19: 1 0 0    w
20: 0 0 1    j
21: 0 0 1    v
22: 1 0 1    m
23: 1 0 1    n
    a b c name

Now, I have two questions:

  1. How can I keep the "name" column as the first column?
  2. How can I select columns a, b, c with a short expression (something like a:c, although here a:c does not mean columns a, b, c)? With hundreds of columns, I can't type every name into sum() or into by.

Additional question:

If the function is not sum (which has a row-wise version, rowSums) but something else, such as max, how can I solve questions 1 and 2 without the apply family of functions? As far as I understand, the apply family is designed for data frames, and I am afraid it will slow data.table down.

Like d[d[, rowSums(.SD) != 0, .SDcols = a:c]] ? - talat

2 Answers

1
votes

You could also use the rowSums function:

 d[rowSums(d[,2:4])!=0,]
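
The positional index 2:4 only works while the value columns stay in those positions. A minimal sketch of the same idea with the columns picked by name, assuming every column except "name" should be summed (the variable names cols and res are only illustrative):

```r
library(data.table)

set.seed(1984)
d <- data.table(name = letters[1:26],
                a = rbinom(26, 1, 0.5),
                b = rbinom(26, 1, 0.5),
                c = rbinom(26, 1, 0.5))

# Pick the value columns by name instead of by position;
# this scales to any number of columns.
cols <- setdiff(names(d), "name")

# rowSums() runs on the selected columns only; rows whose sum is 0 are dropped.
# Subsetting rows of d directly keeps the original column order, so "name" stays first.
res <- d[rowSums(d[, cols, with = FALSE]) != 0]
```

Because the rows are filtered with a logical vector rather than grouped with by, the column order of d is left untouched.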
2
votes

We could use Reduce with + to create a logical vector from the columns specified in .SDcols and use it to subset the rows:

d[d[, Reduce(`+`, .SD) != 0, .SDcols = a:c]]

Other options include (@nicola's)

d[Reduce("+",d[,a:c])!=0]

Or, as suggested by @Frank, use pmax to create a column ('keep') holding the maximum value of each row, convert it from binary to logical, and then use it to subset the rows and drop the helper column:

d[, keep := as.logical(do.call(pmax, .SD)), .SDcols=!"name"][(keep), !"keep"]
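
For the additional question (a row-wise max instead of a sum), the pmax idea can probably be used without adding a helper column. The following is only a sketch under the assumption that all columns other than "name" take part; cols and res are illustrative names:

```r
library(data.table)

set.seed(1984)
d <- data.table(name = letters[1:26],
                a = rbinom(26, 1, 0.5),
                b = rbinom(26, 1, 0.5),
                c = rbinom(26, 1, 0.5))

# All value columns, selected by name.
cols <- setdiff(names(d), "name")

# The inner call builds a logical vector: pmax() takes the parallel (row-wise)
# maximum across the columns in .SDcols, and the comparison flags non-zero rows.
# The outer d[...] uses that vector to subset rows, leaving column order unchanged.
res <- d[d[, do.call(pmax, .SD) != 0, .SDcols = cols]]
```

pmax is vectorised over its arguments, so no apply-style loop over rows is involved; pmin can be swapped in the same way for a row-wise minimum.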