How to choose multiple columns as condition for row selection

Question

For example,

library(data.table)
set.seed(1984)
d <- data.table(name=letters[1:26],a=rbinom(26,1,0.5),b=rbinom(26,1,0.5),c=rbinom(26,1,0.5))

I can remove the rows in which columns a, b, and c are all 0 with:

d[,if(sum(a,b,c) != 0) .SD,by=.(a,b,c)]

The result is:

   a b c name
 1: 1 1 1    a
 2: 1 1 1    u
 3: 1 1 1    x
 4: 0 1 0    b
 5: 0 1 0    d
 6: 0 1 0    h
 7: 0 1 1    c
 8: 0 1 1    g
 9: 0 1 1    o
10: 0 1 1    q
11: 0 1 1    t
12: 1 1 0    e
13: 1 1 0    k
14: 1 1 0    y
15: 1 0 0    f
16: 1 0 0    i
17: 1 0 0    r
18: 1 0 0    s
19: 1 0 0    w
20: 0 0 1    j
21: 0 0 1    v
22: 1 0 1    m
23: 1 0 1    n
    a b c name

Now I have two questions:

1. How can I keep the "name" column as the first column?
2. How can I select columns a, b, and c with a simple expression (something like a:c, although a:c does not mean a, b, c here)? With hundreds of columns, I can't type out every column name in sum() or in the by argument.

Additional question:

If the function is not sum (which has a rowSums version for working across rows) but something else, such as max, how can I solve questions 1 and 2 without the apply family of functions? The apply family is designed for data frames, and I am afraid it will slow data.table down.

Onyambu Onyambu · Accepted Answer · 2017-09-05T12:47:10

You could also use the rowSums function:

 d[rowSums(d[,2:4])!=0,]
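
The same idea can be extended to many columns and to max. The following is only a minimal sketch, not a benchmarked solution: it assumes the columns to test are every column except "name" (the vector cols is a name introduced here for illustration), and it swaps in pmax, which is vectorised across columns, so no apply-family call is needed. Because the rows are filtered in place, "name" stays the first column.

```r
library(data.table)

set.seed(1984)
d <- data.table(name = letters[1:26],
                a = rbinom(26, 1, 0.5),
                b = rbinom(26, 1, 0.5),
                c = rbinom(26, 1, 0.5))

# Assumption: every column except "name" takes part in the condition.
cols <- setdiff(names(d), "name")

# Sum version: keep rows whose selected columns are not all zero.
keep_sum <- d[d[, rowSums(.SD) != 0, .SDcols = cols]]

# Max version: pmax() computes the row-wise maximum across the
# selected columns without looping over rows.
keep_max <- d[d[, do.call(pmax, .SD) != 0, .SDcols = cols]]
```

For 0/1 data like this, both filters keep the same rows. With hundreds of columns, only the definition of cols needs to change, for example to a slice of names(d).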
