I have a situation where I would like to detect conditions between two logical, named vectors based on the TRUE / FALSE combination at each position in the vector. For example:

x <- c(TRUE, FALSE, FALSE, TRUE)
names(x) <- c("a", "b", "c", "d")
y <- c(TRUE, TRUE, FALSE, FALSE)
names(y) <- names(x)

For each element in these two vectors I want to detect 3 conditions:

  • x[i] is TRUE and y[i] is TRUE;
  • x[i] is FALSE and y[i] is TRUE;
  • x[i] is TRUE and y[i] is FALSE.

x and y always have the same length, but they could be longer than in this example. For each condition, I want to retrieve the names of the matching elements and assign them to a new variable. For this example:

v1 <- "a"
v2 <- "b"
v3 <- "d"

In a longer version of these two vectors I might end up with something like:

v1 <- c("a", "e")
v2 <- c("b", "f", "g")
v3 <- c("d", "i", "k", "l")

What is the best vectorized way to do this? I think it is simple, but I am unable to come up with the answer. Thanks in advance.

1 Answer

We can use split efficiently here, but first we need a single grouping index. Here is one possibility:

g <- x + y + x
split(names(x), g)

To understand the above grouping index, consider this:

x <- c(TRUE, TRUE, FALSE, FALSE)
y <- c(TRUE, FALSE, TRUE, FALSE)
x + y + x
#[1] 3 2 1 0

So you can see that the 4 combinations of TRUE and FALSE are mapped to 4 distinct integer values.
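
Applied to the named vectors from the question, the index can be used to pull out the three groups. This is a minimal sketch: the variable name groups is only for illustration, and the list returned by split is named by the integer codes as character strings.

```r
# Named logical vectors from the question
x <- c(a = TRUE, b = FALSE, c = FALSE, d = TRUE)
y <- c(a = TRUE, b = TRUE, c = FALSE, d = FALSE)

# Grouping index: 3 = T-T, 2 = T-F, 1 = F-T, 0 = F-F (x first, then y)
g <- x + y + x

# 'groups' is an illustrative name; the list names are the codes as strings
groups <- split(names(x), g)

v1 <- groups[["3"]]  # x TRUE,  y TRUE
v2 <- groups[["1"]]  # x FALSE, y TRUE
v3 <- groups[["2"]]  # x TRUE,  y FALSE
```

A combination that never occurs has no entry in the list, so looking it up with [[ gives NULL rather than an empty character vector.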

Ah, so "a" gets assigned to T-T, "b" to T-F, etc. But why x + y + x? I don't follow adding x twice.

If you only do x + y, the result is only 0, 1 and 2. You won't be able to differentiate T-F and F-T as they are both 1.
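
A quick check of that collision, using the same unnamed example vectors; 2 * x + y is simply another way of writing x + y + x, and s1 and s2 are illustrative names.

```r
x <- c(TRUE, TRUE, FALSE, FALSE)
y <- c(TRUE, FALSE, TRUE, FALSE)

s1 <- x + y      # T-F and F-T both give 1
s2 <- 2 * x + y  # same as x + y + x: each combination gets its own value

s1
#[1] 2 1 1 0
s2
#[1] 3 2 1 0
```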


@thelatemail offers a more readable way:

split(names(x), interaction(x, y, drop=TRUE))

Update

Ah... stupid me... Why did I bother creating g? I suddenly remembered that we can pass a list to the f argument of split:

split(names(x), list(x, y))

Note that split.default does this internally:

if (is.list(f)) 
    f <- interaction(f, drop = drop, sep = sep)
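
With the question's vectors, the list form gives groups labelled by the two logical values joined with the default sep = ".". A minimal sketch: groups is an illustrative name, and each label reads x first, then y, following the order of the list.

```r
# Named logical vectors from the question
x <- c(a = TRUE, b = FALSE, c = FALSE, d = TRUE)
y <- c(a = TRUE, b = TRUE, c = FALSE, d = FALSE)

# Labels have the form "<x value>.<y value>"
groups <- split(names(x), list(x, y))

v1 <- groups[["TRUE.TRUE"]]   # x TRUE,  y TRUE
v2 <- groups[["FALSE.TRUE"]]  # x FALSE, y TRUE
v3 <- groups[["TRUE.FALSE"]]  # x TRUE,  y FALSE
```

Because split defaults to drop = FALSE, a combination that does not occur should appear as character(0), provided both TRUE and FALSE show up somewhere in each vector. Passing drop = TRUE omits such empty groups, matching the interaction call shown earlier.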