To get a sense of how combn orders its output, let's look at the output of combn(1:5, 3):
combn(1:5, 3)
#      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
# [1,]    1    1    1    1    1    1    2    2    2     3
# [2,]    2    2    2    3    3    4    3    3    4     4
# [3,]    3    4    5    4    5    5    4    5    5     5
There is a lot of structure here. Each column is increasing from top to bottom, and the first row is non-decreasing from left to right. The columns starting with 1 have combn(2:5, 2) below them; the columns starting with 2 have combn(3:5, 2) below them; and so on.
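The nesting described above can be checked directly on this small case. This is only a sanity check, and the variable names cols, below1 and below2 are made up for the illustration:

```r
cols <- combn(1:5, 3)
# Rows 2-3 of the columns whose first element is 1
below1 <- cols[-1, cols[1, ] == 1]
# Rows 2-3 of the columns whose first element is 2
below2 <- cols[-1, cols[1, ] == 2]
all(below1 == combn(2:5, 2))  # TRUE
all(below2 == combn(3:5, 2))  # TRUE
```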
Let's now think about how to construct column number 8. I would start by determining the first element of that column. From the relationship above, there are choose(4, 2)=6 columns starting with 1, choose(3, 2)=3 columns starting with 2, and choose(2, 2)=1 column starting with 3. Columns 7-9 therefore start with 2, so column 8 starts with 2.
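The same counting can be written out in R. This is just a trace of the arithmetic for column 8 of combn(1:5, 3); the names group.size, end.pos and first.elt are chosen for the illustration:

```r
# Sizes of the groups of columns starting with 1, 2 and 3
group.size <- choose(4:2, 2)                  # 6 3 1
# Index of the last column in each group
end.pos <- cumsum(group.size)                 # 6 9 10
# Column 8 lies in the first group whose last column index is at least 8
first.elt <- (1:5)[which.max(end.pos >= 8)]   # 2
```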
To determine the second and subsequent elements of the column, we repeat the process with a smaller set of elements (since 2 is our first element, we now select from elements 3-5), a new position (we want column number 8-6=2 among those beginning with 2), and a new number of elements to select (we need 3-1=2 more).
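Read literally, this description is recursive, and it can be sketched that way. The function getcombn_rec is a hypothetical name introduced only for this illustration, and it assumes pos is a valid column index between 1 and choose(length(x), m):

```r
# Hypothetical recursive sketch of the procedure described above.
# x: candidate elements, m: number to choose, pos: column index in combn(x, m)
getcombn_rec <- function(x, m, pos) {
  n <- length(x)
  # Base case: with one element left to pick, it is simply the pos-th one
  if (m == 1) return(x[pos])
  # Group k holds the columns starting with x[k]; it has choose(n - k, m - 1) columns
  end.pos <- cumsum(choose((n - 1):(m - 1), m - 1))
  k <- which.max(end.pos >= pos)   # first group whose last column reaches pos
  offset <- c(0, end.pos)[k]       # number of columns in the groups before group k
  # Recurse on the elements after x[k], with a re-based position and one fewer pick
  c(x[k], getcombn_rec(x[(k + 1):n], m - 1, pos - offset))
}

getcombn_rec(1:5, 3, 8)  # 2 3 5
```

The recursion copies a shrinking slice of x at every level, so a loop that only tracks an index into x is likely the better choice for long inputs.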
getcombn below is an iterative formulation that does just this:
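getcombn <- function(x, m, pos) {
  combo <- rep(NA, m)
  start <- 1  # index in x of the first element still available
  for (i in seq_len(m-1)) {
    # Last column index of each group of columns sharing the same i-th element
    end.pos <- cumsum(choose((length(x)-start):(m-i), m-i))
    # First group whose last column index reaches pos
    selection <- which.max(end.pos >= pos)
    start <- start + selection
    combo[i] <- x[start - 1]
    # Re-index pos relative to the start of the selected group
    pos <- pos - c(0, end.pos)[selection]
  }
  # The last element is the pos-th of the remaining ones
  combo[m] <- x[start + pos - 1]
  combo
}
# Build the requested columns, one per entry of all.pos
chosencombn <- function(x, m, all.pos) {
  sapply(all.pos, function(pos) getcombn(x, m, pos))
}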
chosencombn(c("a", "b", "c", "d"), 2, c(1,4,6))
#      [,1] [,2] [,3]
# [1,] "a"  "b"  "c"
# [2,] "b"  "c"  "d"
chosencombn(c("a", "b", "c", "d"), 2, c(4,5))
#      [,1] [,2]
# [1,] "b"  "b"
# [2,] "c"  "d"
This enables you to compute particular columns in cases where enumerating all the combinations is not feasible (you would run out of memory). For instance, with 50 options, the number of ways to select 25 elements is a 15-digit number (about 1.26e14), so enumerating all combinations is not a realistic option. Still, you can compute specific indicated combinations:
chosencombn(1:50, 25, c(1, 1000000L, 1e14))
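As a rough check of scale, the sketch below estimates how large the full result of combn(1:50, 25) would be, assuming 4 bytes per integer entry, and confirms that positions of this size are still represented exactly as doubles. The name n.comb is made up for the illustration:

```r
# Number of columns that combn(1:50, 25) would have
n.comb <- choose(50, 25)
format(n.comb, scientific = FALSE, big.mark = ",")
# Approximate size in TiB of an integer matrix with 25 rows and n.comb columns
n.comb * 25 * 4 / 2^40
# Doubles represent integers exactly up to 2^53, so a position such as 1e14 is safe
n.comb < 2^53  # TRUE
```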