Generating all Possible Combinations

68

votes

Given 2 arrays Array1 = {a,b,c...n} and Array2 = {10,20,15....x}, how can I generate all possible combinations as strings a(i) b(j) c(k) n(p), where

1 <= i <= 10,  1 <= j <= 20 , 1 <= k <= 15,  .... 1 <= p <= x

Such as:

a1 b1 c1 .... n1  
a1 b1 c1..... n2  
......  
......  
a10 b20 c15 nx (last combination)

So the total number of combinations = product of the elements of Array2 = (10 × 20 × 15 × ... × x)

This is similar to a Cartesian product, in which the second array defines the upper limit for each element in the first array.

Example with fixed numbers,

    Array x =  [a,b,c]
    Array y =  [3,2,4]

So we will have 3*2*4 = 24 combinations. Results should be:

    a1 b1 c1  
    a1 b1 c2  
    a1 b1 c3  
    a1 b1 c4  

    a1 b2 c1  
    a1 b2 c2  
    a1 b2 c3  
    a1 b2 c4


    a2 b1 c1  
    a2 b1 c2  
    a2 b1 c3  
    a2 b1 c4  

    a2 b2 c1  
    a2 b2 c2  
    a2 b2 c3  
    a2 b2 c4


    a3 b1 c1  
    a3 b1 c2  
    a3 b1 c3  
    a3 b1 c4  

    a3 b2 c1  
    a3 b2 c2  
    a3 b2 c3  
    a3 b2 c4 (last)

c# combinatorics cartesian-product

Can you give a better example, using fewer elements, and produce the full results? For instance, one question I have is whether each element of the first array should be paired only with the corresponding element of the second array, or if you want to combine it with all elements of the second array. - Lasse V. Karlsen

Probably the size of the arrays are same. - Gulshan

yes 2 arrays are of same size.. - Amitd

Eric blogged about this just for you :) blogs.msdn.com/b/ericlippert/archive/2010/06/28/… - Galilyou

22

votes

using System;
using System.Text;

public static string[] GenerateCombinations(string[] Array1, int[] Array2)
{
    if(Array1 == null) throw new ArgumentNullException("Array1");
    if(Array2 == null) throw new ArgumentNullException("Array2");
    if(Array1.Length != Array2.Length)
        throw new ArgumentException("Must be the same size as Array1.", "Array2");

    if(Array1.Length == 0)
        return new string[0];

    int outputSize = 1;
    var current = new int[Array1.Length];
    for(int i = 0; i < current.Length; ++i)
    {
        if(Array2[i] < 1)
            throw new ArgumentException("Contains invalid values.", "Array2");
        if(Array1[i] == null)
            throw new ArgumentException("Contains null values.", "Array1");
        outputSize *= Array2[i];
        current[i] = 1;
    }

    var result = new string[outputSize];
    for(int i = 0; i < outputSize; ++i)
    {
        var sb = new StringBuilder();
        for(int j = 0; j < current.Length; ++j)
        {
            sb.Append(Array1[j]);
            sb.Append(current[j].ToString());
            if(j != current.Length - 1)
                sb.Append(' ');
        }
        result[i] = sb.ToString();
        int incrementIndex = current.Length - 1;
        while(incrementIndex >= 0 && current[incrementIndex] == Array2[incrementIndex])
        {
                current[incrementIndex] = 1;
                --incrementIndex;
        }
        if(incrementIndex >= 0)
            ++current[incrementIndex];
    }
    return result;
}

159

votes

Sure thing. It is a bit tricky to do this with LINQ but certainly possible using only the standard query operators.

UPDATE: This is the subject of my blog on Monday June 28th 2010; thanks for the great question. Also, a commenter on my blog noted that there is an even more elegant query than the one I gave. I'll update the code here to use it.

The tricky part is to make the Cartesian product of arbitrarily many sequences. "Zipping" in the letters is trivial compared to that. You should study this to make sure that you understand how it works. Each part is simple enough but the way they are combined together takes some getting used to:

static IEnumerable<IEnumerable<T>> CartesianProduct<T>(this IEnumerable<IEnumerable<T>> sequences)
{
    IEnumerable<IEnumerable<T>> emptyProduct = new[] { Enumerable.Empty<T>()};
    return sequences.Aggregate(
        emptyProduct,
        (accumulator, sequence) => 
            from accseq in accumulator 
            from item in sequence 
            select accseq.Concat(new[] {item})                          
        );
 }

To explain how this works, first understand what the "accumulate" operation is doing. The simplest accumulate operation is "add everything in this sequence together". The way you do that is: start with zero. For each item in the sequence, the current value of the accumulator is equal to the sum of the item and previous value of the accumulator. We're doing the same thing, except that instead of accumulating the sum based on the sum so far and the current item, we're accumulating the Cartesian product as we go.
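As a rough illustration of the "accumulate" idea, here is a minimal sketch in Python rather than C#. It assumes `functools.reduce` as a stand-in for LINQ's `Aggregate`; the variable names are illustrative only.

```python
from functools import reduce

numbers = [1, 2, 3, 4]

# Start with zero; each step combines the accumulator so far with the next item.
total = reduce(lambda accumulator, item: accumulator + item, numbers, 0)

print(total)
```

The third argument to `reduce` plays the role of the seed ("start with zero"), just as `emptyProduct` does in the query above.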

The way we're going to do that is to take advantage of the fact that we already have an operator in LINQ that computes the Cartesian product of two things:

from x in xs
from y in ys
do something with each possible (x, y)

By repeatedly taking the Cartesian product of the accumulator with the next item in the input sequence and doing a little pasting together of the results, we can generate the Cartesian product as we go.
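The two-sequence building block can be sketched in Python as a nested comprehension. This is only an analogy for the two `from` clauses, with made-up sample data:

```python
xs = [1, 2]
ys = ["a", "b", "c"]

# Two nested "from" clauses correspond to two nested loops:
# every x is paired with every y.
pairs = [(x, y) for x in xs for y in ys]

print(pairs)
```

The result has `len(xs) * len(ys)` entries, with the second sequence varying fastest.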

So think about the value of the accumulator. For illustrative purposes I'm going to show the value of the accumulator as the results of the sequence operators it contains. That is not what the accumulator actually contains. What the accumulator actually contains is the operators that produce these results. The whole operation here just builds up a massive tree of sequence operators, the result of which is the Cartesian product. But the final Cartesian product itself is not actually computed until the query is executed. For illustrative purposes I'll show what the results are at each stage of the way but remember, this actually contains the operators that produce those results.
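The "operators, not results" point can be sketched with a Python generator, which is deferred in a broadly similar way to a LINQ query. The `log` list and the `numbers` function are hypothetical helpers that exist only to make the timing visible.

```python
log = []

def numbers():
    # Record each value at the moment it is actually produced.
    for n in (1, 2, 3):
        log.append(n)
        yield n

# Building the pipeline runs none of the generator body yet.
doubled = (n * 2 for n in numbers())
built_without_running = (log == [])

# Iterating is what finally executes the chain of operations.
results = list(doubled)

print(built_without_running, results, log)
```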

Suppose we are taking the Cartesian product of the sequence of sequences {{1, 2}, {3, 4}, {5, 6}}. The accumulator starts off as a sequence containing one empty sequence: { { } }

On the first accumulation, accumulator is { { } } and sequence is {1, 2}. We do this:

from accseq in accumulator
from item in sequence 
select accseq.Concat(new[] {item})

So we are taking the Cartesian product of { { } } with {1, 2}, and for each pair, we concatenate: We have the pair ({ }, 1), so we concatenate { } and {1} to get {1}. We have the pair ({ }, 2), so we concatenate { } and {2} to get {2}. Therefore we have {{1}, {2}} as the result.

So on the second accumulation, accumulator is {{1}, {2}} and sequence is {3, 4}. Again, we compute the Cartesian product of these two sequences to get:

 {({1}, 3), ({1}, 4), ({2}, 3), ({2}, 4)}

and then from those items, concatenate the second one onto the first. So the result is the sequence {{1, 3}, {1, 4}, {2, 3}, {2, 4}}, which is what we want.

Now we accumulate again. We take the Cartesian product of the accumulator with {5, 6} to get

 {({ 1, 3}, 5), ({1, 3}, 6), ({1, 4}, 5), ...

and then concatenate the second item onto the first to get:

{{1, 3, 5}, {1, 3, 6}, {1, 4, 5}, {1, 4, 6} ... }

and we're done. We've accumulated the Cartesian product.
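The walkthrough above can be reproduced with a minimal Python sketch of the same fold. It assumes `functools.reduce` in place of `Aggregate`, and it builds eager lists, so unlike the LINQ version each stage really is computed immediately. The `stages` list is an illustrative extra that shows the accumulator after 0, 1, 2 and 3 input sequences.

```python
from functools import reduce

def cartesian_product(sequences):
    # Seed: a sequence containing one empty sequence, { { } }.
    empty_product = [[]]
    return reduce(
        lambda accumulator, sequence: [
            accseq + [item] for accseq in accumulator for item in sequence
        ],
        sequences,
        empty_product,
    )

inputs = [[1, 2], [3, 4], [5, 6]]

# Run the fold over the first n sequences to see each intermediate accumulator.
stages = [cartesian_product(inputs[:n]) for n in range(4)]
for stage in stages:
    print(stage)
```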

Now that we have a utility function that can take the Cartesian product of arbitrarily many sequences, the rest is easy by comparison:

var arr1 = new[] {"a", "b", "c"};
var arr2 = new[] { 3, 2, 4 };
var result = from cpLine in CartesianProduct(
                 from count in arr2 select Enumerable.Range(1, count)) 
             select cpLine.Zip(arr1, (x1, x2) => x2 + x1);

And now we have a sequence of sequences of strings, one sequence of strings per line:

foreach (var line in result)
{
    foreach (var s in line)
        Console.Write(s);
    Console.WriteLine();
}

Easy peasy!

13

votes

Alternative solution:

Step one: read my series of articles on how to generate all strings which match a context sensitive grammar:

http://blogs.msdn.com/b/ericlippert/archive/tags/grammars/

Step two: define a grammar that generates the language you want. For example, you could define the grammar:

S: a A b B c C
A: 1 | 2 | 3
B: 1 | 2
C: 1 | 2 | 3 | 4

Clearly you can easily generate that grammar definition string from your two arrays. Then feed that into the code which generates all strings in a given grammar, and you're done; you'll get all the possibilities. (Not necessarily in the order you want them in, mind you.)
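To make the idea concrete, here is a small Python sketch. It is not the code from the linked articles: `build_grammar` and `expand` are hypothetical helpers, the grammar is stored as a dictionary rather than a definition string, and the nonterminal names `<0>`, `<1>`, ... are invented so they cannot clash with single-letter terminals (a terminal named `S` would still clash with the start symbol).

```python
def build_grammar(letters, limits):
    # S expands to: letter0 <0> letter1 <1> ...
    start = []
    grammar = {"S": [start]}
    for index, (letter, limit) in enumerate(zip(letters, limits)):
        nonterminal = "<%d>" % index
        start.extend([letter, nonterminal])
        # One alternative per allowed number, e.g. <0>: 1 | 2 | 3
        grammar[nonterminal] = [[str(n)] for n in range(1, limit + 1)]
    return grammar

def expand(grammar, symbols):
    """Yield every string derivable from a list of symbols."""
    if not symbols:
        yield ""
        return
    first, rest = symbols[0], symbols[1:]
    if first in grammar:
        # Nonterminal: try each alternative in place of the symbol.
        for alternative in grammar[first]:
            yield from expand(grammar, alternative + rest)
    else:
        # Terminal: emit it, then expand whatever follows.
        for tail in expand(grammar, rest):
            yield first + tail

strings = list(expand(build_grammar(["a", "b", "c"], [3, 2, 4]), ["S"]))
print(len(strings), strings[0], strings[-1])
```

A general grammar enumerator makes no promise about output order; this particular sketch happens to vary the last number fastest.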

3

votes

For comparison, here is a way to do it with Python

from itertools import product
X = ["a", "b", "c"]
Y = [3, 2, 4]
terms = (["%s%s" % (x, i + 1) for i in range(y)] for x, y in zip(X, Y))
for item in product(*terms):
    print(" ".join(item))

3

votes

For another solution that is not LINQ-based, you can use:

public class CartesianProduct<T>
    {
        int[] lengths;
        T[][] arrays;
        public CartesianProduct(params  T[][] arrays)
        {
            lengths = arrays.Select(k => k.Length).ToArray();
            if (lengths.Any(l => l == 0))
                throw new ArgumentException("Zero length array unhandled.");
            this.arrays = arrays;
        }
        public IEnumerable<T[]> Get()
        {
            int[] walk = new int[arrays.Length];
            int x = 0;
            yield return walk.Select(k => arrays[x++][k]).ToArray();
            while (Next(walk))
            {
                x = 0;
                yield return walk.Select(k => arrays[x++][k]).ToArray();
            }

        }
        private bool Next(int[] walk)
        {
            int whoIncrement = 0;
            while (whoIncrement < walk.Length)
            {
                if (walk[whoIncrement] < lengths[whoIncrement] - 1)
                {
                    walk[whoIncrement]++;
                    return true;
                }
                else
                {
                    walk[whoIncrement] = 0;
                    whoIncrement++;
                }
            }
            return false;
        }
    }

You can find an example on how to use it here.

2

votes

I'm not willing to give you the complete source code, so here's the idea behind it.

You can generate the elements the following way:

I assume A=(a1, a2, ..., an) and B=(b1, b2, ..., bn) (so A and B each hold n elements).

Then do it recursively! Write a method that takes an A and a B and does your stuff:

If A and B each contain just one element (called an resp. bn), just iterate from 1 to bn and concatenate an to your iterating variable.

If A and B each contain more than one element, grab the first elements (a1 resp. b1), iterate from 1 to b1 and do for each iteration step:

call the method recursively with the subfields of A and B starting at the second element, i.e. A'=(a2, a3, ..., an) resp B'=(b2, b3, ..., bn). For every element generated by the recursive call, concatenate a1, the iterating variable and the generated element from the recursive call.
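The recursion described above might look like the following Python sketch. The function name `combinations` is illustrative, and both inputs are assumed to be non-empty and of equal length. One small deviation from the description: the recursive call is made once and its result reused for every value of the iterating variable, which produces the same output.

```python
def combinations(letters, limits):
    # Base case: one letter left, so just count from 1 up to its limit.
    if len(letters) == 1:
        return ["%s%d" % (letters[0], i) for i in range(1, limits[0] + 1)]
    # Recursive case: solve the problem for everything after the first element...
    rest = combinations(letters[1:], limits[1:])
    # ...then prefix each of those results with a1 and every value 1..b1.
    return [
        "%s%d %s" % (letters[0], i, tail)
        for i in range(1, limits[0] + 1)
        for tail in rest
    ]

result = combinations(["a", "b", "c"], [3, 2, 4])
print(len(result), result[0], result[-1])
```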

Here you can find an analogous example of how to generate things in C#; you "just" have to adapt it to your needs.

1

votes

If I am getting it right, you are after something like a Cartesian product. If this is the case, here is how you can do this using LINQ. It might not be the exact answer, but try to get the idea:

    char[] Array1 = { 'a', 'b', 'c' };
    string[] Array2 = { "10", "20", "15" };

    var result = from i in Array1
                 from j in Array2
                   select i + j;

These articles might help:

SelectMany
How to Use LINQ SelectMany

1

votes

The finalResult is the desired array. This assumes both arrays are of the same size.

char[] Array1 = { 'a', 'b', 'c' };
int[] Array2 = { 3, 2, 4 };

var finalResult = new List<string>();
finalResult.Add(String.Empty);
for(int i=0; i<Array1.Length; i++)
{
    var tmp = from a in finalResult
              from b in Enumerable.Range(1,Array2[i])
              select String.Format("{0} {1}{2}",a,Array1[i],b).Trim();
    finalResult = tmp.ToList();
}

I think this will suffice.

1

votes

For another solution that is not LINQ-based and is more efficient:

static IEnumerable<T[]> CartesianProduct<T>(T[][] arrays) {
    int[] lengths;
    lengths = arrays.Select(a => a.Length).ToArray();
    int Len = arrays.Length;
    int[] inds = new int[Len];
    int Len1 = Len - 1;
    while (inds[0] != lengths[0]) {
        T[] res = new T[Len];
        for (int i = 0; i != Len; i++) {
            res[i] = arrays[i][inds[i]];
        }
        yield return res;
        int j = Len1;
        inds[j]++;
        while (j > 0 && inds[j] == lengths[j]) {
            inds[j--] = 0;
            inds[j]++;
        }
    }
}

1

votes

Using Enumerable.Append, which was added in .NET Framework 4.7.1, @EricLippert's answer can be implemented without allocating a new array at each iteration:

public static IEnumerable<IEnumerable<T>> CartesianProduct<T>
    (this IEnumerable<IEnumerable<T>> enumerables)
{
    IEnumerable<IEnumerable<T>> Seed() { yield return Enumerable.Empty<T>(); }

    return enumerables.Aggregate(Seed(), (accumulator, enumerable)
        => accumulator.SelectMany(x => enumerable.Select(x.Append)));
}

0

votes

Here's a JavaScript version, which I'm sure someone can convert. It has been tested thoroughly.

Here's the fiddle.

function combinations (Asource){

    var combos = [];
    var temp = [];

    var picker = function (arr, temp_string, collect) {
        if (temp_string.length) {
           collect.push(temp_string);
        }

        for (var i=0; i<arr.length; i++) {
            var arrcopy = arr.slice(0, arr.length);
            var elem = arrcopy.splice(i, 1);

            if (arrcopy.length > 0) {
                picker(arrcopy, temp_string.concat(elem), collect);
            } else {
                collect.push(temp_string.concat(elem));
            }   
        }   
    }

    picker(Asource, temp, combos);

    return combos;

}

var todo = ["a", "b", "c", "d"]; // 4 elements in this set
var resultingCombos = combinations (todo);
console.log(resultingCombos);

0

votes

How about that?

#include <algorithm>
#include <utility>
#include <vector>

template<typename T>
std::vector<std::vector<T>> combinationOfVector(std::vector<T> vector,int nOfElem)
{
   std::vector<int> boleamRepresentationOfCombinations(vector.size(),0);
   for(auto it = boleamRepresentationOfCombinations.end()-nOfElem;it!=boleamRepresentationOfCombinations.end();it++)
   {
       *it=1;
   }
//   std::vector<std::vector<T>> toReturn(static_cast<int>(boost::math::binomial_coefficient<double>(vector.size()-1, nOfElem)));
//doenst work properly :c
    std::vector<std::vector<T>> toReturn;

   if(!std::is_sorted(boleamRepresentationOfCombinations.begin(),boleamRepresentationOfCombinations.end()))
       std::sort(boleamRepresentationOfCombinations.begin(),boleamRepresentationOfCombinations.end());
   do{
       std::vector<T> combination;
       combination.reserve(nOfElem);
       for(int i=0;i<boleamRepresentationOfCombinations.size();i++)
       {
            if(boleamRepresentationOfCombinations[i])
                combination.push_back(vector[i]);
       }
       toReturn.push_back(std::move(combination));
   }while(std::next_permutation(boleamRepresentationOfCombinations.begin(),boleamRepresentationOfCombinations.end()));
   return toReturn;
}

0

votes

If anyone is interested in an industrial-grade, tested, and supported implementation of the Cartesian product algorithm, you are welcome to use the ready-to-use Gapotchenko.FX.Math.Combinatorics NuGet package.

It provides two modes of operation. A fluent mode which is LINQ-based:

using Gapotchenko.FX.Math.Combinatorics;
using System;

foreach (var i in new[] { "1", "2" }.CrossJoin(new[] { "A", "B", "C" }))
    Console.WriteLine(string.Join(" ", i));

And an explicit mode, which is more verbose:

using Gapotchenko.FX.Math.Combinatorics;
using System;

var seq1 = new[] { "1", "2" };
var seq2 = new[] { "A", "B", "C" };

foreach (var i in CartesianProduct.Of(seq1, seq2))
    Console.WriteLine(string.Join(" ", i));

Both modes produce the same result:

1 A
2 A
1 B
2 B
1 C
2 C

But it goes further than this. For example, a projection to ValueTuple results is a simple one-liner:

var results = new[] { 1, 2 }.CrossJoin(new[] { "A", "B" }, ValueTuple.Create);

foreach (var (a, b) in results)
  Console.WriteLine("{0} {1}", a, b);

The uniqueness of the results can be achieved in a natural way:

var results = new[] { 1, 1, 2 }.CrossJoin(new[] { "A", "B", "A" }).Distinct();

At first sight, such an approach would generate many wasted combinations. So instead of doing

new[] { 1, 1, 2 }.CrossJoin(new[] { "A", "B", "A" }).Distinct()

it might be more beneficial to Distinct() the sequences before performing the expensive multiplication:

new[] { 1, 1, 2 }.Distinct().CrossJoin(new[] { "A", "B", "A" }.Distinct())

The package provides an automatic plan builder that optimizes away such idiosyncrasies. As a result, both approaches have identical computational complexity.
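The equivalence of the two forms can be checked independently of the package. The Python sketch below uses only `itertools.product` and a hypothetical `distinct` helper. It does not use the package, and it says nothing about how the package's plan builder works internally.

```python
from itertools import product

left = [1, 1, 2]
right = ["A", "B", "A"]

def distinct(items):
    # dict.fromkeys keeps the first occurrence of each item, in order.
    return list(dict.fromkeys(items))

# Deduplicate after the product: 9 pairs are generated, then filtered.
after = distinct(product(left, right))

# Deduplicate before the product: only 4 pairs are ever generated.
before = list(product(distinct(left), distinct(right)))

print(after == before, len(before))
```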

The corresponding source code of the package is a bit larger than a snippet can contain, but is available at GitHub.
