4
votes

I have following problem e.g: Given a bucket with symbols
1 1 2 3 3 4
And book of recipes to create pairs e.g:
12 13 24
Select from bucket optimal pairing, leaving as little as possible symbols in the bucket. So using the examplary values above the optimal pairing would be:
13 13 24 Which would use all the symbols given.
Naive picking from the bucket could result in something like:
12 13 Leaving the 3 and 4 unmatched. 3 and 4 cannot be matched because the book does not contain a recipe for that particular connection

Notes:
Real problem consits on average of: 500 elements in bucket in about 30 kind of symbols.

We've tried to implement the solution using the bruteforce algorithm, however I am afraid that even our grandchildren will not live long enough to see the result :).

There is no limit to the size of recipe book, it could even have every possible in the bucket. Pair made of the same element twice is not allowed.

The answer is not required to empty the bucket completely. Its just about getting the most pairs out of the bucket. Its okay to leave some in the bucket. It would be best to look for the optimal solution, however close approximation is also good enough.

I will appreciate an answer that proposes/gives hint to an algorithm to solve the problem.

Examples:
Bucket:
1 1 2 2 2 2 3 3 3 4 5 6 7 8 8
Recipe book:
12 34 15 68
Optimal result (one of possible):
{1 2} {1 2} {3 4} {6 8}
Leftover:
2 2 3 3 5 7 8

2
In the 12 13 example, what is wrong with pairing 34 ?Tim Biegeleisen
I've updated the answer. 34 connection is not available in the recipe book. And we can only connect according to pre-given recipe list.Aleksander Fular
I believe you can go quite far using linear algebra techniques, but I'm not fully versed with the details. You should start by "thinking about the issue" in terms of "how many of each pairings should I use for each symbol?". So, in the example above, the 1 symbol can be expressed as x+y=2 where 2 is the count of 1 symbols, and x & y are the number of 12 & 13 respectively. Next you'll have x+z=1, y=2, z=1 and you could obviously solve x=0 from that to get your required result. Linear algebra will come in when no solution exists and you need an approximation.Amit
@Amit I am having troubles understanding your explanation. Could you please post an examplary solution to my 2nd example in the question?Aleksander Fular
In terms of expressing your 2nd example as a set of equations, let's denote a: 12, b: 34, c: 15, d: 68, and the equations are: a+c=2 a=4 b=3 b=1 c=1 d=1 (nothing)=1 d=2. Now, obviously there's no solution to this set of equations so you need an approximation, but you can see how your answer fits as well as 1 of each pair (a=b=c=d=1).Amit

2 Answers

4
votes

This problem is essentially the maximum matching problem with the small twist that you're allowed to have duplicate objects. Here's one way to solve this problem assuming you have a solver for maximum matching:

  1. Create a node for each number in the input list.

  2. For each recipe, for each pair of numbers matching that recipe, add an edge between the nodes for those numbers.

  3. Run a maximum matching algorithm and return the pairs reported that way.

There are a good number of off-the-shelf maximum matching algorithms you can use, and if you need to code one up yourself, consider Edmonds' Blossom Algorithm, which is reasonably efficient and less tricky to code up than other approaches.

1
votes

First generate all possibles pairs of symbols and store them with the indices of each symbol , so if you have n symbols , then n*(n+1)/2 pairs are going to be generated (max case n=500 then 125250 pairs are going to be generated ).

Ex : bucket with symbols 1 1 3 Then pairs are going to be generated are (11,1,2)(13,1,3)(13,2,3). General format ( a[i]a[j], i, j ).

Now lets loop over generated pairs and delete pairs that doesn't exist in the book of recipes, so now we have at most 30 pairs .

Next lets build a graph such that the nodes are our generated pairs, and each 2 nodes are connected if the indices of the 2 pairs are different (using 2 nested loops over our pairs ) .

Finally we can perform BFS or DFS and find the longest graph between all generated graphs , which has the answer to our problem.

If you want c++/Java implementation ,please don't hesitate to ask.