Some Background
I am working on a problem where I have sets stored in a hashmap with keys being the set name, i.e. Set1--> a,b,c,e,g .... Set2--> a,g,h,f ... Set3--> b,c,e ... etc.
The aim of the program is take a value from the user as a "threshold" i.e. 2, which is used a the minimum commonality between sets. if the threshold is met or exceeded, the program suggests a merge between the sets.
I have created a combination creator that will generate every possible combination between set names for comparison with order not considered i.e. (Set1, Set2,),(Set1,Set3),(Set2,Set3), (Set1,Set2,Set3).
These sets of combinations are then used to actually compare the sets. If the Threshold is met, this combination is store in a seperate list to output to the user as a possible merge. Before this is outputted, these is some logic to delete child combinations i.e. if (Set1,Set2,Set3) is a possible merge, then you can disregard, the other 3 child combinations as this super combination already covers it. we then output the suggested merges.
The Problem
When we reach a certain number of sets to compare i.e. above 17 let's say, we get an out of memory issue because there are millions of combinations being created. I would like your help on understanding alternative approaches or how we could improve this approach. It works but it's not efficient enough :(
Combination Creator
/**
* Iterates through the setsToBeCompared ArrayList and gets all the combinations
*
* @return - ArrayList with all the possible combinations
*/
public ArrayList<String> generateCombinations(ArrayList<String> setsToBeCompared) {
List<List<String>> temp = new ArrayList<>();
ArrayList<String> a = new ArrayList<>();
for (int i = 2; i <= 3; i++) {
temp = calculateCombinations(setsToBeCompared, i);
for (List<String> list : temp) {
a.add(list.toString());
}
}
return a;
}
/**
* Calculates all the combination given by the parameters
*
* @param values - the names of the sets to be compared
* @param size - where to start from
* @return - List of all possible calculated combinations
*/
private List<List<String>> calculateCombinations(List<String> values, int size) {
if (0 == size) {
return Collections.singletonList(Collections.<String>emptyList());
}
if (values.isEmpty()) {
return Collections.emptyList();
}
List<List<String>> combination = new LinkedList<List<String>>();
String actual = values.iterator().next();
List<String> subSet = new LinkedList<String>(values);
subSet.remove(actual);
List<List<String>> subSetCombination = calculateCombinations(subSet, size - 1);
for (List<String> set : subSetCombination) {
List<String> newSet = new LinkedList<String>(set);
newSet.add(0, actual);
combination.add(newSet);
}
combination.addAll(calculateCombinations(subSet, size));
return combination;
}
PowerSetIterablemight be the one you need). But note that the running time still grows exponentially. This will be a problem, maybe not with 17, but with 18 or 19 sets. Otherwise, you'll have to consider smarter approaches (and therefore, describe the problem setting in more detail) - Marco13