0
votes

I'm trying to find a way in Google Sheets to get unique combinations of a given set of strings with the following restrictions:

  • given data is set of strings
  • one input set per row with up to 17 "types" (1 string per column)
  • desired output: all combinations per row with 1, 2, 3, and 4 types
  • no repetition within one input set wanted, i.e. "a,b,c" also covers "a,c,b", "b,a,c", etc., but repetitions in different rows are needed
  • input strings might contain spaces
  • (optional:) output should be sorted alphanumerically (overall or within each group of 1, 2, 3, and 4 elements, respectively)

Example sheet with given data and desired output (created manually): https://docs.google.com/spreadsheets/d/1n-RBtIGJmjVWCSBENX8OrHuqZ4PCGxCnx-gJFhZxsfw/edit?usp=sharing

Given data example: Given data example

Desired output example (highlighting just for clarification): enter image description here

I've read similar questions and solutions like this one: https://sheetswithmaxmakhrov.wordpress.com/2017/12/22/generate-all-possible-combinations-for-columns-in-google-spreadsheets/

But they usually combine data from different columns instead from within a row. The solutions differ quite a lot and use different functions (JOIN, CONCATENATE, TRIM, SEQUENCE, REPT, etc.). This does make it quite hard to understand when reading through the different questions.

1
I'm not sure about maximum number of allowed columns: due to high number of possible combinations, the table might have to get transposed.Morphan
I don't think you would run out of columns - 17C4 is only 2380 for example, while this webapps.stackexchange.com/questions/30234/… says 18278 columns are allowed. Simpler methods of generating combinations use counting in binary, but as pointed out here en.wikipedia.org/wiki/Combination this gets very inefficient for increasing n. 2^n = 131072 for n=17 for example. This makes me think you might need a script.Tom Sharpe
@TomSharpe: Thank you for the clarification regarding max number of columns. Fortunately, I don‘t need 2^n combinations, because I only need combinations of up to 4 elements. So maximum number of combinations should be 17C4 + 17C3 + 17C2 + 17C1 = 3,213. That said, I would be fine with a script, but so far I‘ve experience with formulas only.Morphan
Final formula, based on Tom Sharpe's answer: =ARRAYFORMULA(substitute(transpose(sort(transpose(mid(trim(query(transpose(if(mid(filter(base(sequence(2^COUNTA(B3:3);1;0);2;COUNTA(B3:3));(len(substitute(base(sequence(2^COUNTA(B3:3);1;0);2;COUNTA(B3:3));"0";""))=4)+(len(substitute(base(sequence(2^COUNTA(B3:3);1;0);2;COUNTA(B3:3));"0";""))=3)+(len(substitute(base(sequence(2^COUNTA(B3:3);1;0);2;COUNTA(B3:3));"0";""))=2)+(len(substitute(base(sequence(2^COUNTA(B3:3);1;0);2;COUNTA(B3:3));"0";""))=1)); sequence(1;COUNTA(B3:3));1)="1";","&B3:3;));"select *";9^9));2;99))));" ,";", "))Morphan
I have updated the example sheet with the solution.Morphan

1 Answers

2
votes

Here is a proof of concept just for 5C4:

=ARRAYFORMULA(mid(trim(query(transpose(if(mid(filter(base(sequence(2^5,1,0),2,5),
len(substitute(base(sequence(2^5,1,0),2,5),"0",""))=4),
{1,2,3,4,5},1)="1",","&B4:F4,)),"select *",9^9)),2,99))

enter image description here

Here are the steps in the formula working outwards:

enter image description here

It will be interesting to see if this scales up for 17C4.

Here is the corresponding formula:

=ARRAYFORMULA(mid(trim(query(transpose(if(mid(filter(base(sequence(2^17,1,0),2,17),
len(substitute(base(sequence(2^17,1,0),2,17),"0",""))=4),
sequence(1,17),1)="1",","&A1:Q1,)),"select *",9^9)),2,99))

It seems to work OK if you insert 2380 columns, so just needs generalising so that you have a variable instead of 17. This would have to be repeated for selection size 1, 2 and 3 in each case.