I have generator function that creates a Cartesian product of lists. The real application uses more complex objects, but they can be represented by strings:
import itertools
s1 = ['a', 'b']
s2 = ['c', 'd', 'e', 'f']
s3 = ['c', 'd', 'e', 'f']
s4 = ['g']
p = itertools.product(*[s1,s2,s3,s4])
names = [''.join(s) for s in p]
In this example, the result is 32 combinations of characters:
names
['accg', 'acdg', 'aceg', 'acfg', 'adcg', 'addg', 'adeg', 'adfg', 'aecg',
'aedg', 'aeeg', 'aefg', 'afcg', 'afdg', 'afeg', 'affg', 'bccg', 'bcdg',
'bceg', 'bcfg', 'bdcg', 'bddg', 'bdeg', 'bdfg', 'becg', 'bedg', 'beeg',
'befg', 'bfcg', 'bfdg', 'bfeg', 'bffg']
Now, let's say I have some constraints such that certain character combinations are illegal. For example, let's say that only strings that contain the regex '[ab].c' are allowed. ('a' or 'b' followed by any letter followed by 'c')
After applying these constraints, we are left with a reduced set of just 8 strings:
import re
r = re.compile('[ab].c')
filter(r.match, names)
['accg', 'adcg', 'aecg', 'afcg', 'bccg', 'bdcg', 'becg', 'bfcg']
In the real application the chains are longer, there could be thousands of combinations and applying the hundreds of constraints is fairly computationally intensive so I'm concerned about scalability.
Right now I'm going through every single combination and checking its validity. Does an algorithm/data structure exist that could speed up this process?
EDIT: Maybe this will clarify a little: In the real application I am assembling random 2D pictures of buildings from simple basic blocks (like pillars, roof segments, windows, etc.). The constraints limit what kind of blocks (and their rotations) can be grouped together, so the resulting random image looks realistic, and not a random jumble.
A given constraint can contain many combinations of patterns. But of all those combinations, many are not valid because a different constraint would prohibit some portion of it. So in my example, one constraint would contain the full Cartesian product of characters above. And a second constraint is the '[ab].c'; this second constraint reduces the number of valid combinations of the first constraint that I need to consider.
Because these constraints are difficult to create; I looking to visualize what all the combinations of blocks in each constraint look like, but only the valid combinations that pass all constraints. Hence my question. Thanks!
s3 = ['c']
work just as well? That is, instead of filtering the output, how about limiting the inputs? – Brent Washburnes3 = ['c']
but in the real application there are more complex constraints in which the inputs can't be eliminated a priori. For example, what if there was a constraint that only two consecutive identical letters are allowed (e.g. 'cc', 'dd', 'ee, 'ff') – jasonm76s2 = s3 = ['c']
, thens2 = s3 = ['d']
and so on. My point is that you can really speed things up by being smart with the inputs instead of filtering every possible combination, knowing that many of them are unused. If all else fails, use filters. – Brent Washburne