In my project, I have a data structure represented as follows:
type 'a Tree =
| Leaf of 'a
| Node of 'a Tree array
Due to the cost of traversing through large trees, I have to parallelize some following functions on this data structure:
- map
- fold
- exists (exist a node satisfying a predicate)
- reduce / transform (optional)
Because of nature of the problem I'm working on, number of branches at each node is varied and workloads at the leaf level are quite small. The first question is what options should I consider for parallel execution on tree. I'm trying to use functions from Array.Parallel
module on every node, however, because overheads of parallelism is too big, the parallel version is even slower than the sequential version. I may change array representation to List
or PSeq
if it is necessary.
The second question is how to control degree of parallelism on those functions. I'm thinking about controlling by depth of tree, number of branches at each node, workload complexity at the leaf level and number of leaves on tree, however, combining them together seems to be complex and unpredictable.