Could you help me with the job's performance? I ran it with 10 AUs. During the first part of the run almost all of them are used, but in the second half of the execution time only 1 AU is used. In the plan I see a supervertex that consists of only one vertex, so it looks like the execution plan is underestimated (this is just an assumption).
I'm trying to analyze the execution time, but it is difficult without a technical description of operations such as HashCombine, HashCross, ...
So my question is: can I do something about it (modify the code, add hints, etc.)?
The problem was fixed with Michael Rys's solution.
I applied Michael Rys's solution and it works perfectly. Thank you, as always! See the picture below. Almost all of the 10 AUs are used now. I also played with the modeling tool, and it looks like the script scales nearly linearly. Awesome :).
One more solution
I can also replace the inner joins with left joins (the replacement is equivalent in my case because the dimension tables ALWAYS contain exactly one row for any record in the fact table, dim 1:M fact). The CBO then estimates the join result's cardinality as "at least as large as the fact table". In that case the CBO generates a good plan without hints.
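To illustrate why the two joins are interchangeable here, below is a minimal Python sketch (not U-SQL, and not taken from my actual script). The tables `dim` and `fact` and the helpers `inner_join` and `left_join` are hypothetical toy stand-ins. It only demonstrates the logical equivalence when every fact row has exactly one matching dimension row; it says nothing about how the optimizer estimates cardinality.

```python
# Hypothetical toy data: every fact row references exactly one dimension row.
dim = {1: "red", 2: "green"}            # dimension table, unique key -> attribute
fact = [(100, 1), (101, 2), (102, 1)]   # fact table rows: (fact_id, dim_id)


def inner_join(fact_rows, dim_rows):
    """Keep only the fact rows whose dim_id exists in the dimension table."""
    return [(fid, did, dim_rows[did]) for fid, did in fact_rows if did in dim_rows]


def left_join(fact_rows, dim_rows):
    """Keep every fact row; the attribute is None when there is no match."""
    return [(fid, did, dim_rows.get(did)) for fid, did in fact_rows]


inner = inner_join(fact, dim)
left = left_join(fact, dim)

# With a guaranteed match for every fact row, both joins return the same rows,
# and the left join never returns fewer rows than the fact table.
print(inner == left)
print(len(left) == len(fact))
```

If a fact row pointed at a missing dimension key, the inner join would drop it while the left join would keep it with an empty attribute, so the replacement is only safe when the 1:M relationship really holds for all rows.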