
What is the best way to vertically combine two large tables with the same structure? Each table has about 2 million rows. Is there any performance advantage to doing it in M rather than in DAX?

M approach

BigTable_M  = Table.Combine( {Table1, Table2} )

DAX approach

BigTable_DAX = UNION ( 'Table1', 'Table2' )

I have a feeling that the M approach loads the tables twice: once for each primary source (Table1 and Table2), and then again while loading the rows into BigTable_M. Is there any reason to suffer this double load for better performance later?

According to this article, it seems that M is faster:

https://www.sqlbi.com/articles/comparing-dax-calculated-columns-with-power-query-computed-columns/


2 Answers

2 votes

Doing it in the Query Editor, you can choose to load only the combined table into your data model, while Table1 and Table2 exist merely as staging queries. That should address your concern about loading the tables twice.

I'd expect combining them in M to result in better compression (though the difference might not be significant). Combining in M would also allow for query folding in some situations.
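A minimal sketch of the staging approach in M, assuming Table1 and Table2 are existing queries whose "Enable load" option has been unchecked so they never reach the data model on their own:

```m
// Table1 and Table2: right-click each query in the Queries pane
// and uncheck "Enable load" so they act as staging queries only.

// BigTable_M: the only query loaded into the data model.
let
    // Appends the rows of Table2 beneath Table1;
    // both queries must share the same column structure
    Source = Table.Combine({Table1, Table2})
in
    Source
```

With this setup the sources are read once per refresh to feed BigTable_M, and only the combined result is compressed and stored in the model.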

4 votes

Best practice would be to do it in M/Power Query first, before loading the data into the data model. You always want the data model to be quick and responsive, with as little calculation overhead as possible. I always recommend working at the lowest level possible: if you can do it in the source, do it there; if you can't, do it in Power Query; and only as a last resort do it in the DAX/Power Pivot part. This works well when you are working with a database, as you let the technology designed for the heavy lifting/shifting of data do that work, rather than doing it all in Power BI.
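As a sketch of the "do it in the source" option, the union can be pushed down to the database with a native query (the server, database, and table names here are hypothetical):

```m
// Let the database perform the UNION ALL instead of Power BI.
let
    Source = Sql.Database("myServer", "myDatabase"),
    // Value.NativeQuery sends the SQL statement to the source,
    // so only the combined result set is transferred and loaded
    BigTable = Value.NativeQuery(
        Source,
        "SELECT * FROM Table1 UNION ALL SELECT * FROM Table2"
    )
in
    BigTable
```

Note that hand-written native queries can prevent folding of subsequent transformation steps, so this works best when the combine is the bulk of the work.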

If you are working with files, then it would be best to do it in the Power Query part where possible, and again let the Power Pivot engine be as quick as possible.

When consulting on clients' data models, both in Power BI and Analysis Services, most of the trouble comes from doing things in the data model rather than before the data reaches it: for example, data type transformations, string replacement, iterative calculations, ranking, etc., which are best done long before they hit the model.