0
votes

I have some fact tables in my data model with millions of rows. Each table has its own refresh cycle. Now I have added the system date into every table within Power Query as a new column to show the last refresh. I am using this function:

DateTime.LocalNow() as datetime  

Since the tables contain millions of rows I am asking which data type is here the best? I could transform the DateTime to a decimal number and then changing with FORMAT in Power BI and a measure the decimal number back to DateTime, if necessary. How would you do that? Model Run Date as a decimal number or as DateTime? Is there any difference from a memory point of view?

1
Different datatype can have different compression in memory. You can check which column in your model need at most space. Beside, o you really need add this column to main dataset? Maybe you can create new table, contain only 1 row with info. sqlbi.com/tv/…msta42a
@msta42a: Thank you so much for your support! How would you build a new table, that contains for each table the information of the last refresh time? The refresh times are different between each table.joshua
I haven't tested it for your scenario, but you can reference one "query"/"Table" in another as a source. example. I have dataset name "Salary" and some column Date, Spend, Name. I can create a new blank Query where in advanced editor i can put: let Source = Salary[Date] #"Removed Duplicates" = List.Distinct(Source) in #"Removed Duplicates" That gives me a list of unique Dates from my Salary Table. In the next step you can add column with source table name. Repeat for all table, then append queriesmsta42a
@msta42a: Awesome! Thanks for that. And how can I add the Model Run data for each table into this new table? I mean, I still need this information in the original table, right? The new table is just the extract with the model run dates. How will this new table help?joshua

1 Answers

1
votes

Power BI uses columnar compression such that the memory size of a column is mostly dependent on the number of distinct values in that column. Because of that, any column with just a single value is going to be negligible in terms of memory and the data type isn't going to make any material difference.