1
votes

I need to join on a table to return the MAX value from that right-hand table. I have tried to mock it up using 'datatable' but have failed miserably :(. I'll try and describe with words.

T1 = datatable(ID:int, Properties:string, ConfigTime:datetime) [1,'a,b,c','2021-03-04 00:00:00']

T2 = datatable(ID:int, Properties:string, ConfigTime:datetime) [2,'a,b,c','2021-03-02 00:00:00', 3,'a,b','2021-03-01 00:00:00', 4,'c','2021-03-20 00:00:00']

I'm using this as an update policy on T2, which has a source of T1. So I want to select the rows from T1 and then join the rows from T2 that have the highest timestamp. My first attempt was below:

T1 | join kind=inner T2 on Id
   | summarize arg_max(ConfigTime1, Id, Properties, Properties1, ConfigTime) by Id
   | project Id, Properties, ConfigTime

In my actual update policy, I merge the properties from T1 and T2 then write to T2, but for simplicity, I've left that for now.

Currently, I'm not getting any output in my T2 from the update policy. Any guidance on another way I should be doing this would be appreciated. Thanks

2

2 Answers

0
votes

It seems that you want to push the arg_max calculation into the T2 side of the join, something like this:

T1 
| join kind=inner (
   T2 
   | summarize arg_max(ConfigTime1, Id, Properties, Properties1, ConfigTime) by Id
   | project Id, Properties, ConfigTime
) on Id

Note that to ensure acceptable performance you want to limit the timeframe for the arg_max search, so you should consider a time based filter before the arg_max.

0
votes

I think what you're looking for is a union

let T1 = datatable(ID:int, Properties:string, ConfigTime:datetime) [
    1,'a,b,c','2021-03-04 00:00:00'
];
let T2 = datatable(ID:int, Properties:string, ConfigTime:datetime) [
    2,'a,b,c','2021-03-02 00:00:00', 
    3,'a,b','2021-03-01 00:00:00', 
    4,'c','2021-03-20 00:00:00'
];

Here is an example using a variable with summarize max:

let Latest = toscalar(T2 | summarize max(ConfigTime));
T1
| union (T2 | where ConfigTime == Latest)

The result will keep the entries from T1 and only the latest entries from T2.

results output

If this doesn't reflect your expected results please show your expected output.