0
votes

I'm trying to count each ocurrence of "name" by "headsection" and "day"

Let's say I have the following table structure (a small snippet):

Timestamp Headsection Name
01/01/2021 1 A
01/01/2021 2 AA
01/01/2021 3 AAA
01/01/2021 1 B
01/01/2021 2 BB
01/01/2021 3 BBB
01/01/2021 1 C
01/01/2021 2 CC
01/01/2021 3 CCC
01/01/2021 1 D
01/01/2021 2 CC
01/01/2021 3 DDD
01/01/2021 1 E
01/01/2021 2 DD
01/01/2021 3 EEE
01/01/2021 1 A
01/01/2021 2 EE
01/01/2021 3 DDD

In essence, I want to count the top 5 names by each day by headsection

So with 3 headsections - Each day in a year should contain 15 rows

For visualising, I want to summarize the table as such:

Timestamp Headsection Name Name_count
01/01/2021 1 A 2
01/01/2021 1 B 1
01/01/2021 1 C 1
01/01/2021 1 D 1
01/01/2021 1 E 1
01/01/2021 2 CC 2
01/01/2021 2 AA 1
01/01/2021 2 BB 1
01/01/2021 2 DD 1
01/01/2021 2 EE 1
01/01/2021 3 DDD 2
01/01/2021 3 AAA 1
01/01/2021 3 BBB 1
01/01/2021 3 CCC 1
01/01/2021 3 EEE 1

I've set the query to

|where timestamp between (startofday(datetime(2021-01-01)) .. endofday(now()))

Which means that the query should be able to turn an input table to the output table for each day up until now.

In example, the following 15 rows should be 01/02/2021 (January 2nd), with top 5 "names" that day by headsection.

I'm almost new to KQL, so I could really need some help!

I've tried experimented with top-nested and the summarize operator, but I can't seem to make it work.

2
I don't understand how the input you provided leads to the output you expect. Please explain (or if the output doesn't match the input, please update them). Also, please don't use the same "string" everywhere, as it makes it harder to understand your query's logic. Once done, please ping me by adding a comment here.Slavik N
Hey! Thanks for looking at my question! I've updated the table and added some explanation - Hope it suffices! If it doesn't, then comment here, and I'll do my best to explainThreeFiberBars
Thanks for updating the question, it's much clearer now. I've posted an answer. Please let me know if you need any clarifications. P.S. Going forward, please post sample input data in datatable format (like I did below), so that it would be easy to copy-paste and start working on the query, instead of spending time typing your data, thanks! :)Slavik N

2 Answers

0
votes

This should do the trick:

let NumItemsByDayAndHeadsection = 5;
datatable(Timestamp:datetime, Headsection:long, Name:string) [
    datetime(2021-01-01), 1, "A",
    datetime(2021-01-01), 2, "AA",
    datetime(2021-01-01), 3, "AAA",
    datetime(2021-01-01), 1, "B",
    datetime(2021-01-01), 2, "BB",
    datetime(2021-01-01), 3, "BBB",
    datetime(2021-01-01), 1, "C",
    datetime(2021-01-01), 2, "CC",
    datetime(2021-01-01), 3, "CCC",
    datetime(2021-01-01), 1, "D",
    datetime(2021-01-01), 2, "DD",
    datetime(2021-01-01), 3, "DDD",
    datetime(2021-01-01), 1, "E",
    datetime(2021-01-01), 2, "EE",
    datetime(2021-01-01), 3, "EEE",
    datetime(2021-01-01), 1, "A",
    datetime(2021-01-01), 2, "EE",
    datetime(2021-01-01), 3, "DDD"
]
| summarize NameCount = count() by Timestamp, Headsection, Name
| order by Headsection asc, NameCount desc
| summarize make_list(Timestamp, NumItemsByDayAndHeadsection), make_list(Name, NumItemsByDayAndHeadsection), make_list(NameCount, NumItemsByDayAndHeadsection) by Timestamp, Headsection
| mv-expand list_Timestamp, list_Name, list_NameCount
| project Timestamp, Headsection, Name = list_Name, NameCount = list_NameCount

Output:

Timestamp Headsection Name NameCount
2021-01-01 00:00:00.0000000 1 A 2
2021-01-01 00:00:00.0000000 1 B 1
2021-01-01 00:00:00.0000000 1 C 1
2021-01-01 00:00:00.0000000 1 D 1
2021-01-01 00:00:00.0000000 1 E 1
2021-01-01 00:00:00.0000000 2 EE 2
2021-01-01 00:00:00.0000000 2 AA 1
2021-01-01 00:00:00.0000000 2 BB 1
2021-01-01 00:00:00.0000000 2 CC 1
2021-01-01 00:00:00.0000000 2 DD 1
2021-01-01 00:00:00.0000000 3 DDD 2
2021-01-01 00:00:00.0000000 3 AAA 1
2021-01-01 00:00:00.0000000 3 BBB 1
2021-01-01 00:00:00.0000000 3 CCC 1
2021-01-01 00:00:00.0000000 3 EEE 1
0
votes

Another altenrative is using the top-nested operator was created specifically for this.

let NumItemsByDayAndHeadsection = 5;
datatable(Timestamp: datetime, Headsection: long, Name: string) [
    datetime(2021-01-01), 1, "A",
    datetime(2021-01-01), 2, "AA",
    datetime(2021-01-01), 3, "AAA",
    datetime(2021-01-01), 1, "B",
    datetime(2021-01-01), 2, "BB",
    datetime(2021-01-01), 3, "BBB",
    datetime(2021-01-01), 1, "C",
    datetime(2021-01-01), 2, "CC",
    datetime(2021-01-01), 3, "CCC",
    datetime(2021-01-01), 1, "D",
    datetime(2021-01-01), 2, "DD",
    datetime(2021-01-01), 3, "DDD",
    datetime(2021-01-01), 1, "E",
    datetime(2021-01-01), 2, "EE",
    datetime(2021-01-01), 3, "EEE",
    datetime(2021-01-01), 1, "A",
    datetime(2021-01-01), 2, "EE",
    datetime(2021-01-01), 3, "DDD"
]
| top-nested of Timestamp by Dummy=max(1), 
top-nested of Headsection by Dummy2=max(1), 
top-nested 5 of Name by Count=count()
| project Timestamp, Headsection, Name, Count