0
votes

I have below 2 tables, One with complete list of URLs and other table with regex representation of all URLs (nearly 100 values) with corresponding topic. I now want to create a third table which maps each url with the topic based on the regex pattern.

I figured that kusto offers 'matches regex' but it cannot be used at a row level. Ideally I want to create a function and pass URL which output the corresponding Topic

Table1:

| URL | 

Table2:

|URL Regex| Topic| 

Output:

|URL | Topic|

let me know if the below logic needs any tuning for it to work,

Query:

.create-or-alter function with findTopic(Path:string) {
toscalar(Table2
| extend TopicName=case (Path matches regex URLRegex, Topic,"Not Found")
| project Topic)
}

Table1
| extend Topic=findTopic(Path)

1

1 Answers

1
votes

Regular expressions can't be originated from a dynamic source, like another table. In Kusto, regular expressions must be string scalars.

In your case this isn't a problem, since there are about 100 different topics. You can maintain a stored function that does the URI categorization:

.create-or-alter function GetUrlTopic(Url:string)
{
    case(
        Url matches regex @"https://bing.com.*", "Search",
        Url matches regex @"https://stackoverflow.com.*", "Q&A",
        "N/A")
}

Example:

let Uris=datatable(Url:string)
[
 "https://bing.com/foo/bar",
 "https://bing.com/1/2",
 "https://microsoft.com",
 "https://stackoverflow.com/q/1",
 "https://stackoverflow.com/q/2"
];
Uris
| extend Topic=GetUrlTopic(Url)

Result:

Url Topic
https://bing.com/foo/bar Search
https://bing.com/1/2 Search
https://microsoft.com N/A
https://stackoverflow.com/q/1 Q&A
https://stackoverflow.com/q/2 Q&A