0
votes

Hoping someone can help.. I am new to Kusto and have to get an external table reading data from an Azure Blob storage account working, but the one table I have is unique in that the data for the timestamp column is split into 2 separate columns , i.e. LogDate and LogTime (see script below).

My data is stored in the following structure in the Azure Storage account container (container is named "employeedata", for example): {employeename}/{year}/{month}/{day}/{hour}/{minute}.csv, in a simple CSV format.

I know the CSV is good because if I import it into a normal Kusto table, it works perfectly.

My KQL script for the external table creation looks as follows:

.create-or-alter external table EmpLogs (Employee: string, LogDate: datetime, LogTime:timestamp) 
kind=blob 
partition by (EmployeeName:string = Employee, yyyy:datetime = startofday(LogDate), MM:datetime = startofday(LogDate), dd:datetime = startofday(LogDate), HH:datetime = todatetime(LogTime), mm:datetime = todatetime(LogTime))
pathformat = (EmployeeName "/" datetime_pattern("yyyy", yyyy) "/" datetime_pattern("MM", MM) "/" datetime_pattern("dd", dd) "/" substring(HH, 0, 2) "/" substring(mm, 3, 2) ".csv")
dataformat=csv 
( 
    h@'************************' 
) 
with (folder="EmployeeInfo", includeHeaders="All")

I am getting the error below constantly, which is not very helpful (redacted from full error, basically comes down to the fact there is a syntax error somewhere):

Syntax error: Query could not be parsed: { "error": { "code": "BadRequest_SyntaxError", "message": "Request is invalid and cannot be executed.", "@type": "Kusto.Data.Exceptions.SyntaxException", "@message": "Syntax error: Query could not be parsed: . Query: '.create-or-alter external table ........

I know the todatetime() function works on timespan's, I tested it with another table and it created a date similar to the following: 0001-01-01 20:18:00.0000000.

I have tried using the bin() function on the timestamp/LogTime columns, but the same error as above, and even tried importing the time value as a string and doing some string manipulation on it, no luck. Getting the same syntax error.

Any help/guidance would be greatly appreciated.

Thank you!!

1

1 Answers

0
votes

Currently, there's no way to define an external table partition based on more than one column. If your dataset timestamp is splitted between two columns: LogDate:datetime and LogTime:timestamp, then the best you can do is use virtual column for the partition by time:

.create-or-alter external table EmpLogs(Employee: string, LogDate:datetime, LogTime:timespan) 
kind=blob 
partition by (EmployeeName:string = Employee, PartitionDate:datetime)
pathformat = (EmployeeName "/" datetime_pattern("yyyy/MM/dd/HH/mm", PartitionDate))
dataformat=csv 
( 
    //h@'************************'
) 
with (folder="EmployeeInfo", includeHeaders="All")

Now, you can filter by the virtual column and fine tune using LogTime:

external_table("EmpLogs")
| where Employee in ("John Doe", ...)
| where PartitionDate between(datetime(2020-01-01 10:00:00) .. datetime(2020-01-01 11:00:00))
| where LogTime ...