Azure Log Analytics (aka OMS) uses the Kusto Query Language (KQL). We ship our IIS logs from Docker containers to Log Analytics, and I intend to parse the entries with the following query:
ContainerLog
| extend fields = split(LogEntry, ' ')
| extend appname = tostring(fields[16])
| extend path = tostring(fields[4])
Here is the field header followed by two different records, which I split into fields by hand:
date time s-ip cs-method cs-uri-stem cs-uri-query s-port cs-username c-ip cs(User-Agent) cs(Referer) sc-status sc-substatus sc-win32-status time-taken x-forwarded-for container-app
2019-11-29 17:37:49 ddd.dd.dd.ddd GET /ping.aspx - 80 - dd.dd.ddd.d Go-http-client/1.1 - 200 0 0 12 dd.dd.ddd.d OurCustomValue
2019-11-29 17:33:36 ddd.dd.dd.ddd GET /js/js_v4/jquery-functions.js v=26.35.0.0 80 7vgnwjAzOsKcUpseaPykcQ-- dd.dd.ddd.d Mozilla/5.0+(Windows+NT+10.0 +Win64 +x64 +rv:70.0)+Gecko/20100101+Firefox/70.0 https://site.domain.com/ 200 0 0 14 dd.ddd.dd.dd:55001,+dd.dd.ddd.d OurCustomValue
The problem is that the User-Agent field can contain spaces, which confuses the split and breaks that one value into several fields, so the indices of every field listed after User-Agent are off. For instance, the last field of the second record is not at index 16 (counting from 0) but at index 19.
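To make the shift concrete, here is a minimal sketch of one possible workaround: index the fields before User-Agent from the start of the array and the fields after it from the end. It assumes that User-Agent is the only field that can contain spaces (in particular, that Referer and x-forwarded-for never do), which I have not verified for all of our logs. The `datatable` is a hypothetical stand-in for `ContainerLog` holding the two sample records, and the column names `n`, `status`, and `useragent` are made up for illustration.

```kusto
// Hypothetical stand-in for ContainerLog, holding the two sample records above.
datatable(LogEntry: string) [
    "2019-11-29 17:37:49 ddd.dd.dd.ddd GET /ping.aspx - 80 - dd.dd.ddd.d Go-http-client/1.1 - 200 0 0 12 dd.dd.ddd.d OurCustomValue",
    "2019-11-29 17:33:36 ddd.dd.dd.ddd GET /js/js_v4/jquery-functions.js v=26.35.0.0 80 7vgnwjAzOsKcUpseaPykcQ-- dd.dd.ddd.d Mozilla/5.0+(Windows+NT+10.0 +Win64 +x64 +rv:70.0)+Gecko/20100101+Firefox/70.0 https://site.domain.com/ 200 0 0 14 dd.ddd.dd.dd:55001,+dd.dd.ddd.d OurCustomValue"
]
| extend fields = split(LogEntry, ' ')
| extend n = array_length(fields)                 // 17 for the first record, 20 for the second
// Fields before User-Agent keep their positions, so index them from the start.
| extend path = tostring(fields[4])
// Fields after User-Agent are indexed from the end (assumes they contain no spaces).
| extend appname = tostring(fields[n - 1])        // container-app is always the last token
| extend status = toint(tostring(fields[n - 6]))  // sc-status is six tokens from the end
// User-Agent is whatever lies between c-ip (index 8) and cs(Referer) (index n - 7).
| extend useragent = strcat_array(array_slice(fields, 9, n - 8), ' ')
| project path, status, useragent, appname
```

This only moves the problem if a second field can also contain spaces, which is why I am still looking for a way to describe the expected fields up front.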
Is there a better way to parse these logs, say, by defining the number or types of fields?