Trying to capture the timestamp in this log event (for Splunk)
172.21.201.135 | http | o@1I0BTOx1063x3667295x0 | hkv | 2020-06-10 17:43:18,951 | "POST /rest/build-status/latest/commits/stats HTTP/1.1" | "http://bitbucket.my.com/projects/WF/repos/klp-libs/compare/commits" "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.97 Safari/537.36" | 200 | 345 | 431 | - | 5 | 3dk4qm |
With the TIME_PREFIX setting, Splunk software uses the specified regular expression to look for a match before attempting to extract a timestamp.
TIME_PREFIX = <regular expression>
By default, Splunk would try to read the timestamp from the start of the line, but this line starts with an IP address. Hence the need for a regex that matches everything up to and including the fourth pipe, which is what precedes the timestamp (the TIME_PREFIX).
By using the following regex
(?:[^\|]*(\|)){4}
I want the regex to match up to the fourth occurrence of the '|' and then stop, non-greedy I guess.
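A quick check outside Splunk, as a minimal Python sketch. The shortened sample line and the variable names are illustrative assumptions, and Python's re module is not the regex engine Splunk uses, although this particular pattern should behave the same way in both. Because [^|]* can never consume a pipe, each repetition ends at the next pipe, so the match stops right after the fourth one without any lazy quantifier:

```python
import re

# Shortened copy of the sample event (referer, user agent and trailing fields omitted).
line = (
    '172.21.201.135 | http | o@1I0BTOx1063x3667295x0 | hkv | '
    '2020-06-10 17:43:18,951 | "POST /rest/build-status/latest/commits/stats HTTP/1.1" | 200'
)

# The regex from the question, unchanged.
pattern = re.compile(r'(?:[^\|]*(\|)){4}')

m = pattern.search(line)
matched = m.group(0)          # everything up to and including the fourth pipe
remainder = line[m.end():]    # what a timestamp extractor would see next

print(repr(matched))
print(repr(remainder[:24]))
```

The remainder still begins with a space before the timestamp, which is the part the trailing \s* in a stricter pattern would absorb.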
^(?:[^|]*\|){4}(?<value>[^|]*)
, I believe. See regex demo. Or
^(?:[^|]*\|){4}\s*(?<value>[^|]*[^|\s])
– Wiktor Stribiżew
^(?:[^|]*\|){4}\s*
will do. – Wiktor Stribiżew
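To see what the anchored prefix leaves behind, here is a minimal Python sketch. The shortened sample line, the 23-character slice and the strptime format are assumptions made for illustration, and Python's re module stands in for Splunk's own regex handling. It strips the prefix and parses the text that follows as a timestamp:

```python
import re
from datetime import datetime

# Shortened copy of the sample event (referer, user agent and trailing fields omitted).
event = (
    '172.21.201.135 | http | o@1I0BTOx1063x3667295x0 | hkv | '
    '2020-06-10 17:43:18,951 | "POST /rest/build-status/latest/commits/stats HTTP/1.1" | 200'
)

# Anchored prefix from the comment: four pipe-terminated fields, then optional whitespace.
time_prefix = re.compile(r'^(?:[^|]*\|){4}\s*')

prefix_match = time_prefix.match(event)
after_prefix = event[prefix_match.end():]

# "2020-06-10 17:43:18,951" is 23 characters long; %f accepts the 3-digit milliseconds.
raw_timestamp = after_prefix[:23]
parsed = datetime.strptime(raw_timestamp, '%Y-%m-%d %H:%M:%S,%f')

print(raw_timestamp)
print(parsed.isoformat())
```

In props.conf the same pattern would go on the TIME_PREFIX line, so that timestamp extraction starts at the first character after the match.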