I have an instance of Filebeat (version 7.5.0, running on a Windows server) monitoring a local folder for log files and sending the data on to Logstash (version 7.5.0, running in a Docker container). In Logstash I would like to extract the last folder name from the file path and add it as a field.
A concrete example: from two log entries, one from the file d:\\Logs\\Foo\\Bar\\lorem\\currentlog.txt and one from the file d:\\Logs\\Foo\\Bar\\ipsum\\currentlog.txt, I would like to extract the values lorem and ipsum respectively.
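To sanity-check the capture outside Logstash, the same regular expression can be exercised with Python's re module (a standalone sketch; this is the unescaped form of the pattern used in the pipeline config below, with Python's (?P<...>) named-group syntax in place of grok's (?<...>)):

```python
import re

# Unescaped form of the grok pattern: skip two backslash-separated segments,
# then capture the last folder name before the file name.
pattern = r".*\\.*\\(?P<product>.*)\\.*"

for path in (r"d:\Logs\Foo\Bar\lorem\currentlog.txt",
             r"d:\Logs\Foo\Bar\ipsum\currentlog.txt"):
    match = re.search(pattern, path)
    print(match.group("product"))  # lorem, then ipsum
```

This confirms the regex itself does what I want on both example paths.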
For this I have the following (simplified example) set up:
input {
  pipeline { address => "test" }
}

filter {
  grok {
    match => { "source" => ".*\\\\.*\\\\(?<product>.*)\\\\.*" }
  }
}

output {
  stdout { codec => rubydebug }
}
I have tested the regular expression used to find the match (named product) against the source field in several places (Grok Constructor, Grok Debugger and Rubular), and they all yield the desired result: a named match for product with the expected value of the last folder in the path.
However, when I run Logstash with the above pipeline configuration it does not extract the folder name into a product field. Instead, a _grokparsefailure tag is added to the Logstash output, indicating that something is wrong with my grok expression. Yet all my testing in the tools referenced above suggests there is nothing wrong with the expression itself...
The full logstash output looks like this:
{
    "@version" => "1",
    "tags" => [
        [0] "beats_input_codec_plain_applied",
        [1] "_grokparsefailure"
    ],
    "host" => {
        "name" => "test"
    },
    "message" => "Another line in the log",
    "agent" => {
        "id" => "e00d2f50-b10c-406a-a4fa-be381d15b869",
        "ephemeral_id" => "28dfe105-b936-40de-bc97-16c4a9196e30",
        "hostname" => "my-host",
        "name" => "test",
        "type" => "filebeat",
        "version" => "7.5.0"
    },
    "@timestamp" => 2019-12-16T14:04:09.064Z,
    "ecs" => {
        "version" => "1.1.0"
    },
    "log" => {
        "file" => {
            "path" => "d:\\Logs\\Foo\\Bar\\ipsum\\currentlog.txt"
        },
        "offset" => 21
    },
    "input" => {
        "type" => "log"
    }
}
I have tried changing the match to the log.file.path property instead, but that gives me the same _grokparsefailure tag.
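For reference, that attempt looked something like the following (using Logstash's [log][file][path] nested-field reference syntax; I am not certain this is the correct way to reference the field, which may itself be part of the problem):

```
filter {
  grok {
    match => { "[log][file][path]" => ".*\\\\.*\\\\(?<product>.*)\\\\.*" }
  }
}
```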
I am also fairly sure this worked on an earlier installation of Filebeat/Logstash (perhaps one or two major versions back), but I can't remember exactly which.
So the question is: why isn't Logstash able to extract the folder name from the path sent by Filebeat, and is there a way I can debug this grok problem further?