Logstash grok filter custom pattern is not working

Question

I have a log file (http://codepad.org/vAMFhhR2) and want to extract a specific number from it (line 18). I wrote a custom grok pattern and tested it on http://grokdebug.herokuapp.com/, where it works fine and extracts the desired value.

Here is what my logstash.conf looks like:

input {
    tcp {
        port => 5000
    }
}

filter {
    grok {
        match => [ "message", "(?<scraped>(?<='item_scraped_count': ).*(?=,))" ]
    }
}

output {
    elasticsearch {
        hosts => "elasticsearch:9200"
    }
}

However, it doesn't match any record from the same log in Kibana.

Thoughts?

What are you trying to achieve with this lookahead and lookbehind? Are you trying to discard the lines that don't match? — Antoine Cotten

Antoine Cotten · Accepted Answer · 2017-04-11T20:22:30

Your regexp may be valid, but the lookahead and lookbehind ("?=" and "?<=") are not a good choice in this context. Instead, you could use a much simpler filter:

match => [ "message", "'item_scraped_count': %{NUMBER:scraped}" ]

This will extract the number after 'item_scraped_count': as a field called scraped, using the 'NUMBER' Grok built-in pattern.

Result in Kibana:

{
  "_index": "logstash-2017.04.11",
  "_type": "logs",
  "_source": {
    "@timestamp": "2017-04-11T20:02:13.194Z",
    "scraped": "22",
    (...)
  }
}
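
Note that scraped comes back as the string "22" in the result above, because grok stores captures as strings by default. If a numeric field is preferable, grok's optional type suffix can do the conversion. A minimal sketch, assuming nothing else in the pipeline depends on the string form:

```
filter {
    grok {
        # The ":int" suffix asks grok to store the capture as an integer
        # instead of the default string.
        match => [ "message", "'item_scraped_count': %{NUMBER:scraped:int}" ]
    }
}
```

If the existing logstash-* index has already mapped scraped as a string, the new type will probably only show up in a newly created index.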

If I may suggest another improvement: since your message is spread across multiple lines, you could easily merge those lines into a single event using the multiline input codec:

input {
    tcp {
        port => 5000
        codec => multiline {
            pattern => "^(\s|{')"
            what => "previous"
        }
    }
}

This will merge every line that starts with either a whitespace character or {' into the previous line.
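
Putting the two suggestions together, the complete logstash.conf might look like the sketch below. It reuses the port, host, and field name from the question; it has not been run against the original log, so treat it as a starting point rather than a verified configuration.

```
input {
    tcp {
        port => 5000
        # Append lines that start with whitespace or {' to the previous event.
        codec => multiline {
            pattern => "^(\s|{')"
            what => "previous"
        }
    }
}

filter {
    grok {
        # Capture the number following 'item_scraped_count': into "scraped".
        match => [ "message", "'item_scraped_count': %{NUMBER:scraped}" ]
    }
}

output {
    elasticsearch {
        hosts => "elasticsearch:9200"
    }
}
```

Events that the pattern does not match are still indexed. By default grok tags them with _grokparsefailure, which can help when checking in Kibana whether the filter is firing at all.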
