
I have a log file (http://codepad.org/vAMFhhR2) and want to extract a specific number from it (line 18). I wrote a grok filter with a custom pattern and tested it on http://grokdebug.herokuapp.com/, where it works and extracts the desired value.

Here's what my logstash.conf looks like:

input {
    tcp {
        port => 5000
    }
}

filter {
    grok {
        match => [ "message", "(?<scraped>(?<='item_scraped_count': ).*(?=,))" ]
    }
}

output {
    elasticsearch {
        hosts => "elasticsearch:9200"
    }
}
But it doesn't match any record from the same log in Kibana.


What are you trying to achieve with this lookahead and lookbehind? Are you trying to discard the lines that don't match? - Antoine Cotten

1 Answer


Your regexp may be valid, but the lookahead and lookbehind ("?=" and "?<=") are not a good choice in this context. Instead, you could use a much simpler filter:

match => [ "message", "'item_scraped_count': %{NUMBER:scraped}" ]

This will extract the number after 'item_scraped_count': as a field called scraped, using the 'NUMBER' Grok built-in pattern.
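For context, here is a minimal sketch of how that match line might sit inside a complete filter section. It assumes the event's "message" field contains text such as 'item_scraped_count': 22. The hash form of match and the ":int" suffix are optional additions, not part of the line above: without ":int", grok stores the captured value as a string.

```
filter {
    grok {
        # Assumption: "message" holds the raw log line, e.g. 'item_scraped_count': 22,
        # The optional :int suffix asks grok to store the capture as an integer
        # instead of the default string.
        match => { "message" => "'item_scraped_count': %{NUMBER:scraped:int}" }
    }
}
```

Events whose message does not match are not dropped. By default grok tags them with _grokparsefailure, so they still reach the output, just without a scraped field.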

Result in Kibana:

  "_index": "logstash-2017.04.11",
  "_type": "logs",
  "_source": {
    "@timestamp": "2017-04-11T20:02:13.194Z",
    "scraped": "22",

If I may suggest another improvement: since your message is spread across multiple lines, you could easily merge it using the multiline codec on the input:

input {
    tcp {
        port => 5000
        codec => multiline {
            pattern => "^(\s|{')"
            what => "previous"
        }
    }
}

This will merge every line that starts with either whitespace or {' into the previous line.