
I have a Docker -> Filebeat -> Elasticsearch pipeline for logs.

I am using Elasticsearch ingest pipelines to process my logs (after Filebeat sends them).

In my logs there are a message field and a payload field. Here is the fun part:

payload is sometimes an object and sometimes a string.

Elasticsearch will clearly not allow this type clash in the mapping, so whatever arrives first in my daily index sets the field type, and documents with the other type are rejected with a mapping error. I would like to handle this in the following manner:

  1. Check the type of payload field
  2. If it is string, nest it into payload.text field
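To illustrate what I am after, here is a sketch of the kind of pipeline I have in mind. This is only pseudo-config: the script processor and the instanceof type check are my guesses at how this could look, not something I know to work:

```json
{
  "description": "Nest string payloads under payload.text (sketch, untested)",
  "processors": [
    {
      "script": {
        "lang": "painless",
        "source": "if (ctx.payload instanceof String) { ctx.payload = ['text': ctx.payload]; }"
      }
    }
  ]
}
```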

I have no idea how to do the "nesting", and I have found no way to type-check the payload field.

Is this not supported at all? Do I have to add Logstash just for this? Or can it be solved on the Filebeat side?

EDIT: mentioned ingest pipelines


1 Answer


Inside your ingest pipeline you can define grok processors and, depending on which pattern matches, decide where to put the value.

Remember, if you want to force the field type you can use this structure:

%{FAVORITE_DOG:pet:int}

Without a type suffix the captured value stays a string, so %{FAVORITE_DOG_text:pet} keeps pet as text.

For example, suppose you want to parse a field and route the value into two different target fields according to its type. Here I define two different patterns:

P1 (for integers): "My value is=%{NUMBER:my_variable_int:int}"
P2 (for text): "My value is=%{WORD:my_variable_text}"

ps.: %{NUMBER} and %{WORD} are predefined grok patterns (named regular expressions)

ps2.: Order matters; put the integer pattern first, because the text pattern would also match numeric values.

Example:

log1: "My value is=10"
log2: "My value is=SOME_TEXT"


{
  "description" : "...",
  "processors": [
    {
      "grok": {
        "field": "message",
        "patterns": ["My value is=%{NUMBER:my_variable_int:int}","My value is=%{WORD:my_variable_text}"]
      }
    }
  ]
}
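Before wiring a pipeline like this into Filebeat, you can dry-run it against sample documents with the simulate API. The request below feeds the two example log lines through a grok pipeline of the kind shown above; the first document should come back with my_variable_int set and the second with my_variable_text:

```json
POST _ingest/pipeline/_simulate
{
  "pipeline": {
    "description": "route value by type",
    "processors": [
      {
        "grok": {
          "field": "message",
          "patterns": ["My value is=%{NUMBER:my_variable_int:int}", "My value is=%{WORD:my_variable_text}"]
        }
      }
    ]
  },
  "docs": [
    { "_source": { "message": "My value is=10" } },
    { "_source": { "message": "My value is=SOME_TEXT" } }
  ]
}
```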

source: https://www.elastic.co/guide/en/elasticsearch/reference/master/grok-processor.html