I am building a proof of concept on the ELK stack and have run into an issue. I have looked through many topics on discuss.elastic.co and Stack Overflow, but none of them helped.

I am trying to configure multiline events in Filebeat (using the S3 input) and consume them in Logstash. Even with the multiline configuration set in Filebeat, the lines of a stack trace still arrive in Logstash as individual events.

Because Logstash receives each line of the stack trace as a separate event, the grok filter tags those lines with _grokparsefailure. That is understandable, since Filebeat should combine them into a single event before sending them to Logstash.

Other single-line events work as expected, and I am able to visualise them in Kibana.

filebeat.yml:

filebeat.inputs:

  - type: s3
    queue_url: https://sqs.aaaaa.amazonaws.com/xxxxxxxx/zzzzzz
    visibility_timeout: 300s
    multiline.pattern: '^[0-9]{4}-[0-9]{2}-[0-9]{2}'
    multiline.negate: true
    multiline.match: after
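
For clarity, this is the grouping I expect those three multiline settings to produce, written as a rough Python sketch. It only imitates the documented meaning of negate: true with match: after (lines that do not match the pattern are appended to the previous line that does). group_multiline is a name made up for illustration, and this is not how Filebeat implements it.

```python
import re

# Same expression as multiline.pattern in the config above.
PATTERN = re.compile(r"^[0-9]{4}-[0-9]{2}-[0-9]{2}")


def group_multiline(lines):
    """Imitate negate: true + match: after. Hypothetical helper, not Filebeat code."""
    events = []
    for line in lines:
        if PATTERN.search(line) or not events:
            events.append(line)        # a line starting with a date opens a new event
        else:
            events[-1] += "\n" + line  # any other line joins the previous event
    return events


# Shortened version of the sample log lines from this question.
sample = [
    "2020-08-18 00:30:52,481 detailed_logs ERROR    [abc] [xyz] [def] [127.0.0.1] Exception raised. Trace:",
    "2020-08-18 00:30:52,483 detailed_logs ERROR    Traceback (most recent call last):",
    '  File "/Users/vvv/Documents/ttt.py", line 93, in get',
    "    x = y.perform(abc)",
    "AttributeError: 'model' object has no attribute 'create1d_on'",
]
events = group_multiline(sample)
```

With grouping like this, the five input lines would reach Logstash as two events, the second one carrying the whole traceback.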

Logstash configuration:

input {
  beats {
    port => 5044
    host => "0.0.0.0"
  }
}

filter {
  grok {
    match => {"message" => "%{TIMESTAMP_ISO8601:timestamp} %{GREEDYDATA:logType} %{LOGLEVEL:logLevel}%{SPACE}\[%{GREEDYDATA:key1}\] \[%{GREEDYDATA:key2}\] \[%{GREEDYDATA:key3}\] \[%{GREEDYDATA:sourceIP}\] %{GREEDYDATA:message}"}
    overwrite => [ "message" ]
  }

  date {
    match => ["timestamp", "yyyy-MM-dd HH:mm:ss,SSS"]
  }
}
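
To show why an unmerged traceback line ends up with _grokparsefailure, here is a hand-written Python regex that roughly approximates the grok expression above. It is an approximation only: GREEDYDATA is replaced by ".*", LOGLEVEL by a short list of level names, and the timestamp by a fixed-width pattern, so it is not what Logstash actually compiles. The two test strings are taken from the sample log lines in this question.

```python
import re

# Rough stand-in for the grok expression; not generated by Logstash.
LINE_RE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"(?P<logType>.*) "
    r"(?P<logLevel>DEBUG|INFO|WARN|WARNING|ERROR|CRITICAL|FATAL)\s*"
    r"\[(?P<key1>.*)\] \[(?P<key2>.*)\] \[(?P<key3>.*)\] \[(?P<sourceIP>.*)\] "
    r"(?P<message>.*)$"
)

header = "2020-08-18 00:30:52,481 detailed_logs ERROR    [abc] [xyz] [def] [127.0.0.1] Exception raised. Trace:"
continuation = '  File "/Users/vvv/Documents/ttt.py", line 93, in get'

parsed = LINE_RE.match(header)        # a complete single-line event
orphan = LINE_RE.match(continuation)  # a traceback line that arrived on its own

print(parsed.groupdict() if parsed else "no match")
print(orphan.groupdict() if orphan else "no match")
```

The complete line yields all the named fields, while the lone continuation line has no timestamp or bracketed fields to match, which is consistent with the failure tag I see on those events.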

Here are two sample log statements. The second one, with its traceback, is what I am trying to combine into a single event:

2020-08-18 00:30:52,481 detailed_logs ERROR    [abc] [xyz] [def] [127.0.0.1] Exception raised. Trace:
2020-08-18 00:30:52,483 detailed_logs ERROR    Traceback (most recent call last):
  File "/Users/vvv/Documents/ttt.py", line 93, in get
    x = y.perform(abc)
  File "/Users/vvv/Documents/ttt.py", line 283, in operate
    raise exception
  File "/Users/vvv/Documents/ttt.py", line 169, in operate
    d["abb"] = n["xy"]
AttributeError: 'model' object has no attribute 'create1d_on'

I suspect that the Filebeat S3 input may not support multiline at all: I could not find any mention of it in the official documentation, whereas the documentation for the Log input covers it explicitly. But then again, I could be wrong.

Would appreciate any nudge in the right direction.

1 Answer

I am answering my own question, with the caveat that this is by no means a proper answer to the question I posed. It is a workaround that met my immediate requirement and is now deployed.

Since Filebeat multiline for the S3 input didn't work as expected, and Elastic does not recommend the Logstash multiline codec (see the paragraph marked IMPORTANT here), I ended up flattening the stack traces across the application with a utility of roughly the following structure:

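import json

# flatten_trace is an illustrative name; trace is the string from traceback.format_exc()
def flatten_trace(trace):
    dictionary = {}
    counter = 0

    # assumption: the lines come from splitting the trace; keep only non-blank ones
    for line in trace.splitlines():
        if line and line.strip():
            dictionary[counter] = line.strip()
            counter += 1

    # a single-line JSON string, so the whole trace is logged as one event
    return json.dumps(dictionary)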

Wherever an exception was being dumped into the logs using traceback.format_exc(), that trace was passed as an argument to the aforementioned utility and then logged as ERROR.
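
A call site then looks roughly like the sketch below. The name flatten_trace, the logger name and the failing statement are placeholders for illustration, and the dictionary comprehension is just a compact equivalent of the loop-based structure, not the exact code that is deployed.

```python
import json
import logging
import traceback

logger = logging.getLogger("detailed_logs")  # logger name is illustrative


def flatten_trace(trace):
    """Hypothetical name: key each non-blank, stripped line by its position."""
    non_blank = (line.strip() for line in trace.splitlines() if line.strip())
    return json.dumps({index: line for index, line in enumerate(non_blank)})


try:
    {}["xy"]  # stands in for whatever application call raises
except KeyError:
    flattened = flatten_trace(traceback.format_exc())
    logger.error(flattened)  # one physical log line, so one event downstream
```

Because the flattened string contains no newlines, neither Filebeat nor Logstash needs any multiline handling for it.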

Granted, making an application-wide change took some manual effort, but the requirement is now met: the following construct arrives as a single event when viewed in Kibana:

{"0": "Traceback (most recent call last):", "1": "File \"/Users/vvv/Documents/ttt.py\", line 93, in get", "2": "x = y.perform(abc)", "3": "File \"/Users/vvv/Documents/ttt.py\", line 283, in operate", "4": "raise exception", "5": "File \"/Users/vvv/Documents/ttt.py\", line 169, in operate", "6": "d[\"abb\"] = n[\"xy\"]", "7": "AttributeError: 'model' object has no attribute 'create1d_on'"}
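
If the trace ever needs to be read back as text, a sketch along these lines should do it. The event below is synthetic, built only to have more than ten lines. The one thing to watch is that JSON keys are strings, so they have to be sorted as numbers. The original indentation is not recoverable, since each line was stripped before being stored.

```python
import json

# Synthetic flattened event with keys "0" .. "11" (made up for illustration).
event = json.dumps({i: f"line {i}" for i in range(12)})

decoded = json.loads(event)

# Sort keys numerically: a plain string sort would put "10" and "11" before "2".
lines = [decoded[key] for key in sorted(decoded, key=int)]
trace = "\n".join(lines)
print(trace)
```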

Any feedback, suggestions and recommendations are most welcome.