I am forwarding my Docker logs to Logstash via the syslog log driver. This works great for single-line log messages, but I am having issues with multiline messages. The problem is that Docker's log forwarding prepends the syslog header to every log line. If I use the Logstash multiline filter (which Logstash does not recommend), I can successfully combine the lines and strip the syslog header from the continuation lines; however, that filter is not thread safe. I cannot get the same logic to work with an input codec, which is what Logstash recommends.
So for example:
Docker command:
docker run --rm -it \
    --log-driver syslog \
    --log-opt syslog-address=tcp://localhost:15008 \
    helloWorld:latest
Logs in docker container:
Log message A
<<ML>> Log message B
more B1
more B2
more B3
Log message C
Logs as received by Logstash:
<30>Jul 13 16:04:36 [1290]: Log message A
<30>Jul 13 16:04:37 [1290]: <<ML>> Log message B
<30>Jul 13 16:04:38 [1290]: more B1
<30>Jul 13 16:04:39 [1290]: more B2
<30>Jul 13 16:04:40 [1290]: more B3
<30>Jul 13 16:04:41 [1290]: Log message C
Now I can get everything to parse as I want using the following filter:
logstash filter multiline
input {
  tcp {
    port => 15008
    type => "multiline"
  }
}
filter {
  if ([type] == "multiline") {
    # Split off the syslog header; the payload ends up in "newMessage"
    grok {
      match => { "message" => [
        "^<(?<ignore>\d*)>(?<syslogDateTime>[\S]*)\s\[(?<pid>\d*)\]:.(?<newMessage>[\s\S]*)"
      ]}
    }
    # Join every line that lacks the <<ML>> marker onto the previous event
    multiline {
      pattern => "^[\s\S]*\<\<[M][L]\>\>"
      negate => true
      what => "previous"
      source => "newMessage"
      stream_identity => "%{host}.%{pid}"
    }
  }
}
This gives exactly what I want in my Logstash messages:
output
message: Log message A
message: <<ML>> Log message B more B1 more B2 more B3
message: Log message C
However, this runs for a few minutes, then hangs and stops processing.
So I am trying to get it to work with the multiline codec, which is what Logstash recommends:
logstash codec multiline
input {
  tcp {
    port => 15008
    type => "multiline"
    # Join every line that lacks the <<ML>> marker onto the previous event
    codec => multiline {
      pattern => "^[\s\S]*\<\<[M][L]\>\>"
      negate => true
      what => "previous"
    }
  }
}
filter {
  if ([type] == "multiline") {
    # Runs after the codec, so it only sees the already-joined message
    grok {
      match => { "message" => [
        "^<(?<ignore>\d*)>(?<syslogDateTime>[\S]*)\s\[(?<pid>\d*)\]:.(?<newMessage>[\s\S]*)"
      ]}
    }
  }
}
It combines the lines correctly, but the syslog headers of the continuation lines are now embedded in the combined message:
output
message: Log message A
message: <<ML>> Log message B <30>Jul 13 16:04:38 [1290]: more B1 <30>Jul 13 16:04:39 [1290]: more B2 <30>Jul 13 16:04:40 [1290]: more B3
message: Log message C
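The difference between the two outputs appears to come down to ordering: with the filter, the header is stripped from each line before the lines are joined, while with the codec the raw lines are joined first and the header is stripped only once, from the start of the combined event. Below is a minimal Python sketch of that assumption (not Logstash code), using only the Message B lines from above. `strip_header` and `join_multiline` are hypothetical helpers, the header regex is inferred from the sample lines, and lines are joined with a space only to mirror the output as shown here.

```python
import re

# Assumed header shape, inferred from the sample lines: "<PRI>Mon DD HH:MM:SS [PID]: "
SYSLOG_HEADER = re.compile(r"^<\d+>\w{3} +\d+ \d{2}:\d{2}:\d{2} \[\d+\]: ?")
ML_MARKER = "<<ML>>"


def strip_header(line):
    """Remove one leading syslog header from a line (hypothetical helper)."""
    return SYSLOG_HEADER.sub("", line, count=1)


def join_multiline(lines):
    """Append every line without the <<ML>> marker to the previous event."""
    events = []
    for line in lines:
        if events and ML_MARKER not in line:
            events[-1] += " " + line
        else:
            events.append(line)
    return events


received = [
    "<30>Jul 13 16:04:37 [1290]: <<ML>> Log message B",
    "<30>Jul 13 16:04:38 [1290]: more B1",
    "<30>Jul 13 16:04:39 [1290]: more B2",
    "<30>Jul 13 16:04:40 [1290]: more B3",
]

# Codec-style ordering: join the raw lines first, then strip the header once.
codec_order = [strip_header(event) for event in join_multiline(received)]

# Filter-style ordering: strip the header from every line, then join.
filter_order = join_multiline([strip_header(line) for line in received])

print(codec_order[0])
print(filter_order[0])
```

In this sketch only the filter-style ordering yields the clean message, which suggests that any codec-based setup would need an extra step to remove the headers left inside the joined event.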
How can I get the codec approach to produce the same output as the filter approach?
those syslog messages? Can you post an example where it is mixing up showing the mixed up output logs? - Mrunal Pagnis