I am trying to set up the sebp/elk Docker container to run the ELK stack on my machine. The goal is to use ELK to parse and search log files such as Apache access and error logs, as well as PHP error logs written during PHP execution (these are multiline errors with stack traces).
An example of a PHP error log file I am trying to parse:
[03-Jun-2020 00:39:11 Europe/Berlin] PHP Stack trace:
[03-Jun-2020 00:39:11 Europe/Berlin] PHP 1. {main}() /var/www/myserver.domain/html/index.php:0
[03-Jun-2020 00:39:11 Europe/Berlin] PHP 2. require() /var/www/myserver.domain/html/index.php:17
[03-Jun-2020 00:39:11 Europe/Berlin] PHP 3. require_once() /var/www/myserver.domain/html/wp-blog-header.php:16
[03-Jun-2020 00:39:11 Europe/Berlin] PHP 4. include() /var/www/myserver.domain/html/wp-includes/template-loader.php:27
[03-Jun-2020 00:39:11 Europe/Berlin] PHP 5. the_content() /var/www/myserver.domain/html/wp-content/themes/summer_freedom/index.php:20
[03-Jun-2020 00:39:11 Europe/Berlin] PHP 6. apply_filters() /var/www/myserver.domain/html/wp-includes/post-template.php:79
[03-Jun-2020 00:39:11 Europe/Berlin] PHP 7. call_user_func_array:{/var/www/myserver.domain/html/wp-includes/plugin.php:163}() /var/www/myserver.domain/html/wp-includes/plugin.php:163
[03-Jun-2020 00:39:11 Europe/Berlin] PHP 8. searchnggallerytags() /var/www/myserver.domain/html/wp-includes/plugin.php:163
I use Filebeat to send the log from my local machine to my Logstash container, with the following filebeat.yml config:
logstash:
  enabled: true
  hosts:
    - localhost:5044
  ssl:
    certificate_authorities:
      - /etc/filebeat/logstash-beats.crt
  timeout: 15
filebeat:
  prospectors:
    -
      paths:
        - /var/log/php/php_errors.log
      document_type: php-errors
The Logstash configuration I have come up with so far, for use inside the ELK container, is the following:
input {
  stdin {
    codec => multiline {
      pattern => "^\[%{MONTHDAY}-%{MONTH}-%{YEAR} %{TIME} (?<tzname>[a-zA-Z]+/[a-zA-Z]+)\]"
      negate => true
      what => "previous"
      auto_flush_interval => 10
    }
    type => "php-errors"
  }
}
filter {
  if [type] == "php-errors" {
    grok {
      match => { "message" => "(?m)\[(?<logtime>%{MONTHDAY}-%{MONTH}-%{YEAR} %{TIME} (?<tzname>[a-zA-Z]+/[a-zA-Z]+))\] ?%{GREEDYDATA:message}" }
      overwrite => [ "message" ]
    }
    date {
      match => [ "logtime", "dd-MMM-yyyy HH:mm:ss" ]
      remove_field => [ "logtime" ]
    }
  }
}
output {
  stdout {
    codec => rubydebug
  }
}
At first I was not sure whether the pattern would really match, so I used the Grok Debugger in Kibana to confirm that it matches the input from the log file.
When I use this configuration for Logstash in the sebp/elk container, I can see entries in Kibana, so the transfer via Filebeat works in general and Logstash is able to match the data. Unfortunately, I get a separate message in Kibana for every line of the PHP error log file, whereas I would like all the lines that belong together to be concatenated and stored as one event in ELK.
As far as I understand the multiline pattern here, Logstash should recognize the timestamp at the start of each line and use multiline to write all the lines into one message instead of creating several events.
So the question is whether I am simply using the configuration incorrectly, or whether something is missing, so that I get only one event instead of several.
Update: as requested by @leandrojmp, I updated the Logstash configuration as suggested, but I still get the following output on stdout for every line of php_errors.log when running Logstash on the CLI:
{
    "host" => {
        "name" => "myserver.domain"
    },
    "@version" => "1",
    "@timestamp" => 2020-06-03T21:54:53.886Z,
    "message" => "[03-Jun-2020 23:54:49 Europe/Berlin] PHP 1. {main}() /var/www/myserver.domain/html/index.php:0",
    "beat" => {
        "version" => "6.4.3",
        "hostname" => "myserver.domain",
        "name" => "myserver.domain"
    },
    "tags" => [
        [0] "beats_input_codec_plain_applied"
    ],
    "offset" => 15896045,
    "source" => "/var/log/php/php_errors.log"
}
So it looks like the multiline matching is not working inside Logstash for me.
Update 2: after some more research I found out that matching multiline content inside Logstash is not recommended, since it can end up mixing different logs into one message if several machines send logs to the same Logstash instance. The suggested approach is to merge multiline messages in filebeat.yml before sending them to Logstash.
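A minimal sketch of what that could look like on the Filebeat side, not a verified solution. It assumes Filebeat 6.3 or later (where `filebeat.inputs` replaces `prospectors`) and that each error begins with a line such as `[timestamp] PHP Warning: ...`, followed by the `PHP Stack trace:` header and the numbered frames shown in the sample above. Because every line in this log starts with a timestamp, a pattern that only looks for the timestamp cannot tell where an event begins, so the pattern here keys on the stack-trace header and frame numbers instead:

```yaml
# Sketch only: the pattern is an assumption derived from the sample log above
# and may need adjusting for other PHP message formats.
filebeat.inputs:
  - type: log
    paths:
      - /var/log/php/php_errors.log
    # Matches continuation lines: the "PHP Stack trace:" header
    # and numbered frames such as "PHP 1. {main}() ...".
    multiline.pattern: '^\[[^\]]+\] PHP +(Stack trace:|[0-9]+\. )'
    # negate: false with match: after means that lines matching the pattern
    # are appended to the preceding line that does not match
    # (assumed here to be the actual error message line).
    multiline.negate: false
    multiline.match: after
```

With the merging done in Filebeat, the multiline codec would no longer be needed on the Logstash input, and the grok filter would receive the whole stack trace as a single `message`.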