
I'm trying to move my Python log files into Elasticsearch using a Fluentd tail source:

<source>
  @type  forward
  @id    input1
  @label @mainstream
  port  24224
</source>

<filter **>
  @type stdout
</filter>

<source>
  @type tail
  path /fluentd/formshare/error_log
  pos_file /fluentd/error_log.pos
  tag formshare.error
  <parse>
    @type multiline
    format_firstline /\d{4}-\d{1,2}-\d{1,2}/
    format1 /(?<timestamp>[^ ]* [^ ]*) (?<level>[^\s]+:)(?<message>[\s\S]*)/
  </parse>
</source>

<label @mainstream>
  <match formshare.access.**>
    @type elasticsearch
    host 172.28.1.1
    port 9200
    logstash_format true
    logstash_prefix formshare_access
  </match>
  <match formshare.error.**>
    @type elasticsearch
    host 172.28.1.1
    port 9200
    logstash_format true
    logstash_prefix formshare_error
  </match>
  <match **>
    @type file
    @id   output1
    path         /fluentd/log/data.*.log
    symlink_path /fluentd/log/data.log
    append       true
    time_slice_format %Y%m%d
    time_slice_wait   10m
    time_format       %Y%m%dT%H%M%S%z
  </match>
</label>

I can see from the Fluentd service's startup output that it is tailing the file:

following tail of /fluentd/formshare/error_log

and the pos_file has data (the two hexadecimal fields are the byte offset read so far and the file's inode):

/fluentd/formshare/error_log    0000000000000604        000000000098252c

But I don't get the errors in Elasticsearch. It might be a parsing problem, but I am not good with regex (I got the pattern from https://www.datadoghq.com/blog/multiline-logging-guide/).
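For what it's worth, a pattern like that can be tried locally by converting its `(?<name>...)` groups to Python's `(?P<name>...)` syntax; it expects a plain "DATE LEVEL: message" layout (the sample line below is invented, so a real formshare error line may be shaped differently, which would explain a silent non-match):

```python
import re

# The question's format1, rewritten with Python named-group syntax
pattern = re.compile(r"(?P<timestamp>[^ ]* [^ ]*) (?P<level>[^\s]+:)(?P<message>[\s\S]*)")

# Invented sample in the "DATE LEVEL: message" shape the pattern expects
line = "2020-05-04 10:21:33,142 ERROR: something went wrong"

m = pattern.match(line)
print(m.groupdict() if m else "no match")
```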

The connection to Elasticsearch is working: the formshare.access.** match, which I use with "fluent-logger-python", works fine. It's just the tail source that does not seem to be working.

I am super new to Fluentd, so I don't know whether I am doing things the correct way or whether I need something else in the configuration file.

Any help is appreciated.


1 Answer


After some trial and error, I got it working with this conf file:

<source>
  @type  forward
  @id    input1
  @label @mainstream
  port  24224
</source>

<filter **>
  @type stdout
</filter>

<source>
  @type tail
  @label @mainstream
  @id    input2
  path /fluentd/formshare/error_log
  pos_file /fluentd/error_log.pos
  tag formshare.error
  <parse>
    @type multiline
    format_firstline /\d{4}-\d{1,2}-\d{1,2}/
    format1 /(?<time>\d{4}-\d{1,2}-\d{1,2} +\d{1,2}:\d{1,2}:\d{1,2},\d{3}) +(?<level>[A-Z]+)[ ]{1,2}\[(?<module>(.*?))\]\[(?<thread>(.*?))\] (?<messages>.*)/
    time_format %Y-%m-%d %H:%M:%S,%L
  </parse>
</source>

<label @mainstream>
  <match formshare.access.**>
    @type elasticsearch
    host 172.28.1.1
    port 9200
    logstash_format true
    logstash_prefix formshare_access
    time_key_format %Y.%m.%d
  </match>
  <match formshare.error.**>
    @type elasticsearch
    host 172.28.1.1
    port 9200
    logstash_format true
    logstash_prefix formshare_error
    time_key_format %Y.%m.%d
  </match>
  <match **>
    @type file
    @id   output1
    path         /fluentd/log/data.*.log
    symlink_path /fluentd/log/data.log
    append       true
    time_slice_format %Y%m%d
    time_slice_wait   10m
    time_format       %Y%m%dT%H%M%S%z
  </match>
</label>

Problems I had:

  • Both sources had to have the same @label and a different @id.
  • The regex was not correct for my Python log file.
  • On start, Fluentd seeks to the tail of the file but does not process the existing lines. If the file is empty and you overwrite it with one that has data, Fluentd processes the new lines.
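The working format1 can also be sanity-checked outside Fluentd by translating its `(?<name>...)` groups into Python's `(?P<name>...)` syntax and matching it against output from a logging formatter producing the same layout (the formatter string below is an assumption about how the formshare logger is configured):

```python
import io
import logging
import re

# Hypothetical logging setup emitting lines shaped like the ones format1 expects,
# e.g. "2020-05-04 10:21:33,142 ERROR [formshare.views][MainThread] database timeout"
buffer = io.StringIO()
handler = logging.StreamHandler(buffer)
handler.setFormatter(
    logging.Formatter("%(asctime)s %(levelname)s [%(name)s][%(threadName)s] %(message)s")
)
log = logging.getLogger("formshare.views")
log.addHandler(handler)
log.error("database timeout")

# The answer's format1, rewritten with Python named-group syntax
pattern = re.compile(
    r"(?P<time>\d{4}-\d{1,2}-\d{1,2} +\d{1,2}:\d{1,2}:\d{1,2},\d{3}) +"
    r"(?P<level>[A-Z]+)[ ]{1,2}"
    r"\[(?P<module>.*?)\]\[(?P<thread>.*?)\] "
    r"(?P<messages>.*)"
)

m = pattern.match(buffer.getvalue())
print(m.groupdict() if m else "no match")
```

Separately, if the tail source should also pick up lines that were already in the file when Fluentd started (the third point above), in_tail has a `read_from_head true` parameter for that.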