2
votes

I am using Fluentd to parse the logs and store the parsed log in MongoDB.

My application is generating the following logs:

[2018-01-25 17:50:22] 192.168.10.1 GET http://localhost.com/mypage html 0 Mozilla/5.0 200 132

Fluentd is parsing the logs correctly, except (I suspect) for the time, because MongoDB is not able to store the parsed records, and the time does not even appear in the parsed output. Below is the result of parsing:

2018-01-25 17:50:22.000000000 +0000 request.main: {"ip-address":"192.168.10.1","request-method":"GET","request-url":"http://localhost.com/mypage","format":"html","request-size":"0","user-agent":"Mozilla/5.0","response-code":"200","response-duration":"132"}

However, I don't see the time field in the parsed record, and I suspect this behavior is related to the following warning from fluent-plugin-mongo:

[warn]: #0 Since v0.8, invalid record detection will be removed because Mongo driver v2.x and the API spec don't provide it. You may lose invalid records, so you should not send such records to Mongo plugin

However, when using fluentular, it parses correctly. Here is my config for tail:

<source>
  @type tail
  path /home/app-logs/dev/my-app/%Y/%b/dev-main.log
  tag request.main
  time_format %Y-%m-%d %H:%M:%S
  format /^\[(?<time>[^\]]*)\] (?<ip-address>[0-9]*\.[0-9]*\.[0-9]*\.[0-9]*) (?<request-method>\w*) (?<request-url>[^ ]*) (?<format>[^ ]*) (?<request-size>\d*) (?<user-agent>[^ ]*) (?<response-code>\d*) (?<response-duration>\d*)$/
  pos_file /tmp/fluentd--1516882649.pos
</source>
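As a sanity check, the regex itself does capture the time field from the sample line. Here is a minimal Python sketch of the same pattern (group names use underscores instead of hyphens, since Python's `re` module does not allow hyphens in named groups):

```python
import re

# Same pattern as the Fluentd format regex, with underscored group names
pattern = re.compile(
    r'^\[(?P<time>[^\]]*)\] '
    r'(?P<ip_address>[0-9]*\.[0-9]*\.[0-9]*\.[0-9]*) '
    r'(?P<request_method>\w*) (?P<request_url>[^ ]*) (?P<format>[^ ]*) '
    r'(?P<request_size>\d*) (?P<user_agent>[^ ]*) '
    r'(?P<response_code>\d*) (?P<response_duration>\d*)$'
)

line = ('[2018-01-25 17:50:22] 192.168.10.1 GET '
        'http://localhost.com/mypage html 0 Mozilla/5.0 200 132')
m = pattern.match(line)
print(m.group('time'))           # 2018-01-25 17:50:22
print(m.group('response_code'))  # 200
```

So the pattern matches, including the time capture; the question is what happens to the time field after parsing.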

The mongo plugin configuration is below:

<match request.*>
  @type mongo
  host 127.0.0.1
  port 27017
  user foo
  password bar
  database my-app
  collection requests
  capped
  capped_size 100m
</match>

Any help is appreciated. Thank you!

1
The time is not sent because you need to set https://docs.fluentd.org/v0.12/articles/parser_nginx#keep_time_key to true, which is false by default. – Tarun Lalwani
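For reference, a sketch of the question's `<source>` block with that option enabled (assuming the v0.12-style inline parser settings used in the question; `keep_time_key true` keeps the parsed time as a regular field in the record):

```
<source>
  @type tail
  path /home/app-logs/dev/my-app/%Y/%b/dev-main.log
  tag request.main
  time_format %Y-%m-%d %H:%M:%S
  keep_time_key true
  format /^\[(?<time>[^\]]*)\] (?<ip-address>[0-9]*\.[0-9]*\.[0-9]*\.[0-9]*) (?<request-method>\w*) (?<request-url>[^ ]*) (?<format>[^ ]*) (?<request-size>\d*) (?<user-agent>[^ ]*) (?<response-code>\d*) (?<response-duration>\d*)$/
  pos_file /tmp/fluentd--1516882649.pos
</source>
```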

1 Answer

2
votes

I'm passing Nginx logs to MongoDB with Fluentd, but I created a custom log format in the Nginx configuration file: I have Nginx write its logs as JSON, which is easier to handle and, I think, a better approach when you use Fluentd. If you can change your log format to JSON, maybe you can try these settings:

<source>
  @type tail
  path /path/json/server_nginx.access.log.json #...or where you placed your Nginx access log
  pos_file /path2/server_nginx.access.log.json.pos # This is where you record file position
  tag nginx.access #fluentd tag!
  format json
</source>

<match **>
  @type mongo
  database logs #(required)
  collection foo #(optional; default="untagged")
  host ***.***.***.*** #(optional; default="localhost")
  port 27017 #(optional; default=27017)
  user notmyrealusername
  password notmyrealpassword
</match>

I'm not sure whether your app is Nginx-related, but these are my Nginx log format settings:

log_format logstash_json '{ "@timestamp": "$time_iso8601", '
                         '"@fields": { '
                         '"remote_addr": "$remote_addr", '
                         '"request_time": "$request_time", '
                         '"request": "$request", '
                         '"http_referrer": "$http_referer", '
                         '"http_host": "$host", '
                         '"http_user_agent": "$http_user_agent" } }';
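With that format, each access-log entry comes out as one JSON object per line, roughly like this (the field values here are illustrative, not from a real log), which the `format json` parser above ingests directly:

```json
{ "@timestamp": "2018-01-25T17:50:22+00:00", "@fields": { "remote_addr": "192.168.10.1", "request_time": "0.005", "request": "GET /mypage HTTP/1.1", "http_referrer": "-", "http_host": "localhost.com", "http_user_agent": "Mozilla/5.0" } }
```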