We are trying to parse logs generated by some of our services running in AKS Clusters. We are using EFK stack with versions:

Elasticsearch: 7.4.2, FluentD: 1.7.1, Kibana: 7.4.2

We can see logs in the Kibana dashboard when we use the configuration below (JSON parser) in FluentD:

    <source>
      @type tail
      @id in_tail_container_logs
      path /var/log/containers/*namespace*.log,
      pos_file /var/log/fluentd-containers.log.pos
      enable_stat_watcher false
      tag kubernetes.*
      read_from_head true
      <parse>
        @type json
        time_format %Y-%m-%dT%H:%M:%S.%NZ
      </parse>
    </source>

We need to parse logs using regex (regexp) and have created one.

Sample Logs -

20-02-2020 08:31:42.931 [http-nio-8080-exec-1, abcd1234abcd, abcd1234abcd] INFO com.org.proj.az.controller.ModuleController.retrieveModulePortfolio - | PROJ-Module Microservice~retrieveModulePortfolio~GET~null~null~BusinessKeys[ProspectNumber:00000123456]~ Request Received for Portfolio with prospectNumber |

Regexp used -

^(?<date>[^ ][^~][^abc]*)\s\[(?<threadcollection>[a-z\0-9;:,.]*)\]\s(?<log_level>\w+)\s(?<class_name>[^|]+)\|+\s(?<app_name>[^~]*)\~(?<app_operation>\w+)\~(?<http_operation>\w+)\~(?<transaction_id>[0-9a-z]+)\~(?<message_id>[0-9a-zA-Z]+)\~(?<business_keys>[^~]*)\~(?<log_message>[a-zA-Z\s:;,."']+)\|+

Our configuration in FluentD for using regexp to parse logs -

    <source>
      @type tail
      @id in_tail_container_logs
      path /var/log/containers/*namespace.log,
      pos_file /var/log/fluentd-containers.log.pos
      enable_stat_watcher false
      tag kubernetes.*
      read_from_head true
      <parse>
        @type regexp
        expression /^(?<date>[^ ][^~][^abc]*)\s\[(?<threadcollection>[a-z\0-9;:,.]*)\]\s(?<log_level>\w+)\s(?<class_name>[^|]+)\|+\s(?<app_name>[^~]*)\~(?<app_operation>\w+)\~(?<http_operation>\w+)\~(?<transaction_id>[0-9a-z]+)\~(?<message_id>[0-9a-zA-Z]+)\~(?<business_keys>[^~]*)\~(?<log_message>[a-zA-Z\s:;,."']+)\|+/
        time_format %Y-%m-%dT%H:%M:%S.%NZ
        time_key date
        keep_time_key true
      </parse>
    </source>

With this regexp configuration, no logs are matched and the Kibana dashboard is empty.

Try your regex here: rubular.com. - Azeem
@Azeem thanks for your suggestion. I have been using regex101.com and regexr.com. - Aditya Jalkhare
Right. Please add the error logs as well, showing where it fails. - Azeem

1 Answer

According to the docs, you should only use the pattern itself, without the regex delimiters.

Also, if you plan to match digits, you should not escape the 0 in the character class (write 0-9, not \0-9). More likely, though, you meant to match anything between the square brackets there, so you need [^\]\[]*.

If you do not care what follows the last capturing group you need, use .* there; there is no need for \|+.
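
To illustrate the point about the trailing part, here is a minimal Ruby sketch (Fluentd's regexp parser runs on Ruby regular expressions). The two patterns are shortened, hypothetical versions that keep only the last capturing group, and the line without a closing pipe is an invented variant of the sample log:

```ruby
# Shortened, hypothetical patterns: only the tail of the full expression.
STRICT_TAIL = /~(?<log_message>[a-zA-Z\s:;,."']+)\|+/ # requires a closing pipe
LOOSE_TAIL  = /~(?<log_message>[a-zA-Z\s:;,."']+).*/  # accepts anything after the message

# Tail of the sample log line, and an invented variant without the final pipe.
with_pipe    = '~ Request Received for Portfolio with prospectNumber |'
without_pipe = '~ Request Received for Portfolio with prospectNumber'

[with_pipe, without_pipe].each do |tail|
  strict = STRICT_TAIL.match(tail)
  loose  = LOOSE_TAIL.match(tail)
  puts "#{tail.inspect}: strict=#{!strict.nil?} loose=#{!loose.nil?}"
end
```

The strict form rejects a line as soon as the closing pipe is missing, while the loose form still captures log_message.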

Use

@type regexp
expression ^(?<date>[^ ][^~][^abc]*)\s\[(?<threadcollection>[^\]\[]*)\]\s(?<log_level>\w+)\s(?<class_name>[^|]+)\|+\s(?<app_name>[^~]*)\~(?<app_operation>\w+)\~(?<http_operation>\w+)\~(?<transaction_id>[0-9a-z]+)\~(?<message_id>[0-9a-zA-Z]+)\~(?<business_keys>[^~]*)\~(?<log_message>[a-zA-Z\s:;,."']+).*
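
The expression above can be checked locally against the sample line from the question. This is a sketch in plain Ruby, not Fluentd itself; the names SAMPLE, LOG_PATTERN and parse_line are made up for illustration:

```ruby
# Sample line copied from the question.
SAMPLE = '20-02-2020 08:31:42.931 [http-nio-8080-exec-1, abcd1234abcd, abcd1234abcd] INFO com.org.proj.az.controller.ModuleController.retrieveModulePortfolio - | PROJ-Module Microservice~retrieveModulePortfolio~GET~null~null~BusinessKeys[ProspectNumber:00000123456]~ Request Received for Portfolio with prospectNumber |'

# The expression from the answer, written as a Ruby regex literal.
LOG_PATTERN = /^(?<date>[^ ][^~][^abc]*)\s\[(?<threadcollection>[^\]\[]*)\]\s(?<log_level>\w+)\s(?<class_name>[^|]+)\|+\s(?<app_name>[^~]*)\~(?<app_operation>\w+)\~(?<http_operation>\w+)\~(?<transaction_id>[0-9a-z]+)\~(?<message_id>[0-9a-zA-Z]+)\~(?<business_keys>[^~]*)\~(?<log_message>[a-zA-Z\s:;,."']+).*/

# Returns a hash of named captures, or nil when the line does not match.
def parse_line(line)
  m = LOG_PATTERN.match(line)
  m ? m.named_captures : nil
end

fields = parse_line(SAMPLE)
if fields
  fields.each { |name, value| puts "#{name}: #{value.inspect}" }
else
  puts 'no match'
end
```

If the script prints "no match" for a real log line, the line differs from the sample somewhere (for example, characters in the message that the log_message class does not allow).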

See the regex demo.
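
One more thing worth checking, although it is not part of the regex itself: the question's configuration uses time_key date with time_format %Y-%m-%dT%H:%M:%S.%NZ, while the sample timestamp looks like 20-02-2020 08:31:42.931. The sketch below uses Ruby's Time.strptime as an approximation of how such a format string is applied; Fluentd's own time parser may behave differently, and CANDIDATE_FORMAT is only a guess based on the single sample line (day-month-year, space separator, no zone):

```ruby
require 'time'

SAMPLE_DATE       = '20-02-2020 08:31:42.931' # value captured by the date group
CONFIGURED_FORMAT = '%Y-%m-%dT%H:%M:%S.%NZ'   # time_format from the question
CANDIDATE_FORMAT  = '%d-%m-%Y %H:%M:%S.%N'    # assumption, derived from the sample only

# True when Ruby can parse the value with the given strptime format.
def parses?(value, format)
  Time.strptime(value, format)
  true
rescue ArgumentError
  false
end

puts "configured format parses sample: #{parses?(SAMPLE_DATE, CONFIGURED_FORMAT)}"
puts "candidate format parses sample:  #{parses?(SAMPLE_DATE, CANDIDATE_FORMAT)}"
```

If the configured format cannot parse the captured value, adjusting time_format to match the application's timestamp layout is a reasonable next step to try.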