3
votes

I am setting up a pipeline to send Kubernetes pod logs to an Elasticsearch cluster. I have installed Filebeat as a DaemonSet (stream: stdout) in my cluster and pointed its output at Logstash. Filebeat connects to Logstash without any issue. Now I want logs only from the application namespaces, not from every namespace in the cluster. Can someone guide me on how to filter this in Filebeat, and also how to see the source message from the JSON in Elasticsearch?

This is my config:

data:
  kubernetes.yml: |-
    - type: docker
      containers:
        path: "/var/lib/docker/containers"
        stream: "stdout"
        ids: "*"
        multiline.pattern: '^\s'
        multiline.match: after
      fields:
         logtype: container
      multiline:
         pattern: '^[0-9]{4}-[0-9]{2}-[0-9]{2}'
         negate: true
         match: after
      ignore_older: 1h
      processors:
        - add_kubernetes_metadata:
            in_cluster: true
        - decode_json_fields:
            fields: ["log"]
            overwrite_keys: true
            target: ""

Output in Kibana:

{
  "_index": "filebeat-6.8.4-2020.03.06",
  "_type": "doc",
  "_id": "vHkzsHABJ57Tsdxxxxx",
  "_version": 1,
  "_score": null,
  "_source": {
    "log": {
      "file": {
        "path": "/var/lib/docker/containers/aa54562be9448183d69d8d2e1953e74560309176f044aed23484ac9e3260982c/sdnksdsdlsdnfsdlfslfnsdslfnsnlnflksdnflkdsfnsdflsdfndslffndslf-json.log"
      }
    },
    "tags": [
      "beats_input_codec_plain_applied",
      "_grokparsefailure"
    ],
    "input": {
      "type": "docker"
    },
    "@version": "1",
    "prospector": {
      "type": "docker"
    },
    "beat": {
      "version": "6.8.4",
      "name": "filebeat-vtp2f",
      "hostname": "filebeat-vtp2f"
    },
    "host": {
      "name": "filebeat-vtp2f"
    },
    "offset": 5798785,
    "stream": "stdout",
    "fields": {
      "logtype": "container"
    },
    "kubernetes": {
      "node": {
        "name": "k8-test-22313607-0"
      },
      "labels": {
        "version": "v1",
        "kubernetes": {
          "io/cluster-service": "true"
        },
        "controller-revision-hash": "6b56cfcb69",
        "pod-template-generation": "1",
        "k8s-app": "fluent"
      },
      "container": {
        "name": "fluentd"
      },
      "pod": {
        "uid": "72c50b54-5ef0-11ea-83e1-26018882335d",
        "name": "fluent-4lft2"
      },
      "namespace": "fluentd"
    },
    "source": "/var/lib/docker/containers/aa54562be9448183d69d8d2e1953e74560309176f044aed23484ac9e3260982c/aa54562be9448183d69d8d2e1953e74560309176f044aed23484ac9e3260982c-json.log",
    "@timestamp": "2020-03-06T14:15:18.561Z"
  },
  "fields": {
    "@timestamp": [
      "2020-03-06T14:15:18.561Z"
    ]
  },
  "highlight": {
    "prospector.type": [
      "@kibana-highlighted-field@docker@/kibana-highlighted-field@"
    ]
  },
  "sort": [
    1583504118561
  ]
}

3 Answers

4
votes

If you want Filebeat to collect logs only from certain namespaces, use a condition in an autodiscover template:

filebeat.yml:

    logging.level: error
    logging.json: true
    filebeat.config:
      inputs:
        # Mounted `filebeat-inputs` configmap:
        path: ${path.config}/inputs.d/*.yml
        # Reload inputs configs as they change:
        reload.enabled: false
      modules:
        path: ${path.config}/modules.d/*.yml
        # Reload module configs as they change:
        reload.enabled: false
    filebeat.autodiscover:
      providers:
        - type: kubernetes
          templates:
          - condition:
              equals:
                kubernetes.namespace: stage
            config:
              - type: container
                paths:
                 - /var/log/containers/*${data.kubernetes.container.id}.log
                multiline.pattern: '^[[:space:]]'
                multiline.negate: false
                multiline.match: after
                include_lines: ['^{']

Note this part:

          templates:
          - condition:
              equals:
                kubernetes.namespace: stage

I run a separate Filebeat DaemonSet in each namespace. It's a bit of extra overhead, but Filebeat can be finicky, so this lets us work out issues in other logical environments first.
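
One caveat for the asker's setup: the Kibana output in the question shows Filebeat 6.8.4, and as far as I know the `container` input type only arrived in 7.x (7.2, I believe). On 6.x the same idea can be sketched with the `docker` input instead. This is a minimal, untested sketch; `stage` and `prod` are placeholder namespace names, and the `or` condition is only there to show how to match more than one namespace:

```yaml
filebeat.autodiscover:
  providers:
    - type: kubernetes
      templates:
        # Start an input only for containers in the listed namespaces.
        - condition:
            or:
              - equals:
                  kubernetes.namespace: stage   # placeholder namespace
              - equals:
                  kubernetes.namespace: prod    # placeholder namespace
          config:
            # 6.x: use the docker input, keyed by the discovered container ID.
            - type: docker
              containers.ids:
                - "${data.kubernetes.container.id}"
              containers.stream: "stdout"
```

With autodiscover in place, the static `type: docker` input with `ids: "*"` from the question should be removed, otherwise every container is still read and events from matching containers may be shipped twice.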

4
votes

To drop events from specific namespaces, use the `drop_event` processor. I documented this here: https://ezyforanykey.blogspot.com/2020/11/filebeat-exclude-kubernetes-namespace.html

An example is below:

    - type: container
      paths:
        - /var/log/containers/*.log
      exclude_files:
        - /var/log/containers/java.*
      processors:
        - add_kubernetes_metadata:
            host: ${NODE_NAME}
            matchers:
            - logs_path:
                logs_path: "/var/log/containers/"
        - drop_event.when:
            or:
            - equals:
                kubernetes.namespace: "kube-system"
            - equals:
                kubernetes.namespace: "calico-system"
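
Since the question asks for the opposite (keep only the application namespaces), the condition can probably be inverted with `not`. A minimal sketch of just the processors section, assuming the asker's existing `add_kubernetes_metadata` setup; `app-a` and `app-b` are hypothetical namespace names:

```yaml
processors:
  - add_kubernetes_metadata:
      in_cluster: true
  # Drop every event whose namespace is NOT one of the listed ones.
  - drop_event:
      when:
        not:
          or:
            - equals:
                kubernetes.namespace: "app-a"   # hypothetical namespace
            - equals:
                kubernetes.namespace: "app-b"   # hypothetical namespace
```

Note that an event with no `kubernetes.namespace` field at all (for example, if metadata enrichment failed) does not match either `equals`, so it is dropped as well. The processor order matters too: `drop_event` has to come after `add_kubernetes_metadata`, because the field does not exist before that.
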
-1
votes

I don't know how to filter in Filebeat (or even whether it's possible), but you can filter on fields in the output section of your Logstash configuration, using conditionals:

output {
    if [kubernetes][namespace] == "fluentd" {
        ...
        Send to Elasticsearch
        ...
    } else {
        ...
    }
}

This way you can choose different actions to take on each message, depending on the value of the kubernetes.namespace field.