1
votes

I am following the traefik 2.1.6 docs to enable Prometheus monitor in traefik 2.1.6 by adding:

args:
            - --configfile=/config/traefik.yaml
            - --web
            - --kubernetes
            - --logLevel=INFO
            - --metrics.prometheus=true
            - --entryPoints.metrics.address=:8080
            - --metrics.prometheus.entryPoint=metrics
            - --metrics.prometheus.addServicesLabels=true
            - --metrics.prometheus.addEntryPointsLabels=true
            - --metrics.prometheus.buckets=0.100000, 0.300000, 1.200000, 5.000000

modify my traefik deployment like this:

apiVersion: v1
kind: Service
metadata:
  name: traefik
  annotations:
    prometheus.io/scrape: 'true'
    prometheus.io/port: '8080'
spec:
  ports:
    - name: web
      port: 80
    - name: websecure
      port: 443
    - name: metrics
      port: 8080
  selector:
    app: traefik
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: traefik-ingress-controller
  labels:
    app: traefik
spec:
  selector:
    matchLabels:
      app: traefik
  template:
    metadata:
      name: traefik
      labels:
        app: traefik
    spec:
      serviceAccountName: traefik-ingress-controller
      terminationGracePeriodSeconds: 1
      containers:
        - image: traefik:latest
          name: traefik-ingress-lb
          ports:
            - name: web
              containerPort: 80
              hostPort: 80           #hostPort方式,将端口暴露到集群节点
            - name: websecure
              containerPort: 443
              hostPort: 443          #hostPort方式,将端口暴露到集群节点
            - name: metrics
              containerPort: 8080
          resources:
            limits:
              cpu: 2000m
              memory: 1024Mi
            requests:
              cpu: 1000m
              memory: 1024Mi
          securityContext:
            capabilities:
              drop:
                - ALL
              add:
                - NET_BIND_SERVICE
          args:
            - --configfile=/config/traefik.yaml
            - --web
            - --kubernetes
            - --logLevel=INFO
            - --metrics.prometheus=true
            - --entryPoints.metrics.address=:8080
            - --metrics.prometheus.entryPoint=metrics
            - --metrics.prometheus.addServicesLabels=true
            - --metrics.prometheus.addEntryPointsLabels=true
            - --metrics.prometheus.buckets=0.100000, 0.300000, 1.200000, 5.000000
          volumeMounts:
            - mountPath: "/config"
              name: "config"
      volumes:
        - name: config
          configMap:
            name: traefik-config 
      tolerations:              #设置容忍所有污点,防止节点被设置污点
        - operator: "Exists"
      nodeSelector:             #设置node筛选器,在特定label的节点上启动
        IngressProxy: "true"

when I go to the grafana dashboard to see the data,there is nothing output.This is my dashboard define:

{
  "__inputs": [
    {
      "name": "DS_K8S-PROMETHEUS",
      "label": "k8s-prometheus",
      "description": "",
      "type": "datasource",
      "pluginId": "prometheus",
      "pluginName": "Prometheus"
    }
  ],
  "__requires": [
    {
      "type": "grafana",
      "id": "grafana",
      "name": "Grafana",
      "version": "5.0.3"
    },
    {
      "type": "panel",
      "id": "grafana-piechart-panel",
      "name": "Pie Chart",
      "version": "1.3.3"
    },
    {
      "type": "panel",
      "id": "graph",
      "name": "Graph",
      "version": "5.0.0"
    },
    {
      "type": "datasource",
      "id": "prometheus",
      "name": "Prometheus",
      "version": "5.0.0"
    }
  ],
  "annotations": {
    "list": [
      {
        "builtIn": 1,
        "datasource": "-- Grafana --",
        "enable": true,
        "hide": true,
        "iconColor": "rgba(0, 211, 255, 1)",
        "name": "Annotations & Alerts",
        "type": "dashboard"
      }
    ]
  },
  "description": "Traefik dashboard prometheus\n\nPrometheus监控traefik总览面板",
  "editable": true,
  "gnetId": 9682,
  "graphTooltip": 0,
  "id": null,
  "iteration": 1547710539645,
  "links": [],
  "panels": [
    {
      "collapsed": false,
      "gridPos": {
        "h": 1,
        "w": 24,
        "x": 0,
        "y": 0
      },
      "id": 8,
      "panels": [],
      "title": "全局监控",
      "type": "row"
    },
    {
      "aliasColors": {},
      "breakPoint": "50%",
      "cacheTimeout": null,
      "combine": {
        "label": "Others",
        "threshold": "0"
      },
      "datasource": "${DS_K8S-PROMETHEUS}",
      "fontSize": "80%",
      "format": "locale",
      "gridPos": {
        "h": 9,
        "w": 12,
        "x": 0,
        "y": 1
      },
      "id": 2,
      "interval": null,
      "legend": {
        "percentage": true,
        "show": true,
        "sideWidth": null,
        "values": true
      },
      "legendType": "Right side",
      "links": [],
      "maxDataPoints": 3,
      "minSpan": 23,
      "nullPointMode": "connected",
      "pieType": "pie",
      "repeat": null,
      "repeatDirection": "h",
      "strokeWidth": 1,
      "targets": [
        {
          "expr": "sum(traefik_backend_requests_total{k8scluster =~ \"^$Cluster$\", backend=~\"^$backend$\"}) by (backend) ",
          "format": "time_series",
          "intervalFactor": 2,
          "legendFormat": "{{backend}}",
          "refId": "A"
        }
      ],
      "title": "访问量占比",
      "transparent": false,
      "type": "grafana-piechart-panel",
      "valueName": "total"
    },
    {
      "aliasColors": {},
      "breakPoint": "50%",
      "cacheTimeout": null,
      "combine": {
        "label": "Others",
        "threshold": 0
      },
      "datasource": "${DS_K8S-PROMETHEUS}",
      "fontSize": "80%",
      "format": "locale",
      "gridPos": {
        "h": 9,
        "w": 12,
        "x": 12,
        "y": 1
      },
      "id": 12,
      "interval": null,
      "legend": {
        "percentage": true,
        "percentageDecimals": null,
        "show": true,
        "values": true
      },
      "legendType": "Right side",
      "links": [],
      "maxDataPoints": 3,
      "nullPointMode": "connected",
      "pieType": "pie",
      "strokeWidth": 1,
      "targets": [
        {
          "expr": "sum(traefik_backend_requests_total{k8scluster =~ \"^$Cluster$\", backend=~\"^$backend$\", code != \"200\"}) by (code) ",
          "format": "time_series",
          "intervalFactor": 1,
          "legendFormat": "{{ code }}",
          "refId": "A"
        }
      ],
      "title": "非200状态码占比",
      "type": "grafana-piechart-panel",
      "valueName": "current"
    },
    {
      "aliasColors": {},
      "bars": false,
      "dashLength": 10,
      "dashes": false,
      "datasource": "${DS_K8S-PROMETHEUS}",
      "fill": 1,
      "gridPos": {
        "h": 8,
        "w": 24,
        "x": 0,
        "y": 10
      },
      "id": 10,
      "legend": {
        "alignAsTable": true,
        "avg": false,
        "current": false,
        "hideEmpty": false,
        "hideZero": false,
        "max": false,
        "min": false,
        "rightSide": true,
        "show": true,
        "total": false,
        "values": false
      },
      "lines": true,
      "linewidth": 1,
      "links": [],
      "nullPointMode": "null",
      "percentage": false,
      "pointradius": 5,
      "points": false,
      "renderer": "flot",
      "seriesOverrides": [],
      "spaceLength": 10,
      "stack": false,
      "steppedLine": false,
      "targets": [
        {
          "expr": "sum(traefik_backend_requests_total{k8scluster=~\"^$Cluster$\", backend=~\"^$backend$\"}) by (backend)",
          "format": "time_series",
          "intervalFactor": 2,
          "legendFormat": "{{ backend }}",
          "refId": "A"
        }
      ],
      "thresholds": [],
      "timeFrom": null,
      "timeShift": null,
      "title": "总访问量",
      "tooltip": {
        "shared": true,
        "sort": 0,
        "value_type": "individual"
      },
      "type": "graph",
      "xaxis": {
        "buckets": null,
        "mode": "time",
        "name": null,
        "show": true,
        "values": []
      },
      "yaxes": [
        {
          "decimals": 0,
          "format": "none",
          "label": "",
          "logBase": 1,
          "max": null,
          "min": "0",
          "show": true
        },
        {
          "decimals": 0,
          "format": "none",
          "label": "",
          "logBase": 1,
          "max": null,
          "min": "0",
          "show": true
        }
      ]
    }
  ],
  "refresh": false,
  "schemaVersion": 16,
  "style": "dark",
  "tags": [
    "traefik",
    "prometheus"
  ],
  "templating": {
    "list": [
      {
        "allValue": ".*",
        "current": {},
        "datasource": "${DS_K8S-PROMETHEUS}",
        "hide": 0,
        "includeAll": true,
        "label": null,
        "multi": false,
        "name": "Cluster",
        "options": [],
        "query": "label_values(k8scluster)",
        "refresh": 1,
        "regex": ".*",
        "sort": 0,
        "tagValuesQuery": "",
        "tags": [],
        "tagsQuery": "",
        "type": "query",
        "useTags": false
      },
      {
        "allValue": ".*",
        "current": {},
        "datasource": "${DS_K8S-PROMETHEUS}",
        "hide": 0,
        "includeAll": true,
        "label": null,
        "multi": false,
        "name": "backend",
        "options": [],
        "query": "label_values({k8scluster=\"$Cluster\"},backend)",
        "refresh": 1,
        "regex": "",
        "sort": 0,
        "tagValuesQuery": "",
        "tags": [],
        "tagsQuery": "",
        "type": "query",
        "useTags": false
      }
    ]
  },
  "time": {
    "from": "now-5m",
    "to": "now"
  },
  "timepicker": {
    "refresh_intervals": [
      "5s",
      "10s",
      "30s",
      "1m",
      "5m",
      "15m",
      "30m",
      "1h",
      "2h",
      "1d"
    ],
    "time_options": [
      "5m",
      "15m",
      "1h",
      "6h",
      "12h",
      "24h",
      "2d",
      "7d",
      "30d"
    ]
  },
  "timezone": "",
  "title": "Traefik-Monitor",
  "uid": "qPdAviJmz",
  "version": 22
}

I login to my Prometheus server and query traefik and find nothing:

metrics_entrypoint_requests_total{code="200"}

It looks like the Treafik not collect the metrics data.And I login into kubernetes cluster pod and get metrics of traefik,it return success.

[root@soa-room-service-8fd445cdb-42bvs /]# curl 172.30.184.11:8080/metrics|more
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0# HELP go_gc_duration_seconds A summary of the GC invocation durations.
# TYPE go_gc_duration_seconds summary
go_gc_duration_seconds{quantile="0"} 1.5522e-05
go_gc_duration_seconds{quantile="0.25"} 2.1096e-05
go_gc_duration_seconds{quantile="0.5"} 2.6706e-05
go_gc_duration_seconds{quantile="0.75"} 6.0421e-05
go_gc_duration_seconds{quantile="1"} 0.054839752
go_gc_duration_seconds_sum 0.066358769
go_gc_duration_seconds_count 46
# HELP go_goroutines Number of goroutines that currently exist.
# TYPE go_goroutines gauge
go_goroutines 174
# HELP go_info Information about the Go environment.
# TYPE go_info gauge
go_info{version="go1.13.8"} 1
# HELP go_memstats_alloc_bytes Number of bytes allocated and still in use.
# TYPE go_memstats_alloc_bytes gauge
go_memstats_alloc_bytes 1.4143352e+07
# HELP go_memstats_alloc_bytes_total Total number of bytes allocated, even if freed.
# TYPE go_memstats_alloc_bytes_total counter
go_memstats_alloc_bytes_total 2.52690104e+08
# HELP go_memstats_buck_hash_sys_bytes Number of bytes used by the profiling bucket hash table.
# TYPE go_memstats_buck_hash_sys_bytes gauge
go_memstats_buck_hash_sys_bytes 1.551946e+06
# HELP go_memstats_frees_total Total number of frees.
# TYPE go_memstats_frees_total counter

what should I do to make it work?

1

1 Answers

1
votes

The described problem is too wide. You should narrow it down to find which part is "broken"!

You can do so by following the next steps:

First go to your prometheus server and check if you can see the reported metrics there.

  • If yes - there's a problem with the way the connection to prometheus is defined in grafana

  • If no, it means that either your application doesn't report the metrics or the server doesn't collect them from the application. So first check if your application reports the metrics but trying to scrape the application-dns/metrics, the result should look something like the following

enter image description here

if you can see the metrics it means that prometheus-server doesn't scrape your application, so you should check the server's configuration. If you can't see the metrics it means that traefik doesn't collect them and you should re-check the configuration etc.