I tried to implement a monitoring solution for Openshift cluster based on Prometheus + node-exporter + grafana + cAdvisor.
I have a huge problem with cAdvisor component. I did a lot of configuration (The changes always do with volumes), but none of them work well, containter restarting every ~2min or do not collect all data metrics (processes)
example of configuration(with this config containter do not restart every 2min, but not collect processes) I know, i dont have /rootfs in volumes, but with this container work like 5s and goes down:
containers:
- image: >-
google/cadvisor@sha256:fce642268068eba88c27c666e92ed4144be6188447a23825015884741cf0e352
imagePullPolicy: IfNotPresent
name: cadvisor-new-version
ports:
- containerPort: 8080
protocol: TCP
resources: {}
securityContext:
privileged: true
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
volumeMounts:
- mountPath: '/sys/fs/cgroup/cpuacct,cpu'
name: sys
readOnly: true
- mountPath: /var/lib/docker
name: docker
readOnly: true
- mountPath: /var/run/containerd/containerd.sock
name: docker-socketd
readOnly: true
dnsPolicy: ClusterFirst
restartPolicy: Always
schedulerName: default-scheduler
securityContext: {}
serviceAccount: cadvisor-sa
serviceAccountName: cadvisor-sa
terminationGracePeriodSeconds: 300
volumes:
- hostPath:
path: '/sys/fs/cgroup/cpu,cpuacct'
name: sys
- hostPath:
path: /var/lib/docker
name: docker
- hostPath:
path: /var/run/containerd/containerd.sock
name: docker-socketd
i use a service account in my OS project with scc-privileged.
- Openshift version - 3.6
- Docker version - 1.12
- cAdvisor version - I tried every one from v0.26.3 to newest
I found a post that the problem can be the old version od docker, can anyone confirmed this?
Maybe someone do the right configuration and implement cAdvisor on Openshift?
example of logs:
I0409 08:41:46.661453 1 manager.go:231] Version:
{KernelVersion:3.10.0-693.17.1.el7.x86_64 ContainerOsVersion:Alpine Linux v3.4 DockerVersion:1.12.6 DockerAPIVersion:1.24 CadvisorVersion:v0.28.3 CadvisorRevision:1e567c2}
E0409 08:41:50.823560 1 factory.go:340] devicemapper filesystem stats will not be reported: usage of thin_ls is disabled to preserve iops
I0409 08:41:50.825280 1 factory.go:356] Registering Docker factory
I0409 08:41:50.826394 1 factory.go:54] Registering systemd factory
I0409 08:41:50.826949 1 factory.go:86] Registering Raw factory
I0409 08:41:50.827388 1 manager.go:1178] Started watching for new ooms in manager
I0409 08:41:50.838169 1 manager.go:329] Starting recovery of all containers
W0409 08:41:56.853821 1 container.go:393] Failed to create summary reader for "/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podc323db44_39a9_11e8_accd_005056800e7b.slice/docker-26db795af0fa28047f04194d8169cf0249edf2c918c583422a1404d35ed5b62c.scope": none of the resources are being tracked.
I0409 08:42:03.953261 1 manager.go:334] Recovery completed
I0409 08:42:37.874062 1 cadvisor.go:162] Starting cAdvisor version: v0.28.3-1e567c2 on port 8080
I0409 08:42:56.353574 1 fsHandler.go:135] du and find on following dirs took 1.20076874s: [ /rootfs/var/lib/docker/containers/2afa2c457a9c1769feb6ab542102521d8ad51bdeeb89581e4b7166c1c93e7522]; will not log again for this container unless duration exceeds 2s
I0409 08:42:56.453602 1 fsHandler.go:135] du and find on following dirs took 1.098795382s: [ /rootfs/var/lib/docker/containers/65e4ad3536788b289e2b9a29e8f19c66772b6f38ec10d34a2922e4ef4d67337f]; will not log again for this container unless duration exceeds 2s
I0409 08:42:56.753070 1 fsHandler.go:135] du and find on following dirs took 1.400184357s: [ /rootfs/var/lib/docker/containers/2b0aa12a43800974298a7d0353c6b142075d70776222196c92881cc7c7c1a804]; will not log again for this container unless duration exceeds 2s
I0409 08:43:00.352908 1 fsHandler.go:135] du and find on following dirs took 1.199079344s: [ /rootfs/var/lib/docker/containers/aa977c2cc6105e633369f48e2341a6363ce836cfbe8e7821af955cb0cf4d5f26]; will not log again for this container unless duration exceeds 2s