Description:

We have services running on Google Container Engine, built on the Go library go-micro. The services run fine, except that they restart at random times during the day.

Problem:

Pods restart quite often during the day. This affects both our own services and core services such as kube-dns and nginx-ingress. Judging by the logs, it looks like a networking problem occurs first; after that the Docker daemon and kubelet restart, which in turn restarts our services. It can happen anywhere from 2 to 10 times per day, so it is not constant.

Details:

Version:

kubectl version                                                                          
Client Version: version.Info{Major:"1", Minor:"5", GitVersion:"v1.5.2",    GitCommit:"08e099554f3c31f6e6f07b448ab3ed78d0520507", GitTreeState:"clean", BuildDate:"2017-01-12T04:57:25Z", GoVersion:"go1.7.4", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"5", GitVersion:"v1.5.3", GitCommit:"029c3a408176b55c30846f0faedf56aae5992e9b", GitTreeState:"clean", BuildDate:"2017-02-15T06:34:56Z", GoVersion:"go1.7.4", Compiler:"gc", Platform:"linux/amd64"}

OS:

uname -a
Linux microservices-g1-small-25eedb64-w265 4.4.21+ #1 SMP  Thu Nov 10 02:50:15 PST 2016 x86_64 Intel(R) Xeon(R) CPU @ 2.30GHz   GenuineIntel GNU/Linux

cat /etc/lsb-release
CHROMEOS_AUSERVER=https://tools.google.com/service/update2
CHROMEOS_RELEASE_BOARD=lakitu-signed-mpkeys
CHROMEOS_RELEASE_BRANCH_NUMBER=0
CHROMEOS_RELEASE_BUILDER_PATH=lakitu-release/R56-8977.0.0
CHROMEOS_RELEASE_BUILD_NUMBER=8977
CHROMEOS_RELEASE_BUILD_TYPE=Official Build
CHROMEOS_RELEASE_CHROME_MILESTONE=56
CHROMEOS_RELEASE_DESCRIPTION=8977.0.0 (Official Build) dev-channel lakitu 
CHROMEOS_RELEASE_NAME=Chrome OS
CHROMEOS_RELEASE_PATCH_NUMBER=0
CHROMEOS_RELEASE_TRACK=dev-channel
CHROMEOS_RELEASE_VERSION=8977.0.0
DEVICETYPE=OTHER
GOOGLE_RELEASE=8977.0.0
HWID_OVERRIDE=LAKITU DOGFOOD

Golang microservice framework go-micro

I checked the logs to figure out what is happening, and this is what I found:

rvices-g1-small-25eedb64-s0p6 update_engine[899]: [0310/064853:INFO:update_manager-inl.h(52)] ChromeOSPolicy::UpdateCheckAllowed: START
Mar 10 06:53:28 gke-microservices-g1-small-25eedb64-s0p6 update_engine[899]: [0310/064908:WARNING:evaluation_context-inl.h(43)] Error reading Variable update_disabled: "No value set for update_disabled"
Mar 10 06:53:28 gke-microservices-g1-small-25eedb64-s0p6 update_engine[899]: [0310/064932:WARNING:evaluation_context-inl.h(43)] Error reading Variable release_channel_delegated: "No value set for release_channel_delegated"
Mar 10 06:53:28 gke-microservices-g1-small-25eedb64-s0p6 update_engine[899]: [0310/065015:INFO:chromeos_policy.cc(314)] Periodic check interval not satisfied, blocking until 3/10/2017 6:58:27 GMT
Mar 10 06:53:28 gke-microservices-g1-small-25eedb64-s0p6 update_engine[899]: [0310/065025:INFO:update_manager-inl.h(74)] ChromeOSPolicy::UpdateCheckAllowed: END
Mar 10 06:53:28 gke-microservices-g1-small-25eedb64-s0p6 health-monitor.sh[1435]: Docker daemon failed!
Mar 10 06:53:28 gke-microservices-g1-small-25eedb64-s0p6 health-monitor.sh[1435]: Docker daemon failed!
Mar 10 06:53:28 gke-microservices-g1-small-25eedb64-s0p6 health-monitor.sh[1435]: Docker daemon failed!
Mar 10 06:53:28 gke-microservices-g1-small-25eedb64-s0p6 health-monitor.sh[1435]: Docker daemon failed!
Mar 10 06:53:28 gke-microservices-g1-small-25eedb64-s0p6 metrics_daemon[903]: [INFO:upload_service.cc(103)] Metrics disabled. Don't upload metrics samples.
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 health-monitor.sh[1432]: okKubelet is unhealthy!
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 docker[24076]: time="2017-03-10T06:53:05.302107123Z" level=error msg="Force shutdown daemon"
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 kubelet[30078]: I0310 06:53:17.997217   30078 helpers.go:101] Unable to get network stats from pid 27012: couldn't read network stats: failure opening /proc/27012/net/d
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 kubelet[30078]: I0310 06:53:18.134978   30078 helpers.go:101] Unable to get network stats from pid 26236: couldn't read network stats: failure opening /proc/26236/net/d
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 kubelet[30078]: I0310 06:53:18.135389   30078 helpers.go:101] Unable to get network stats from pid 27581: couldn't read network stats: failure opening /proc/27581/net/d
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 kubelet[30078]: I0310 06:53:18.135801   30078 helpers.go:101] Unable to get network stats from pid 27581: couldn't read network stats: failure opening /proc/27581/net/d
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 kubelet[30078]: W0310 06:53:18.430715   30078 prober.go:98] No ref for container "docker://4a90f704319f64738915bc353515403263a60ad04d5859174b50bb47c255db12" (social-syn
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 kubelet[30078]: I0310 06:53:18.430740   30078 prober.go:106] Liveness probe for "social-sync-deployment-2745944389-rftmf_on-deploy-dev(80a79ba8-04b6-11e7-be05-42010
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 kubelet[30078]: W0310 06:53:18.431064   30078 prober.go:98] No ref for container "docker://964f8ef2da5de63196f5ddfaec156f6b93fb05671be3dd7f2d90e4efb91cbd34" (heapster-v
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 kubelet[30078]: I0310 06:53:18.431076   30078 prober.go:106] Liveness probe for "heapster-v1.2.0.1-1382115970-l9h4q_kube-system(7f0f2677-04b6-11e7-be05-42010af00129):he
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 health-monitor.sh[1432]: % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 health-monitor.sh[1432]: Dload  Upload   Total   Spent    Left  Speed
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 docker[24076]: time="2017-03-10T06:53:12Z" level=info msg="stopping containerd after receiving terminated"
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 kubelet[30078]: W0310 06:53:18.525414   30078 prober.go:98] No ref for container "docker://6fa84a9c20b7c8600048a98d06974817e85652b3b66b8c64d6390735de5bbf19" (kube-dns-4
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 kubelet[30078]: I0310 06:53:18.525458   30078 prober.go:106] Readiness probe for "kube-dns-4101612645-bkt6z_kube-system(7f12f616-04b6-11e7-be05-42010af00129):kubedns" f
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 kubelet[30078]: E0310 06:53:18.631190   30078 generic.go:197] GenericPLEG: Unable to retrieve pods: operation timeout: context deadline exceeded
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 kubelet[30078]: E0310 06:53:18.646004   30078 container_manager_linux.go:625] error opening pid file /var/run/docker.pid: open /var/run/docker.pid: no such file or dire
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 kubelet[30078]: E0310 06:53:18.893042   30078 kubelet_pods.go:710] Error listing containers: dockertools.operationTimeout{err:context.deadlineExceededError{}}
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 kubelet[30078]: E0310 06:53:18.893091   30078 kubelet.go:1860] Failed cleaning pods: operation timeout: context deadline exceeded
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 kubelet[30078]: I0310 06:53:18.947556   30078 logs.go:41] http: TLS handshake error from 127.0.0.1:39224: EOF
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 kubelet[30078]: W0310 06:53:18.990182   30078 prober.go:98] No ref for container "docker://964f8ef2da5de63196f5ddfaec156f6b93fb05671be3dd7f2d90e4efb91cbd34" (heapster-v
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 kubelet[30078]: I0310 06:53:18.990207   30078 prober.go:106] Liveness probe for "heapster-v1.2.0.1-1382115970-l9h4q_kube-system(7f0f2677-04b6-11e7-be05-42010af00129):he
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 kubelet[30078]: W0310 06:53:18.990268   30078 prober.go:98] No ref for container "docker://4a90f704319f64738915bc353515403263a60ad04d5859174b50bb47c255db12" (social-syn
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 health-monitor.sh[1432]: [1.9K blob data]
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 docker[24076]: time="2017-03-10T06:53:13.043529322Z" level=error msg="Stop container error: Stop container d0c295d50409a171745524d6171a845fc3d29fd6db26da3fc883653fce1e4
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 docker[24076]: time="2017-03-10T06:53:13.077975854Z" level=error msg="Stop container error: Stop container 4712afe5f084cf3163bef94ac21e3d63a5179190e73a8a0fa906a59630b80
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 docker[24076]: time="2017-03-10T06:53:13.078034531Z" level=error msg="Stop container error: Stop container 1b18343beedfbe58403017fa532b85604c7ec2c96f15bd503747c19ac37f6
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 docker[24076]: time="2017-03-10T06:53:13.078074791Z" level=error msg="Stop container error: Stop container 1fb54295ff5ecc734bf12c576880131cb98011cb98e37b5fa982bdd257b69
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 docker[24076]: time="2017-03-10T06:53:13.078113450Z" level=error msg="Stop container error: Stop container b8e52eafa29a8b02263894b3d0d1371a92f1656fea981a6b9842c42b5d939
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 docker[24076]: time="2017-03-10T06:53:13.078150890Z" level=error msg="Stop container error: Stop container 9b9021078f15bc3ea03770c0c135e978326f8e279e60e9663885218070026
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 kubelet[30078]: I0310 06:53:18.990280   30078 prober.go:106] Liveness probe for "social-sync-deployment-2745944389-rftmf_on-deploy-dev(80a79ba8-04b6-11e7-be05-42010
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 kubelet[30078]: E0310 06:53:19.219709   30078 eviction_manager.go:204] eviction manager: unexpected err: failed ImageStats: failed to list docker images - operation tim
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 kubelet[30078]: I0310 06:53:19.285843   30078 logs.go:41] http: TLS handshake error from 127.0.0.1:39414: write tcp 127.0.0.1:10250->127.0.0.1:39414: write: broken pipe
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 kubelet[30078]: I0310 06:53:19.400005   30078 kubelet.go:1725] skipping pod synchronization - [container runtime is down]
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 kubelet[30078]: W0310 06:53:19.400065   30078 prober.go:98] No ref for container "docker://6d63f67520d9b76446a00e1f6d81422f12f2fa93a1a9f85a656c0b49e457ba0c" (social-acc
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 kubelet[30078]: I0310 06:53:19.400079   30078 prober.go:106] Liveness probe for "social-accounts-deployment-983093656-h9frj_on-deploy-dev(8071bfd6-04b6-11e7-be05-42
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 kubelet[30078]: W0310 06:53:19.400318   30078 prober.go:98] No ref for container "docker://963021c2befd5e53a61c16ba2f7c97446b4c045bbf92f723e3b899c4fb2cde21" (post-metri
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 kubelet[30078]: I0310 06:53:19.400333   30078 prober.go:106] Liveness probe for "post-metrics-deployment-556584274-z3p67_on-deploy-dev(7f9d4125-04b6-11e7-be05-42010
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 kubelet[30078]: W0310 06:53:19.400476   30078 prober.go:98] No ref for container "docker://dc65f853b22eb25bdfaf1ce5bf1d0d6f48e57379caffa526f80a71b086d5247f" (notificati
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 health-monitor.sh[1432]: [1.9K blob data]
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 docker[24076]: time="2017-03-10T06:53:13.078188154Z" level=error msg="Stop container error: Stop container 8ee3de7c4dd56136b8c8a444f9b58316d190d2dad496472e233f23bf27596
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 docker[24076]: time="2017-03-10T06:53:13.078226785Z" level=error msg="Stop container error: Stop container a9fefcd23efb7f6472b209d6e383b8050da054c3f4b1ad2c6bf531f3b1475
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 docker[24076]: time="2017-03-10T06:53:13.078276076Z" level=error msg="Stop container error: Stop container 874fdb93aafc0a13bcbeada66f8f031cd52c01f0cec59913a49bf93917ce5
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 docker[24076]: time="2017-03-10T06:53:13.565783448Z" level=error msg="Stop container error: Stop container 42b9b796470a3a0a345229227cb7fa223967c56ce3b8e2765c3d9a48e963c
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 docker[24076]: time="2017-03-10T06:53:13.565846865Z" level=error msg="Stop container error: Stop container add6806333a7185aa4944b9bde0c9b2be973a09e59d2b80c09e98e549b180
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 docker[24076]: time="2017-03-10T06:53:13.565886676Z" level=error msg="Stop container error: Stop container 5631ba532f8b2a4ac262b97fabd2df07a8fe6b0202879e1347a763a5a8921
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 kubelet[30078]: I0310 06:53:19.400485   30078 prober.go:106] Liveness probe for "notifications-deployment-3662335406-r668m_on-deploy-dev(880c38dc-0425-11e7-be05-420

Every time the node tries to update Chrome OS, Docker daemon failures, networking issues, and similar problems start to occur.
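One way to check that correlation is to count the health-monitor failure lines per minute and compare the busy minutes with the update_engine entries. The following is only a sketch: the function name `count_failures` is made up for illustration, and it assumes syslog-style lines like the ones pasted above (month, day, time, host, then the unit name).

```shell
#!/bin/sh
# Sketch: count health-monitor failures per minute from syslog-style lines on stdin.
# count_failures is a hypothetical helper name, not part of any GKE tooling.
count_failures() {
  # Keep only "Docker daemon failed!" / "Kubelet is unhealthy!" lines from health-monitor.sh,
  # cut the timestamp down to "Mon DD HH:MM", then count identical minutes.
  grep -E 'health-monitor\.sh\[[0-9]+\]: .*(Docker daemon failed!|Kubelet is unhealthy!)' \
    | awk '{ split($3, t, ":"); print $1, $2, t[1] ":" t[2] }' \
    | sort | uniq -c
}

# Possible usage on a node (assumes journalctl is available there):
#   journalctl --no-pager | count_failures
```

Minutes with a high count can then be lined up against the `update_engine` timestamps to see whether the failures really follow the update checks or also happen without them.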

kube-proxy.log

I0310 06:53:17.392671       5 proxier.go:750] Deleting connection tracking state for service IP 10.3.240.10, endpoint IP 10.0.5.223
Flag --resource-container has been deprecated, This feature will be removed in a later release.
I0310 06:54:12.615435       5 iptables.go:176] Could not connect to D-Bus system bus: dial unix /var/run/dbus/system_bus_socket: connect: no such file or directory
I0310 06:54:12.615488       5 server.go:168] setting OOM scores is unsupported in this build
I0310 06:54:12.687932       5 server.go:215] Using iptables Proxier.
I0310 06:54:12.690596       5 server.go:227] Tearing down userspace rules.
I0310 06:54:12.690844       5 healthcheck.go:119] Initializing kube-proxy health checker
I0310 06:54:12.702034       5 conntrack.go:81] Set sysctl 'net/netfilter/nf_conntrack_max' to 131072
I0310 06:54:12.702366       5 conntrack.go:66] Setting conntrack hashsize to 32768
I0310 06:54:12.702927       5 conntrack.go:81] Set sysctl 'net/netfilter/nf_conntrack_tcp_timeout_established' to 86400
I0310 06:54:12.702951       5 conntrack.go:81] Set sysctl 'net/netfilter/nf_conntrack_tcp_timeout_close_wait' to 3600
I0310 06:54:12.714134       5 proxier.go:802] Not syncing iptables until Services and Endpoints have been received from master

More logs:

g1-small-25eedb64-w265 kubelet[3344]: I0310 06:50:45.445978    3344 docker_manager.go:1975] Need to restart pod infra container for "roles-deployment-1745993421-qxf7z_on-a
Mar 10 06:50:45 gke-microservices-g1-small-25eedb64-w265 kubelet[3344]: I0310 06:50:45.574227    3344 operation_executor.go:917] MountVolume.SetUp succeeded for volume "kubernetes.io/secret/e257aff1-055d-1
Mar 10 06:50:45 gke-microservices-g1-small-25eedb64-w265 kubelet[3344]: I0310 06:50:45.575943    3344 docker_manager.go:1975] Need to restart pod infra container for "social-accounts-deployment-983093656-v
Mar 10 06:50:45 gke-microservices-g1-small-25eedb64-w265 kubelet[3344]: I0310 06:50:45.774316    3344 operation_executor.go:917] MountVolume.SetUp succeeded for volume "kubernetes.io/secret/e2762a4c-055d-1
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 kubelet[3344]: I0310 06:50:46.056277    3344 docker_manager.go:1975] Need to restart pod infra container for "tags-srv-deployment-626769860-js4h5_on
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 kernel: IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 systemd-udevd[6680]: Could not generate persistent MAC address for veth37abc82a: No such file or directory
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 systemd-timesyncd[570]: Network configuration changed, trying to establish connection.
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 kernel: IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 kernel: device veth37abc82a entered promiscuous mode
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 kernel: cbr0: port 3(veth37abc82a) entered forwarding state
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 kernel: cbr0: port 3(veth37abc82a) entered forwarding state
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 systemd-timesyncd[570]: Synchronized to time server 169.254.169.254:123 (169.254.169.254).
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 systemd-networkd[611]: veth37abc82a: Gained carrier
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 systemd-timesyncd[570]: Network configuration changed, trying to establish connection.
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 systemd-timesyncd[570]: Synchronized to time server 169.254.169.254:123 (169.254.169.254).
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 kubelet[3344]: I0310 06:50:46.626937    3344 conversion.go:134] failed to handle multiple devices for container. Skipping Filesystem stats
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 kubelet[3344]: I0310 06:50:46.627371    3344 conversion.go:134] failed to handle multiple devices for container. Skipping Filesystem stats
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 kernel: IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 systemd-timesyncd[570]: Network configuration changed, trying to establish connection.
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 systemd-udevd[6745]: Could not generate persistent MAC address for veth07d02159: No such file or directory
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 systemd-timesyncd[570]: Synchronized to time server 169.254.169.254:123 (169.254.169.254).
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 systemd-timesyncd[570]: Network configuration changed, trying to establish connection.
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 systemd-timesyncd[570]: Synchronized to time server 169.254.169.254:123 (169.254.169.254).
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 kernel: IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 systemd-networkd[611]: veth07d02159: Gained carrier
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 systemd-timesyncd[570]: Network configuration changed, trying to establish connection.
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 systemd-timesyncd[570]: Synchronized to time server 169.254.169.254:123 (169.254.169.254).
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 kernel: device veth07d02159 entered promiscuous mode
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 kernel: cbr0: port 12(veth07d02159) entered forwarding state
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 kernel: cbr0: port 12(veth07d02159) entered forwarding state
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 kernel: IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 systemd-udevd[6771]: Could not generate persistent MAC address for veth2b02253d: No such file or directory
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 systemd-timesyncd[570]: Network configuration changed, trying to establish connection.
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 systemd-timesyncd[570]: Synchronized to time server 169.254.169.254:123 (169.254.169.254).
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 systemd-timesyncd[570]: Network configuration changed, trying to establish connection.
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 systemd-timesyncd[570]: Synchronized to time server 169.254.169.254:123 (169.254.169.254).
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 kernel: IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 systemd-networkd[611]: veth2b02253d: Gained carrier
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 systemd-timesyncd[570]: Network configuration changed, trying to establish connection.
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 systemd-timesyncd[570]: Synchronized to time server 169.254.169.254:123 (169.254.169.254).
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 kernel: device veth2b02253d entered promiscuous mode
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 kernel: cbr0: port 23(veth2b02253d) entered forwarding state
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 kernel: cbr0: port 23(veth2b02253d) entered forwarding state
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 kernel: IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 systemd-timesyncd[570]: Network configuration changed, trying to establish connection.
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 systemd-udevd[6796]: Could not generate persistent MAC address for veth55143c6b: No such file or directory
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 systemd-timesyncd[570]: Synchronized to time server 169.254.169.254:123 (169.254.169.254).
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 systemd-timesyncd[570]: Network configuration changed, trying to establish connection.
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 systemd-timesyncd[570]: Synchronized to time server 169.254.169.254:123 (169.254.169.254).
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 kernel: IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 systemd-timesyncd[570]: Network configuration changed, trying to establish connection.
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 systemd-networkd[611]: veth55143c6b: Gained carrier
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 systemd-timesyncd[570]: Synchronized to time server 169.254.169.254:123 (169.254.169.254).
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 kernel: device veth55143c6b entered promiscuous mode
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 kernel: cbr0: port 30(veth55143c6b) entered forwarding state
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 kernel: cbr0: port 30(veth55143c6b) entered forwarding state
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 kernel: IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 systemd-timesyncd[570]: Network configuration changed, trying to establish connection.
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 systemd-udevd[6821]: Could not generate persistent MAC address for vethe38b8eee: No such file or directory
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 systemd-timesyncd[570]: Synchronized to time server 169.254.169.254:123 (169.254.169.254).
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 systemd-timesyncd[570]: Network configuration changed, trying to establish connection.
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 systemd-timesyncd[570]: Synchronized to time server 169.254.169.254:123 (169.254.169.254).
Mar 10 06:50:47 gke-microservices-g1-small-25eedb64-w265 kernel: IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
Mar 10 06:50:47 gke-microservices-g1-small-25eedb64-w265 systemd-networkd[611]: vethe38b8eee: Gained carrier
Mar 10 06:50:47 gke-microservices-g1-small-25eedb64-w265 systemd-timesyncd[570]: Network configuration changed, trying to establish connection.
Mar 10 06:50:47 gke-microservices-g1-small-25eedb64-w265 systemd-timesyncd[570]: Synchronized to time server 169.254.169.254:123 (169.254.169.254).
Mar 10 06:50:47 gke-microservices-g1-small-25eedb64-w265 kernel: device vethe38b8eee entered promiscuous mode
Mar 10 06:50:47 gke-microservices-g1-small-25eedb64-w265 kernel: cbr0: port 31(vethe38b8eee) entered forwarding state
Mar 10 06:50:47 gke-microservices-g1-small-25eedb64-w265 kernel: cbr0: port 31(vethe38b8eee) entered forwarding state
Mar 10 06:50:47 gke-microservices-g1-small-25eedb64-w265 kubelet[3344]: I0310 06:50:47.113442    3344 docker_manager.go:2236] Determined pod ip after infra change: "roles-deployment-1745993421-qxf7z
Mar 10 06:50:47 gke-microservices-g1-small-25eedb64-w265 kubelet[3344]: I0310 06:50:47.115417    3344 kubelet.go:1816] SyncLoop (PLEG): "social-accounts-deployment-983093656-vh2xt-deploy-dev(e257aff
Mar 10 06:50:47 gke-microservices-g1-small-25eedb64-w265 docker[3264]: time="2017-03-10T06:50:47.118506356Z" level=error msg="Handler for GET /v1.23/images/b.gcr.io-container-registry/microservice
Mar 10 06:50:47 gke-microservices-g1-small-25eedb64-w265 kubelet[3344]: I0310 06:50:47.194220    3344 provider.go:119] Refreshing cache for provider: *gcp_credentials.dockerConfigKeyProvider
Mar 10 06:50:47 gke-microservices-g1-small-25eedb64-w265 kernel: IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
Mar 10 06:50:47 gke-microservices-g1-small-25eedb64-w265 systemd-udevd[6847]: Could not generate persistent MAC address for veth2228e3ba: No such file or directory
Mar 10 06:50:47 gke-microservices-g1-small-25eedb64-w265 systemd-timesyncd[570]: Network configuration changed, trying to establish connection.
Mar 10 06:50:47 gke-microservices-g1-small-25eedb64-w265 systemd-timesyncd[570]: Synchronized to time server 169.254.169.254:123 (169.254.169.254).
Mar 10 06:50:47 gke-microservices-g1-small-25eedb64-w265 systemd-timesyncd[570]: Network configuration changed, trying to establish connection.
Mar 10 06:50:47 gke-microservices-g1-small-25eedb64-w265 systemd-timesyncd[570]: Synchronized to time server 169.254.169.254:123 (169.254.169.254).

Question: Is it possible to avoid or reduce the number of restarts and solve the networking issues, to make our system more stable?

1 Answer

This is pretty interesting. While not a solution, I would recommend:

Instances that have 0.5 or fewer cores, such as shared-core machine types, are treated as having 0.5 cores, and a network throughput cap of 1 Gbit/sec. Both persistent disk write I/O and network traffic count towards the instance's network cap. Depending on your needs, ensure your instance can support any desired persistent disk throughput for your applications. For more information, see the persistent disk specifications.

  • Start more kube-dns and nginx-ingress-controller replicas so you are less affected by single node failures
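
As a rough sketch of that last point, the replica counts could be raised with `kubectl scale`. The helper name `scale`, the replica counts, and the namespace and Deployment name used for the ingress controller are assumptions for illustration; kube-dns appears to run as a Deployment in kube-system, judging by the pod names in the logs. On a managed cluster an addon autoscaler may reset a manually scaled kube-dns, so treat this as a starting point rather than a fix.

```shell
#!/bin/sh
# Sketch only: build (and optionally run) a `kubectl scale` command.
# With DRY_RUN=1 (the default here) the command is printed, not executed.
scale() {
  ns="$1"; name="$2"; replicas="$3"
  set -- kubectl --namespace="$ns" scale deployment "$name" --replicas="$replicas"
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# kube-dns in kube-system; the replica count of 3 is an arbitrary example.
scale kube-system kube-dns 3
# Hypothetical namespace and Deployment name for the ingress controller; adjust to your cluster.
scale default nginx-ingress-controller 2
```

Run it with `DRY_RUN=0` to actually apply the change once the printed commands match your cluster.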