Race condition on xtables.lock with kube-proxy #3351
Description
What you expected to happen?
I expected weave to start.
What happened?
Weave started but exited immediately with a FATAL error because it could not connect to the Kubernetes API.
It turned out that kube-proxy had not started, and therefore had not created the iptables rules that NAT traffic to one of the k8s masters.
Further digging showed that kube-proxy and weave race for the /run/xtables.lock file.
If weave wins the race, Docker mounts the non-existent file as a directory, making it useless to kube-proxy when it later tries to mount it with the "FileOrCreate" type in its manifest.
In that failure mode, weave cannot start because kube-proxy is not running, and kube-proxy cannot run because the weave-net container created /run/xtables.lock as a directory before kube-proxy had a chance to lock the file.
The kubelet ought to touch /run/xtables.lock itself, but that appears to be unreliable.
How to reproduce it?
Since this is a race condition between kube-proxy and weave-net, reproducing it reliably is difficult.
It only shows up when the kubelet has not touched /run/xtables.lock and the weave-net container starts before the kube-proxy container.
We see it with some regularity because we run a node autoscaler on the cluster, and new nodes sometimes fail to come up properly when the above occurs.
Anything else we need to know?
We are running k8s 1.9.3 on AWS deployed with kops on CoreOS (1745.7.0 at the time of creating the issue).
I am working on a PR that applies the same volume type to xtables.lock as kube-proxy defines on k8s >= 1.8 (FileOrCreate), which should make this more reliable.
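For illustration, the intended change would look roughly like the following fragment of the weave-net DaemonSet volume spec (a sketch based on the standard Kubernetes hostPath API; field values other than the path and type are assumptions, and the actual PR may differ):

```yaml
# With type: FileOrCreate the kubelet creates an empty *file* at the host
# path if nothing exists there, instead of Docker creating a directory.
volumes:
  - name: xtables-lock
    hostPath:
      path: /run/xtables.lock
      type: FileOrCreate
```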
Versions:
$ weave version
weave 2.3.0
$ docker version
Client:
Version: 18.03.1-ce
API version: 1.37
Go version: go1.9.4
Git commit: 9ee9f40
Built: Thu Apr 26 04:27:49 2018
OS/Arch: linux/amd64
Experimental: false
Orchestrator: swarm
Server:
Engine:
Version: 18.03.1-ce
API version: 1.37 (minimum version 1.12)
Go version: go1.9.4
Git commit: 9ee9f40
Built: Thu Apr 26 04:27:49 2018
OS/Arch: linux/amd64
Experimental: false
$ uname -a
Linux ip-10-137-26-22.ec2.internal 4.14.48-coreos-r2 #1 SMP Thu Jun 14 08:23:03 UTC 2018 x86_64 Intel(R) Xeon(R) CPU E5-2686 v4 @ 2.30GHz GenuineIntel GNU/Linux
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.4", GitCommit:"5ca598b4ba5abb89bb773071ce452e33fb66339d", GitTreeState:"clean", BuildDate:"2018-06-18T14:14:00Z", GoVersion:"go1.9.7", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"9", GitVersion:"v1.9.3", GitCommit:"d2835416544f298c919e2ead3be3d0864b52323b", GitTreeState:"clean", BuildDate:"2018-02-07T11:55:20Z", GoVersion:"go1.9.2", Compiler:"gc", Platform:"linux/amd64"}
Logs:
$ docker logs weave
or, if using Kubernetes:
$ kubectl logs -n kube-system <weave-net-pod> weave
FATA: 2018/07/20 19:29:04.744235 [kube-peers] Could not get peers: Get https://100.64.0.1:443/api/v1/nodes: dial tcp 100.64.0.1:443: i/o timeout
Failed to get peers
Network:
$ ip route
$ ip -4 -o addr
$ sudo iptables-save