7
votes

I want to run a worker node on a server that is behind a NAT (i.e. it can't expose ports publicly). I thought this wouldn't be a problem, but it turns out to be one:

On this server behind the NAT I run:

docker swarm join --token SWMTKN-1... X.X.X.X:2377

This adds the server to the swarm. I am not sure where the "internal" IP address comes from, but traefik then lists a new server, http://10.0.1.126:8080 (10.0.1.126 is definitely not the public IP). If I exec into the traefik container:

docker exec -it 80f9cb33e24c sh

I can ping every server/node/worker in the list on traefik apart from the new one. Why?


When I join the swarm like this on the worker behind the VPN:

docker swarm join --advertise-addr=tun0 --token SWMTKN-1-... X.X.X.X:2377

I can see a new peer on my network from the manager:

$ docker network inspect traefik
...
        "Peers": [
            ...
            {
                "Name": "c2f01f1f1452",
                "IP": "12.0.0.2"
            }
        ]

where 12.0.0.2 is the address on tun0, the VPN interface between the manager and the server behind the NAT. Unfortunately, when I then run:

$ nmap -p 2377,2376,4789,7946 12.0.0.2
Starting Nmap 7.70 ( https://nmap.org ) at 2020-05-04 11:01 EDT
Nmap scan report for 12.0.0.2
Host is up (0.017s latency).

PORT     STATE  SERVICE
2376/tcp closed docker
2377/tcp closed swarm
4789/tcp closed vxlan
7946/tcp open   unknown

Most of the ports show as closed on the docker worker, which is weird.

Also, if I run nmap -p 8080 10.0.1.0/24 inside the traefik container on the manager, I get:

Nmap scan report for app.6ysph32io2l9q74g6g263wed3.mbnlnxusxv2wz0pa2njpqg2u1.traefik (10.0.1.62)
Host is up (0.00033s latency).

PORT     STATE SERVICE
8080/tcp open  http-proxy

on a successful swarm worker, which has the internal network IP 10.0.1.62,

but I get:

Nmap scan report for app.y7odtja923ix60fg7madydia3.jcfbe2ke7lzllbvb13dojmxzq.traefik (10.0.1.126)
Host is up (0.00065s latency).

PORT     STATE    SERVICE
8080/tcp filtered http-proxy

on the new swarm node. Why is it filtered? What am I doing wrong?

I'm guessing the swarm manager(s) would need a route (through the VPN tunnel) where it could reach the node. What does docker node ls say? Also note that UDP traffic (ports 4789 and 7946) needs to be allowed as well for swarm - Ionut Ticus
Also, if the remote host is connected to the swarm's network using VPN it should not matter if it's behind a NAT; as long as it can communicate with the swarm's nodes using both TCP and UDP through the VPN tunnel it should be OK. - Ionut Ticus
docker node ls shows that it is Ready Active 19.03.8 - maxisme
@IonutTicus on the node that I have added behind the vpn! - maxisme
Why are you using a public IP address (12.0.0.2) on your VPN interface? Are all docker nodes members in the VPN? - Ionut Ticus

1 Answer

0
votes

I'm adding this here as it's a bit longer.
I don't think it's enough for only the manager and the remote node to be able to communicate; nodes need to be able to communicate between themselves.

Try configuring the manager (which is connected to the VPN) to route packets to and from the remote worker through the VPN, and add the needed routes on all nodes (including the remote one).

Something like:

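# Manager: let the kernel forward packets between the local network and the VPN
sysctl -w net.ipv4.ip_forward=1  # if you use systemd you might need extra steps
# Remote node: reach the local nodes through the manager's VPN address
ip route add LOCAL_NODES_SUBNET via MANAGER_TUN_IP dev tun0
# Local nodes: reach the remote node through the manager
ip route add REMOTE_NODE_TUN_IP/32 via MANAGER_IP dev eth0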

If the above works, you need to make the routing changes permanent: neither sysctl -w nor ip route add survives a reboot.
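To judge whether the routing works, it may help to re-check the swarm ports from one of the local nodes. The sketch below assumes the port roles given in Docker's swarm documentation: 2377/tcp is served by managers only, 7946 is used over both TCP and UDP, and 4789 is UDP only. That would also explain why a TCP-only scan reports 2377/tcp and 4789/tcp as closed even on a healthy worker. The function names are hypothetical helpers and REMOTE_NODE_TUN_IP is the same placeholder as above.

```shell
#!/bin/sh
# Sketch only: swarm_ports and nmap_cmd are hypothetical helpers, not Docker tooling.

# Print the ports a swarm node is expected to listen on.
# $1 is "manager" or "worker"; 2377/tcp is only served by managers.
swarm_ports() {
    if [ "$1" = "manager" ]; then
        echo "2377/tcp"
    fi
    echo "7946/tcp"
    echo "7946/udp"
    echo "4789/udp"
}

# Build the nmap command for one protocol ("tcp" or "udp") of a node role.
# The command is printed rather than executed so it can be reviewed first.
nmap_cmd() {
    role=$1
    proto=$2
    host=$3
    # Keep only the ports of the requested protocol and join them with commas.
    ports=$(swarm_ports "$role" | grep "/$proto\$" | cut -d/ -f1 | paste -sd, -)
    case $proto in
        udp) echo "nmap -sU -p $ports $host" ;;  # UDP scan, needs root
        *)   echo "nmap -sT -p $ports $host" ;;  # TCP connect scan
    esac
}

# Show the two scans to run against the remote worker.
nmap_cmd worker tcp REMOTE_NODE_TUN_IP
nmap_cmd worker udp REMOTE_NODE_TUN_IP
```

Run the printed commands with the placeholder replaced by the worker's tunnel address. Keep in mind that nmap often reports UDP ports as open|filtered when it gets no reply, so a UDP result is a hint rather than proof.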

To find the IP addresses for all your nodes run this command on the manager:

for NODE in $(docker node ls --format '{{.Hostname}}'); do echo -e "${NODE} - $(docker node inspect --format '{{.Status.Addr}}' "${NODE}")"; done
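Since every node has to reach every other node, the same listing can feed a quick reachability check that you repeat on each host. This is a minimal sketch: list_node_addrs and check_nodes are hypothetical helper names, the PROBE override is an assumption added for illustration, and the default probe uses Linux iputils ping options.

```shell
#!/bin/sh
# Sketch only: list_node_addrs and check_nodes are hypothetical helper names.

# Print "hostname address" for every swarm node (only works on a manager).
list_node_addrs() {
    for node in $(docker node ls --format '{{.Hostname}}'); do
        echo "$node $(docker node inspect --format '{{.Status.Addr}}' "$node")"
    done
}

# Read "hostname address" lines on stdin and report whether each address
# answers a probe. PROBE defaults to one ping with a 2 second timeout and
# can be set to another command that takes the address as its last argument.
check_nodes() {
    while read -r node addr; do
        # </dev/null keeps the probe from consuming the remaining input lines.
        if ${PROBE:-ping -c 1 -W 2} "$addr" </dev/null >/dev/null 2>&1; then
            echo "$node $addr reachable"
        else
            echo "$node $addr UNREACHABLE"
        fi
    done
}

# Usage on a manager:
#   list_node_addrs | check_nodes
# On a worker, save the manager's list to a file and run:
#   check_nodes < nodes.txt
```

A successful ping only shows that the route exists; the swarm ports still have to be allowed through the tunnel over both TCP and UDP.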