6
votes

I'm running a service on a Swarm cluster, thanks to docker stack deploy --with-registry-auth and this compose file:

version: "3.1"
services:
  builder-consumer:
    image: us.gcr.io/my-gcloud-project/my/image:123
    stop_grace_period: 30m
    volumes:
      - [...]
    environment:
      - [...]
    deploy:
      mode: global
      placement:
        constraints:
          - node.role == worker
    secrets:
      - [...]
secrets:
  [...]

This works fine when I deploy, but when I add a worker node to the swarm later on, the new worker can't pull the image required to run the task. The system logs report this:

level=error msg="Not continuing with pull after error: denied: Permission denied for \123\" from request \"/v2/my-gcloud-project/my/image/manifests/123\". "

level=info msg="Translating \"denied: Permission denied for \\"123\\" from request \\"/v2/my-gcloud-project/my/image/manifests/123\\". \" to \"repository us.gcr.io/my-gcloud-project/my/image not found: does not exist or no pull access\""

level=error msg="pulling image failed" error="repository us.gcr.io/my-gcloud-project/my/image not found: does not exist or no pull access" module="node/agent/taskmanager" node.id=... service.id=... task.id=...

level=error msg="fatal task error" error="No such image: us.gcr.io/my-gcloud-project/my/image:123@sha256:..." module="node/agent/taskmanager" node.id=... service.id=... task.id=...

However, when I manually run docker pull on that machine, it works fine, since every machine in the cluster is authenticated to my private Google Registry, thanks to docker login.

Thus my questions are:

  • Why can't the added worker pull from the private registry?
  • What does --with-registry-auth do exactly?

Thanks a lot

Note: the nodes are running Ubuntu 16.04.2 LTS and the Docker version is:

Server:
 Version:      17.04.0-ce
 API version:  1.28 (minimum version 1.12)
 Go version:   go1.7.5
 Git commit:   4845c56
 Built:        Mon Apr  3 18:07:42 2017
 OS/Arch:      linux/amd64
 Experimental: false
1
It looks like you've encountered a swarm mode issue. Have you checked their issues on GitHub and/or raised your own issue? - BMitch
@bouga - did you solve this somehow? I have the same problem and looking for some workaround. - Izydorr
@Izydorr Nope, we simply moved away from swarm and use Kubernetes instead - BOUGA

1 Answers

2
votes

In my case I was not running the stack with "--with-registry-auth", so I shuted down the instances, and I started again the manager with that option, and now it works