I'm playing with a DC/OS installation on localhost. Everything else works fine, but I can't run a Docker image stored in a private repo. I'm using Python to communicate with Chronos:

@celery.task(name='add-job', soft_time_limit=5)
def add_job(job_id):
    job_document = mongo.jobs.find_one({
        '_id': job_id
    })

    if job_document:
        worker_document = mongo.workers.find_one({
            '_id': job_document['workerId']
        })

        if worker_document:
            job = {
                'async': True,
                'name': job_document['_id'],
                'owner': '[email protected]',
                'command': "python /code/run.py",
                "disabled": False,
                "shell": True,
                "cpus": worker_document['cpus'],
                "disk": worker_document['disk'],
                "mem": worker_document['memory'],
                'schedule': 'R1//PT300S',  # empty start time: start now
                "epsilon": "PT60M",
                "container": {
                    "type": "DOCKER",
                    "forcePullImage": True,
                    "image": "quay.io/username/container",
                    "network": "HOST",
                    "volumes": [{
                        "containerPath": "/images/",
                        "hostPath": "/images/",
                        "mode": "RW"
                    }]
                },
                "uris": [
                    "file:///images/docker.tar.gz"
                ]
            }
            return chronos_client.add(job)
        else:
            return 'worker not found'
    else:
        return 'job not found'

The job runs fine with a public image (alpine:latest), but with the private image it fails without any error inside the DC/OS installation.

The job gets executed but it fails immediately. The error log of the job inside chronos looks like this:

I1212 12:39:11.141639 25058 fetcher.cpp:498] Fetcher Info: {"cache_directory":"\/tmp\/mesos\/fetch\/slaves\/61d6d037-c9f5-482b-a441-11d85554461b-S1\/root","items":[{"action":"BYPASS_CACHE","uri":{"cache":false,"executable":false,"extract":false,"value":"file:\/\/\/images\/docker.tar.gz"}}],"sandbox_directory":"\/var\/lib\/mesos\/slave\/slaves\/61d6d037-c9f5-482b-a441-11d85554461b-S1\/docker\/links\/7029bbea-4c3d-439a-8720-411f6fe40eb9","user":"root"}
I1212 12:39:11.143575 25058 fetcher.cpp:409] Fetching URI 'file:///images/docker.tar.gz'
I1212 12:39:11.143587 25058 fetcher.cpp:250] Fetching directly into the sandbox directory
I1212 12:39:11.143602 25058 fetcher.cpp:187] Fetching URI 'file:///images/docker.tar.gz'
I1212 12:39:11.143612 25058 fetcher.cpp:167] Copying resource with command:cp '/images/docker.tar.gz' '/var/lib/mesos/slave/slaves/61d6d037-c9f5-482b-a441-11d85554461b-S1/docker/links/7029bbea-4c3d-439a-8720-411f6fe40eb9/docker.tar.gz'
I1212 12:39:11.146726 25058 fetcher.cpp:547] Fetched 'file:///images/docker.tar.gz' to '/var/lib/mesos/slave/slaves/61d6d037-c9f5-482b-a441-11d85554461b-S1/docker/links/7029bbea-4c3d-439a-8720-411f6fe40eb9/docker.tar.gz'

Stdout is empty. When I run it directly in Marathon as an application with the same settings, the authentication works and my image is downloaded and executed. Is this something that Chronos does not support? It should; after all, it has settings for Docker.

Update: digging deeper into the agent logs I found this:

Failed to run 'docker -H unix:///var/run/docker.sock pull quay.io/username/container': exited with status 1; stderr='Error: Status 403 trying to pull repository username/container: "{\"error\": \"Permission Denied\"}"

I tried the archive with its config.json file on the agent itself, and the image downloads when the pull is triggered from the command line. I can't understand why Chronos is not using it properly, and I can't find any other reference on how to supply my credentials.
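
One quick sanity check is to confirm that the archive really contains the credentials file where Docker would look for it. The sketch below assumes the archive was built from a `.docker` directory, so that it holds `.docker/config.json` at the top level; that layout and the helper name `archive_has_docker_config` are assumptions for illustration, not something confirmed by the logs above.

```python
import tarfile


def archive_has_docker_config(path):
    """Return True if the tar archive at `path` contains .docker/config.json.

    Assumes the credentials archive was created from a `.docker` directory;
    adjust the expected member name if yours is laid out differently.
    """
    with tarfile.open(path, "r:*") as archive:
        # Some tar invocations prefix member names with "./"; strip it.
        names = [
            name[2:] if name.startswith("./") else name
            for name in archive.getnames()
        ]
    return ".docker/config.json" in names
```

On the agent this could be called as `archive_has_docker_config("/images/docker.tar.gz")`. It only checks the layout of the archive, not whether the credentials inside are valid.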

2 Answers

4
votes

As it turns out, the uris param is deprecated in favor of fetch. I started from scratch with a Marathon config applied to Chronos and watched the logs carefully, and then I saw this: {'message': 'Tried to add both uri (deprecated) and fetch parameters on aBPepwhG5z33e4teG', 'status': 'Bad Request'}. So I replaced my uris parameter with:

"fetch": [{
    "uri": "/images/docker.tar.gz",
    "extract": true,
    "executable": false,
    "cache": false
}]

...and it worked.
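
Applied to the Python job dict from the question, the change could look roughly like the sketch below. The helper name `uris_to_fetch` is hypothetical, and the field values simply mirror the fetch entry above. Note that the fetcher log in the question shows "extract":false for the archive, while the entry above sets extract to true, which may be what makes the credentials available to the Docker pull.

```python
def uris_to_fetch(job, extract=True):
    """Return a copy of a Chronos job dict with 'uris' replaced by 'fetch'.

    Hypothetical helper: each deprecated 'uris' entry becomes a fetch entry
    with the same fields as the working configuration in this answer.
    The URI strings themselves are passed through unchanged.
    """
    # Copy everything except the deprecated key; the input is not modified.
    converted = {key: value for key, value in job.items() if key != "uris"}
    # Keep any fetch entries that are already present.
    fetch = list(job.get("fetch", []))
    for uri in job.get("uris", []):
        fetch.append({
            "uri": uri,
            "extract": extract,
            "executable": False,
            "cache": False,
        })
    if fetch:
        converted["fetch"] = fetch
    return converted
```

With this in place, the question's task could end with `return chronos_client.add(uris_to_fetch(job))`, so the job is never submitted with both uris and fetch set.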

0
votes

Your post looks a little like this one, which turned out to be a problem with volumes.