103
votes

In Docker, files created inside containers tend to have unpredictable ownership when inspected from the host. Files on a volume are owned by root (uid 0) by default, but as soon as non-root user accounts in the container write to the file system, the owners become more or less random from the host's perspective.

This is a problem when you need to access volume data from the host using the same user account that runs the docker commands.

Typical workarounds are

  • forcing user UIDs at creation time in Dockerfiles (not portable)
  • passing the UID of the host user to the docker run command as an environment variable, then running chown on the volumes in an entrypoint script.

Both these solutions can give some control over the actual permissions outside the container.
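
The second workaround can be sketched as a small entrypoint helper. This is only a minimal sketch under stated assumptions: the function name fix_volume_owner and the variable names HOST_UID and HOST_GID are hypothetical, chosen for illustration, and the container has to start as root for chown to succeed.

```shell
#!/bin/sh
# Hypothetical entrypoint helper (names are illustrative, not part of Docker).
# Usage: fix_volume_owner DIR UID GID
fix_volume_owner() {
    dir=$1
    uid=$2
    gid=$3
    # Accept only plain numeric IDs so a missing variable fails loudly.
    for n in "$uid" "$gid"; do
        case $n in
            ''|*[!0-9]*)
                echo "invalid id: '$n'" >&2
                return 1
                ;;
        esac
    done
    # Hand the whole volume tree to the requested owner.
    chown -R "$uid:$gid" "$dir"
}
```

An entrypoint script could call fix_volume_owner /data "$HOST_UID" "$HOST_GID" before exec "$@", with the container started using something like docker run -e HOST_UID="$(id -u)" -e HOST_GID="$(id -g)" -v /data debian:jessie.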

I expected user namespaces to be the final solution to this problem. I have run some tests with the recently released version 1.10 and --userns-remap set to my desktop account. However, I am not sure that it makes file ownership on mounted volumes any easier to deal with; I am afraid it could actually do the opposite.

Suppose I start this basic container

docker run -ti -v /data debian:jessie /bin/bash
echo 'hello' > /data/test.txt
exit

And then inspect the content from the host:

ls -lh /var/lib/docker/100000.100000/volumes/<some-id>/_data/

-rw-r--r-- 1 100000 100000 6 Feb  8 19:43 test.txt

This number '100000' is a sub-UID of my host user, but since it does not correspond to my user's UID, I still can't edit test.txt without privileges. This sub-user does not seem to have any affinity with my actual regular user outside of docker. It's not mapped back.

The workarounds mentioned earlier in this post which consisted of aligning UIDs between the host and the container do not work anymore due to the UID->sub-UID mapping that occurs in the namespace.
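
The arithmetic behind that mapping can be sketched as follows: a container UID c lands on the host as start + c, where start is the first subordinate ID of the remap user. This is a rough illustration, assuming a single name:start:count entry per user in /etc/subuid; the function name host_uid_for is hypothetical.

```shell
#!/bin/sh
# Print the host UID that a container UID maps to for a given remap user.
# Assumes one "name:start:count" entry per user (an assumption of this sketch).
# Usage: host_uid_for USER CONTAINER_UID [SUBUID_FILE]
host_uid_for() {
    awk -F: -v u="$1" -v c="$2" '
        $1 == u {
            # Only IDs inside the delegated range have a mapping.
            if (c + 0 >= 0 && c + 0 < $3 + 0) { print $2 + c; ok = 1 }
            exit
        }
        END { exit (ok ? 0 : 1) }
    ' "${3:-/etc/subuid}"
}
```

With an entry whose range starts at 100000, container UID 0 (root) comes out as 100000, which matches the owner shown in the listing above.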

Then, is there a way to run docker with user namespace enabled (for improved security), while still making it possible for the host user running docker to own the files generated on volumes?

I think that if you are going to be sharing volumes between the host and the container, user namespaces are not going to be part of the solution. Your second option ("passing the UID of the host user to the docker run command as an environment variable and then running some chown commands on the volumes in an entrypoint script") is probably the best solution. – larsks
Docker itself does not seem to encourage using host-mounted writable volumes. Since I am not running a cloud service and only using my own trusted images, I am now wondering if the security benefit of user NS is worth sacrificing so much convenience. – Stéphane C.
@StéphaneC. have you found a better approach perhaps? – EightyEight
Unfortunately no. Not using user namespaces and passing UIDs from the host is still my option of choice. I hope there will be a proper way to map users in the future. I doubt it, but I am keeping an eye open. – Stéphane C.

3 Answers

50
votes

If you can arrange users and groups in advance, it is possible to assign UIDs and GIDs so that host users correspond to the namespaced users inside containers.

Here's an example (Ubuntu 14.04, Docker 1.10):

  1. Create some users with fixed numeric IDs:

    useradd -u 5000 ns1
    
    groupadd -g 500000 ns1-root
    groupadd -g 501000 ns1-user1
    
    useradd -u 500000 -g ns1-root ns1-root
    useradd -u 501000 -g ns1-user1 ns1-user1 -m
    
  2. Manually edit auto-generated subordinate ID ranges in /etc/subuid and /etc/subgid files:

    ns1:500000:65536
    

    (note there are no records for ns1-root and ns1-user1 due to MAX_UID and MAX_GID limits in /etc/login.defs)

  3. Enable user namespaces in /etc/default/docker:

    DOCKER_OPTS="--userns-remap=ns1"
    

    Restart the daemon (service docker restart) and ensure that the /var/lib/docker/500000.500000 directory is created.

    Now you have root and user1 inside containers, and ns1-root and ns1-user1 on the host, with matching IDs.

    UPDATE: to guarantee that non-root users have fixed IDs in containers (e.g. user1 1000:1000), create them explicitly during image build.
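
One way to check the result is to read /proc/self/uid_map inside a container. Each line there holds the first ID inside the namespace, the first ID outside it, and the length of the range, so with the setup above it would be expected to read 0 500000 65536. The lookup below is a sketch only; the function name uid_map_lookup is hypothetical.

```shell
#!/bin/sh
# Translate an in-namespace UID to the outside UID using a uid_map file.
# Each line of the file is: inside-start outside-start length
# Usage: uid_map_lookup CONTAINER_UID [MAP_FILE]
uid_map_lookup() {
    awk -v c="$1" '
        # Pick the first range that contains the requested ID.
        c + 0 >= $1 + 0 && c + 0 < $1 + $3 {
            print $2 + (c - $1)
            ok = 1
            exit
        }
        END { exit (ok ? 0 : 1) }
    ' "${2:-/proc/self/uid_map}"
}
```

Run inside a container, uid_map_lookup 1000 should print the host UID behind user1, which can then be compared with id -u ns1-user1 on the host.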

Test-drive:

  1. Prepare a volume directory

    mkdir /vol1
    chown ns1-root:ns1-root /vol1
    
  2. Try it from a container

    docker run --rm -ti -v /vol1:/vol1 busybox sh
    echo "Hello from container" > /vol1/file
    exit
    
  3. Try from the host

    passwd ns1-root
    login ns1-root
    cat /vol1/file
    echo "can write" >> /vol1/file
    

Not portable and looks like a hack, but works.

4
votes

One workaround is to assign the user's UID dynamically at build time so that it matches the host.

Example Dockerfile:

FROM ubuntu
# Defines argument which can be passed during build time.
ARG UID=1000
# Create a user with given UID.
RUN useradd -d /home/ubuntu -ms /bin/bash -g root -G sudo -u $UID ubuntu
# Switch to ubuntu user by default.
USER ubuntu
# Check the current uid of the user.
RUN id
# ...

Then build as:

docker build --build-arg UID=$UID -t mycontainer .

and run as:

docker run mycontainer
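
Note that $UID is set by bash but is not guaranteed in a plain POSIX sh, where it may expand to nothing. A more portable variant can take the value from id -u, as in this small sketch. The helper name build_with_host_uid and the DOCKER override variable are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper: build an image with the caller's numeric UID baked in.
# Set DOCKER=echo to preview the command instead of running it.
# Usage: build_with_host_uid TAG [CONTEXT]
build_with_host_uid() {
    tag=$1
    context=${2:-.}
    # id -u works in any POSIX shell, unlike the bash-specific $UID.
    "${DOCKER:-docker}" build --build-arg "UID=$(id -u)" -t "$tag" "$context"
}
```

Calling build_with_host_uid mycontainer then behaves like the build command above, with the UID taken from id -u.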

If you already have an existing image, create a wrapper image with the following Dockerfile:

FROM someexistingcontainer
ARG UID=1000
USER root
# This assumes the user ubuntu already exists in the base image.
RUN usermod -u $UID ubuntu
USER ubuntu

This can be wrapped in docker-compose.yml like:

version: '3.4'
services:
  myservice:
    command: id
    image: myservice
    build:
      context: .
    volumes:
    - /data:/data:rw

Then build and run as:

docker-compose build --build-arg UID=$UID myservice; docker-compose run myservice
0
votes

You can avoid permission problems by using the docker cp command.

Ownership is set to the user and primary group at the destination. For example, files copied to a container are created with UID:GID of the root user. Files copied to the local machine are created with the UID:GID of the user which invoked the docker cp command.

Here is your example switched to use docker cp:

$ docker run -ti -v /data debian:jessie /bin/bash
root@e33bb735a70f:/# echo 'hello' > /data/test.txt
root@e33bb735a70f:/# exit
exit
$ docker volume ls
DRIVER              VOLUME NAME
local               f073d0e001fb8a95ad8d919a5680e72b21a457f62a40d671b63c62ae0827bf93
$ sudo ls -l /var/lib/docker/100000.100000/volumes/f073d0e001fb8a95ad8d919a5680e72b21a457f62a40d671b63c62ae0827bf93/_data
total 4
-rw-r--r-- 1 100000 100000 6 Oct  6 10:34 test.txt
$ docker ps -a
CONTAINER ID        IMAGE               COMMAND             CREATED              STATUS                          PORTS               NAMES
e33bb735a70f        debian:jessie       "/bin/bash"         About a minute ago   Exited (0) About a minute ago                       determined_hypatia
$ docker cp determined_hypatia:/data/test.txt .
$ ls -l test.txt 
-rw-r--r-- 1 don don 6 Oct  6 10:34 test.txt
$ cat test.txt
hello
$ 

However, if you just want to read files out of a container, you don't need a volume at all. This example uses a named container and no volume:

$ docker run -ti --name sandbox1 debian:jessie /bin/bash
root@93d098233cf3:/# echo 'howdy' > /tmp/test.txt
root@93d098233cf3:/# exit
exit
$ docker cp sandbox1:/tmp/test.txt .
$ ls -l test.txt
-rw-r--r-- 1 don don 6 Oct  6 10:52 test.txt
$ cat test.txt
howdy
$ 

I find named volumes useful when I want to copy files into a container, as described in this question.