
Scenario: I have multiple tasks running DL models on the same dataset. Downloading the same dataset in each task is wasteful, so I am looking for a way to persist the downloaded data across different task runs that require the same dataset.

I explored ResourceFiles and ApplicationPackages, but as far as I understand they do not suit my requirement, for the following reasons:

  1. ResourceFiles download the data for every task run; the data is not persisted.
  2. ApplicationPackages have a quota limit (20 by default), and they cannot be created from within the Docker container.

With Docker's volume capabilities, I could run my tasks with the same volume name and the downloaded data would persist on the VM. Since Azure Batch does not directly expose the "docker run" command for running the container, is there another way to specify volumes for Batch tasks using the Python SDK?

Can we use the "container_run_options" field of TaskContainerSettings to specify docker volumes?

Edit

I tried specifying a volume in TaskContainerSettings, but when trying to write to the mounted path I get a permission denied error:

    PermissionError: [Errno 13] Permission denied: '/opt/docker/Gy9EKVB728YcVZgn7e2AVuuQ/00000001.jpg'

1 Answer


Found a way to use docker volumes.

First: Use the "container_run_options" field of TaskContainerSettings to pass the docker volume mapping.

    # volume_name is the docker volume's name; container_path is the mount path inside the container
    task_container_settings = batch.models.TaskContainerSettings(
        image_name=image_name,
        container_run_options=f"-v {volume_name}:{container_path}"
    )

This will mount a volume with the given name under /mnt/docker/volumes on the node, and it will be accessible inside the container at the specified path.
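As an illustration, with hypothetical names (`dataset-cache` and `/data` are placeholders, not part of the Batch API), the run-options string expands like this:

```python
# Hypothetical docker volume name and in-container mount path
volume_name = "dataset-cache"
container_path = "/data"

# This is the string Azure Batch passes through to `docker run`
container_run_options = f"-v {volume_name}:{container_path}"
print(container_run_options)  # -v dataset-cache:/data
```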

Second: Run the task with pool scope and elevated (admin) privileges. Without this, you will get a permission error when writing to the mounted volume path inside the container.

    task = batch.models.TaskAddParameter(
        id=task_id,
        command_line=command,
        container_settings=task_container_settings,
        user_identity=batchmodels.UserIdentity(
            auto_user=batchmodels.AutoUserSpecification(
                scope=batchmodels.AutoUserScope.pool,
                elevation_level=batchmodels.ElevationLevel.admin)
        )
    )

This will run the task with root privileges so that the container spun up by the task can write to the mounted volume.
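Putting both pieces together, a full task submission might look like the sketch below. This assumes azure-batch is installed, `batch_client` is an already-authenticated `BatchServiceClient`, and `job_id` refers to an existing job; the image name, task id, command, and volume mapping are all placeholder values:

```python
import azure.batch.models as batchmodels

# Placeholder names; adjust to your setup
image_name = "myregistry.azurecr.io/dl-train:latest"
task_id = "train-task-1"
command = "python /app/train.py"

task_container_settings = batchmodels.TaskContainerSettings(
    image_name=image_name,
    # Named volume persists the dataset across task runs on the same node
    container_run_options="-v dataset-cache:/data",
)

task = batchmodels.TaskAddParameter(
    id=task_id,
    command_line=command,
    container_settings=task_container_settings,
    # Pool scope + admin elevation avoids the permission error on the mounted path
    user_identity=batchmodels.UserIdentity(
        auto_user=batchmodels.AutoUserSpecification(
            scope=batchmodels.AutoUserScope.pool,
            elevation_level=batchmodels.ElevationLevel.admin,
        )
    ),
)

# Submit to an existing job (requires an authenticated BatchServiceClient)
# batch_client.task.add(job_id=job_id, task=task)
```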