I'm trying to get docker-compose deployment to AWS Elastic Beanstalk working, in which the docker images are pulled from a private registry hosted by GitLab.
The strange thing is that initial deployment works perfectly; It pulls the image from the private registry and starts the containers using docker-compose, and the webpage (served by Django) is accessible through the host.
Deploying a new version using the same docker-compose and the same docker image will result in an error while pulling the docker image:
2021/03/16 09:28:34.957094 [ERROR] An error occurred during execution of command [app-deploy] - [Run Docker Container]. Stop running the command. Error: failed to run docker containers: Command /bin/sh -c docker-compose up -d failed with error exit status 1. Stderr:Building with native build. Learn about native build in Compose here: https://docs.docker.com/go/compose-native-build/
Creating network "current_default" with the default driver
Pulling redis (redis:alpine)...
Pulling mysql (mysql:5.7)...
Pulling project.dockertest(registry.gitlab.com/company/spikes/dockertest:latest)...
Get https://registry.gitlab.com/v2/company/spikes/dockertest/manifests/latest: denied: access forbidden
2021/03/16 09:28:34.957104 [INFO] Executing cleanup logic
Setup
AWS Elastic Beanstalk 64bit Amazon Linux 2/3.2
Gitlab registry credentials are stored within a S3 bucket, with the filename .dockercfg
and has the following content:
{
"auths": {
"registry.gitlab.com": {
"auth": "base64 encoded username:personal_access_token"
}
},
"HttpHeaders": {
"User-Agent": "Docker-Client/18.03.1-ce (linux)"
}
}
The repository contains a v3 Dockerrun.aws.json
file to refer to the credential file in S3:
{
"AWSEBDockerrunVersion": "3",
"Authentication": {
"bucket": "gitlab-dockercfg",
"key": ".dockercfg"
}
}
Reproduce
Setup docker-compose.yml that uses a service with a private docker image (and can be pulled with the credentials setup in the dockercfg within S3)
Create a new applicatoin that uses the docker-platform.
eb init testapplication --platform=docker --region=eu-west-1
Note: region must be the same as the S3 bucket containing the dockercfg.
Initial deployment (this will succeed)
eb create testapplication-test --branch_default --cname testapplication-test --elb-type=application --instance-types=t2.micro --min-instance=1 --max-instances=4
The initial deployment shows that the image is available and can be started:
2021/03/16 08:58:07.533988 [INFO] save docker tag command: docker tag 5812dfe24a4f redis:alpine
2021/03/16 08:58:07.533993 [INFO] save docker tag command: docker tag f8fcde8b9ae2 mysql:5.7
2021/03/16 08:58:07.533998 [INFO] save docker tag command: docker tag 1dd9b65d6a9f registry.gitlab.com/company/spikes/dockertest:latest
2021/03/16 08:58:07.534010 [INFO] Running command /bin/sh -c docker rm `docker ps -aq`
Without changing anything to the local repository and the remote docker image on the private registry, lets do a redeployment which will trigger the error:
eb deploy testapplication-test
This will fail with the following output:
...
2021-03-16 10:02:28 INFO Command execution completed on all instances. Summary: [Successful: 0, Failed: 1].
2021-03-16 10:02:29 ERROR Unsuccessful command execution on instance id(s) 'i-0dc445d118ac14b80'. Aborting the operation.
2021-03-16 10:02:29 ERROR Failed to deploy application.
ERROR: ServiceError - Failed to deploy application.
And logs of the instance show (/var/log/eb-engine.log
):
Pulling redis (redis:alpine)...
Pulling mysql (mysql:5.7)...
Pulling project.dockertest (registry.gitlab.com/company/spikes/dockertest:latest)...
Get https://registry.gitlab.com/v2/company/spikes/dockertest/manifests/latest: denied: access forbidden
2021/03/16 10:02:25.902479 [INFO] Executing cleanup logic
Steps I've tried to debug or solve the issue
- Rename dockercfg to .dockercfg on S3 (somewhere mentioned on the internet as possible solution)
- Use the 'old' docker config format instead of the one generated by docker 1.7+. But later on I figured out that Amazon Linux 2-instances are compatible with the new format together with Dockerrun v3
- Having an incorrectly formatted dockercfg on S3 will cause an error deployment regarding the misformatted file (so it actually does something with the dockercfg from S3)
Documentation
I'm out of debug options, and I've no idea where to look any further to debug this problem. Perhaps someone can see what is going wrong here?