14
votes

Slightly tearing my hair out with this one... I am trying to run a Docker image on Fargate in a VPC in a Public subnet. When I run this as a Task I get:

ResourceInitializationError: unable to pull secrets or registry auth: pull command failed: : signal: killed

If I run the Task in a Private subnet, through a NAT, it works. It also works if I run it in a Public subnet of the default VPC.

I have checked through the advice here:

Aws ecs fargate ResourceInitializationError: unable to pull secrets or registry auth

In particular, I have the security groups and the Network ACL set up to allow all traffic. I have also been quite liberal with the IAM permissions, to try to eliminate that as a possibility:

The task execution role has:

{
    "Action": [
        "kms:*",
        "secretsmanager:*",
        "ssm:*",
        "s3:*",
        "ecr:*",
        "ecs:*",
        "ec2:*"
    ],
    "Resource": "*",
    "Effect": "Allow"
}

With trust relationship to allow ecs-tasks to assume this role:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "ecs-tasks.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}

The security group is:

sg-093e79ca793d923ab All traffic All traffic All 0.0.0.0/0

And the Network ACL is:

Inbound
Rule number Type Protocol Port range Source Allow/Deny
100 All traffic All All 0.0.0.0/0    Allow
*   All traffic All All 0.0.0.0/0    Deny

Outbound
Rule number Type Protocol Port range Destination Allow/Deny
100 All traffic All All 0.0.0.0/0    Allow
*   All traffic All All 0.0.0.0/0    Deny

I set up flow logs on the subnet, and I can see that traffic is logged as ACCEPT OK in both directions.

I do not have any Interface Endpoints set up to reach AWS services without going through the Internet Gateway.

The Fargate task is also assigned a public IP address on creation.

This should work, since the Public subnet should have access to all needed services through the Internet Gateway. It also works in the default VPC or a Private subnet.

Can anyone suggest what else I should check to debug this?

4 Answers

14
votes

One potential cause of "ResourceInitializationError: unable to pull secrets or registry auth: pull command failed: : signal: killed" is that Auto-assign public IP is disabled. After I enabled it (recreating the service from scratch), the task ran properly without issues.

4
votes

I was facing the same issue, but in my case I was triggering the Fargate container from a Lambda function using the RunTask operation, and I was not passing the parameter below:

assignPublicIp: ENABLED

After adding this, the container started without any issues.
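
For reference, here is a minimal sketch of what that RunTask call might look like from a Python Lambda using boto3. The answer above does not say which runtime was used, so Python is an assumption, and the cluster, task definition, subnet and security group values are hypothetical placeholders. The relevant part is `assignPublicIp` inside `awsvpcConfiguration`:

```python
def build_run_task_args(cluster, task_definition, subnets, security_groups):
    """Build keyword arguments for ECS RunTask on Fargate with a public IP."""
    return {
        "cluster": cluster,
        "taskDefinition": task_definition,
        "launchType": "FARGATE",
        "networkConfiguration": {
            "awsvpcConfiguration": {
                "subnets": list(subnets),
                "securityGroups": list(security_groups),
                # Without this, a task in a public subnet has no route out
                # through the Internet Gateway and cannot pull its image.
                "assignPublicIp": "ENABLED",
            }
        },
    }


def lambda_handler(event, context):
    # Imported here so the helper above can be used without AWS access.
    import boto3

    ecs = boto3.client("ecs")
    args = build_run_task_args(
        cluster="my-cluster",               # hypothetical name
        task_definition="my-task:1",        # hypothetical family:revision
        subnets=["subnet-0123456789abcdef0"],     # hypothetical public subnet
        security_groups=["sg-0123456789abcdef0"],  # hypothetical security group
    )
    response = ecs.run_task(**args)
    return {
        "taskArns": [t["taskArn"] for t in response.get("tasks", [])],
        "failures": response.get("failures", []),
    }
```

If `assignPublicIp` is omitted, RunTask defaults to `DISABLED`, which matches the behaviour described in this answer.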

2
votes

It turns out that I did not have DNS support enabled for the VPC. Once this is enabled, it works.

I did not see DNS support explicitly mentioned in any of the Fargate docs. I guess it's fairly obvious, since otherwise how would the task look up the various AWS services it needs? But I thought it worth noting in an answer against this error message.
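
In case it helps anyone checking the same thing, this is a rough sketch of how the VPC attribute could be inspected and enabled with boto3. The function name `ensure_dns_support` is my own, and the VPC ID in the usage comment is a hypothetical placeholder; `describe_vpc_attribute` and `modify_vpc_attribute` are the EC2 client calls it relies on:

```python
def ensure_dns_support(ec2, vpc_id):
    """Enable DNS support on the VPC if it is off.

    `ec2` is expected to be a boto3 EC2 client (boto3.client("ec2")).
    Returns True if the attribute had to be changed, False if it was already on.
    """
    attr = ec2.describe_vpc_attribute(VpcId=vpc_id, Attribute="enableDnsSupport")
    if attr["EnableDnsSupport"]["Value"]:
        return False
    ec2.modify_vpc_attribute(VpcId=vpc_id, EnableDnsSupport={"Value": True})
    return True


# Usage (needs AWS credentials; the VPC ID is a hypothetical placeholder):
#   import boto3
#   changed = ensure_dns_support(boto3.client("ec2"), "vpc-0123456789abcdef0")
```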

1
votes

The AWS container runner needs access to the container image registry and to AWS services.

If you're on a public subnet, the easiest option is to enable "Auto-assign public IP" so that your containers have internet access, even if your app does not need egress access to the internet.

Otherwise, if you're using only AWS services (ECR, with no images pulled from docker.io), you could use VPC endpoints to access ECR/S3/CloudWatch, and enable the DNS options on your VPC.

For a private subnet, it's the same.

If you're using docker.io images, then you need egress access to the internet from your subnet anyway.
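
As a rough guide to the VPC endpoint route mentioned above, the sketch below lists the endpoint service names usually involved when a Fargate task pulls from ECR and logs to CloudWatch. The helper name is hypothetical and the list is my understanding of the usual set, so check it against the current AWS documentation; tasks that read secrets would need further endpoints (for example Secrets Manager) that are not included here:

```python
def fargate_endpoint_services(region):
    """Return VPC endpoint service names, grouped by endpoint type."""
    prefix = f"com.amazonaws.{region}"
    return {
        # Interface endpoints: ECR API, ECR Docker registry, CloudWatch Logs.
        "Interface": [f"{prefix}.ecr.api", f"{prefix}.ecr.dkr", f"{prefix}.logs"],
        # Gateway endpoint: ECR stores image layers in S3.
        "Gateway": [f"{prefix}.s3"],
    }


for kind, names in fargate_endpoint_services("eu-west-1").items():
    for name in names:
        print(kind, name)
```

Interface endpoints resolve through private DNS, which is why the DNS options on the VPC need to be enabled for this approach.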