I am trying to set up an ECS service that runs a container image on a cluster, but I cannot get the setup working.

I basically followed the guide at https://docs.aws.amazon.com/AmazonECS/latest/developerguide/create-blue-green.html, except that I am hosting the containers on EC2 instances instead.

I wonder whether the issue is related to the network mode (I used "awsvpc").

Expectation

Accessing the ALB link should show the content of index.html.

Observation

When I access the load balancer link, it returns HTTP 503, and the health check also shows the target as unhealthy.

ALB_Link_HTTP_503

It also seems that ECS keeps "re-creating" the containers. (Forgive me, as I am still not familiar with ECS.)

Containers_keep_re-creating

I also tried to access the container instance directly, but could not reach it either.

Container_instance_link

Container_instance_could_not_reach

I had a look at the ECS agent log (/var/logs/ecs-agent.log) on the container instance; the image appears to have been pulled successfully.

Image_pulled_successfully

And the task should have been started

ECS service events

It seems the service keeps registering and deregistering the target.

ECS_service_events

Security groups have been set to accept HTTP traffic

Setup

The Tomcat server in the container listens on port 80.

  • ALB

  • Listener

  • Target group

ECS task definition creation

{
    "family": "TestTaskDefinition",
    "networkMode": "awsvpc",
    "containerDefinitions": [
        {
            "name": "TestContainer",
            "image": "<Image URI>",
            "portMappings": [
                {
                    "containerPort": 80,
                    "hostPort": 80,
                    "protocol": "tcp"
                }
            ],
            "essential": true
        }
    ],
    "requiresCompatibilities": [
        "EC2"
    ],
    "cpu": "256",
    "memory": "512",
    "executionRoleArn": "<ECS execution role ARN>"
}

ECS service creation

{
    "cluster": "TestCluster",
    "serviceName": "TestService",
    "taskDefinition": "TestTaskDefinition",
    "loadBalancers": [
        {
            "targetGroupArn": "<target group ARN>",
            "containerName": "TestContainer",
            "containerPort": 80
        }
    ],
    "launchType": "EC2",
    "schedulingStrategy": "REPLICA",
    "deploymentController": {
        "type": "CODE_DEPLOY"
    },
    "networkConfiguration": {
        "awsvpcConfiguration": {
            "assignPublicIp": "DISABLED",
            "securityGroups": [ "sg-0f9b629686ca3bd08" ],
            "subnets": [ "subnet-05f47b367df4f50d4", "subnet-0fd76fc8e47ea3be7" ]
        }
    },
    "desiredCount": 1
}
Since you've disabled assignPublicIp, how do you ensure internet connectivity to download Docker images? - Marcin
I think "assignPublicIp" should be disabled for the EC2 launch type. If you specify ENABLED, you get the error "An error occurred (InvalidParameterException) when calling the CreateService operation: Assign public IP is not supported for this launch type." - Patrick C.
Did you go to the ECS Service tab in the console and look at the events? They can have more info on what's happening. - Marcin
I have added a capture of the ECS service events. It seems the target keeps being registered and deregistered in the target group. - Patrick C.
I tried creating a service without a load balancer, made the following changes, and it works. (1) Changed the network mode from "awsvpc" to "bridge". I understand that the Fargate launch type must use the "awsvpc" network mode, but I am not sure if EC2 launch type must use "bridge". (2) Changed the service health-check period to a larger value (e.g. 300s). It seems ECS treated the task as unhealthy while the application had not yet finished starting up, which resulted in the task being stopped and started repeatedly. Anyway, I finally got it working with the ALB, as well as in a CodePipeline flow, thanks. - Patrick C.

1 Answer

Based on the comments.

To investigate the issue, it was recommended to test the ECS service without the ALB. The test showed that the ECS service was being treated as unhealthy by the ALB health checks because the application takes a long time to start.

The issue was solved by increasing the health-check grace period that the ECS service applies to ALB health checks (e.g. to 300s).
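As a rough sketch of what that change could look like in the service definition from the question, the grace period is set with the `healthCheckGracePeriodSeconds` field of the create-service input. The value 300 is only the example figure from the comments, and the fragment is abbreviated: the remaining fields from the question (launch type, deployment controller, network configuration) are assumed to stay as they were.

```json
{
    "cluster": "TestCluster",
    "serviceName": "TestService",
    "taskDefinition": "TestTaskDefinition",
    "loadBalancers": [
        {
            "targetGroupArn": "<target group ARN>",
            "containerName": "TestContainer",
            "containerPort": 80
        }
    ],
    "healthCheckGracePeriodSeconds": 300,
    "desiredCount": 1
}
```

During the grace period the ECS scheduler ignores failing load balancer health checks for a newly started task, so a slow-starting Tomcat is not stopped before it has finished booting. The right value depends on how long the application actually takes to start.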

not sure if EC2 launch type must use "bridge"

You can use awsvpc on EC2 instances as well, but bridge is easier to use in this case.
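For illustration, a bridge-mode variant of the task definition from the question might look like the sketch below. The poster did not share the final configuration, so this is an assumption: only `networkMode` is changed, and everything else is copied from the question.

```json
{
    "family": "TestTaskDefinition",
    "networkMode": "bridge",
    "containerDefinitions": [
        {
            "name": "TestContainer",
            "image": "<Image URI>",
            "portMappings": [
                {
                    "containerPort": 80,
                    "hostPort": 80,
                    "protocol": "tcp"
                }
            ],
            "essential": true
        }
    ],
    "requiresCompatibilities": [
        "EC2"
    ],
    "cpu": "256",
    "memory": "512",
    "executionRoleArn": "<ECS execution role ARN>"
}
```

Two things to keep in mind with this variant: the service definition should no longer carry the `networkConfiguration` block, since that block applies to awsvpc tasks, and the target group needs the `instance` target type, whereas awsvpc tasks are registered by IP address and need the `ip` target type.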