20
votes

I am trying to setup aws ecs fargate deployment configuration. I was able to run containers without container health check. But, I want to run container health checks too. I tried all possible scenarios to achieve this. But, no luck.

Container Health Check Command

i tried with the below aws recommeded commands to verify container health checks from the listed url.

https://docs.aws.amazon.com/AmazonECS/latest/developerguide/task_definition_parameters.html#container_definition_healthcheck

  1. [ "CMD-SHELL", "curl -f http://localhost/ || exit 1" ]
  2. [ "CMD-SHELL" "curl -f 127.0.0.1 || exit 1" ]

I tried with above both commands. But, none of them are working as expected. Please help me to receive container valid health checkup commands

Below is my DockerFile

    FROM centos:latest
    RUN yum update -y
    RUN yum install httpd httpd-tools curl -y
    EXPOSE 80
    CMD ["/usr/sbin/httpd", "-D", "FOREGROUND"]
    HEALTHCHECK CMD curl --fail http://localhost:80/ || exit 1
    FROM microsoft/dotnet:2.1-aspnetcore-runtime AS base
    WORKDIR /app
    EXPOSE 80
    FROM microsoft/dotnet:2.1-sdk AS build
    WORKDIR /DockerDemoApi
    COPY ./DockerDemoApi.csproj DockerDemoApi/
    RUN dotnet restore DockerDemoApi/DockerDemoApi.csproj
    COPY . .
    WORKDIR /DockerDemoApi
    RUN dotnet build DockerDemoApi.csproj -c Release -o /app
    FROM build AS publish
    RUN dotnet publish DockerDemoApi.csproj -c Release -o /app
    FROM base AS final
    WORKDIR /app
    COPY --from=publish /app .
    ENTRYPOINT ["dotnet", "DockerDemoApi.dll"]

I have added curl command inside my container and its working. But, if i keep the same command in AWS Healthcheck task, its failing.

Task Definition JSON:

    {
     "ipcMode": null,
     "executionRoleArn": "arn:aws:iam::xxxx:role/ecsTaskExecutionRole",
     "containerDefinitions": [{
     "dnsSearchDomains": null,
     "logConfiguration": {
     "logDriver": "awslogs",
     "secretOptions": null,
     "options": {
      "awslogs-group": "/ecs/mall-health-check-task",
      "awslogs-region": "ap-south-1",
      "awslogs-stream-prefix": "ecs"
     }
     },
     "entryPoint": [],
     "portMappings": [
     {
      "hostPort": 80,
      "protocol": "tcp",
      "containerPort": 80
     }
     ],
     "command": [],
     "linuxParameters": null,
     "cpu": 256,
     "environment": [],
     "resourceRequirements": null,
     "ulimits": null,
     "dnsServers": null,
     "mountPoints": [],
     "workingDirectory": null,
     "secrets": null,
     "dockerSecurityOptions": null,
     "memory": null,
     "memoryReservation": 512,
     "volumesFrom": [],
     "stopTimeout": null,
     "image": "xxxx.dkr.ecr.ap-south-
     1.amazonaws.com/autoaml/api/dev/alpine:latest",
     "startTimeout": null,
     "dependsOn": null,
     "disableNetworking": null,
     "interactive": null,
     "healthCheck": null,
     "essential": true,
     "links": [],
     "hostname": null,
     "extraHosts": null,
     "pseudoTerminal": null,
     "user": null,
     "readonlyRootFilesystem": null,
     "dockerLabels": null,
     "systemControls": null,
     "privileged": null,
     "name": "sample-app"
     }
     ],
     "placementConstraints": [],
     "memory": "512",
     "taskRoleArn": "arn:aws:iam::xxxx:role/ecsTaskExecutionRole",
     "compatibilities": [
     "EC2",
     "FARGATE"
     ],
     "taskDefinitionArn": "arn:aws:ecs:ap-south-1:xxx:task-definition/mall- 
     health-check-task:9",
     "family": "mall-health-check-task",
     "requiresAttributes": [{
     "targetId": null,
     "targetType": null,
     "value": null,
     "name": "ecs.capability.execution-role-ecr-pull"
     },
     {
     "targetId": null,
     "targetType": null,
     "value": null,
     "name": "com.amazonaws.ecs.capability.docker-remote-api.1.18"
     },
     {
     "targetId": null,
     "targetType": null,
     "value": null,
     "name": "ecs.capability.task-eni"
     },
     {
     "targetId": null,
     "targetType": null,
     "value": null,
     "name": "com.amazonaws.ecs.capability.ecr-auth"
     },
     {
     "targetId": null,
     "targetType": null,
     "value": null,
     "name": "com.amazonaws.ecs.capability.task-iam-role"
     },
     {
     "targetId": null,
     "targetType": null,
     "value": null,
     "name": "ecs.capability.execution-role-awslogs"
     },
     {
     "targetId": null,
     "targetType": null,
     "value": null,
     "name": "com.amazonaws.ecs.capability.logging-driver.awslogs"
     },
     {
     "targetId": null,
     "targetType": null,
     "value": null,
     "name": "com.amazonaws.ecs.capability.docker-remote-api.1.21"
     },
     {
     "targetId": null,
     "targetType": null,
     "value": null,
     "name": "com.amazonaws.ecs.capability.docker-remote-api.1.19"
     }
     ],
     "pidMode": null,
     "requiresCompatibilities": [
     "FARGATE"
     ],
     "networkMode": "awsvpc",
     "cpu": "256",
     "revision": 9,
     "status": "ACTIVE",
     "proxyConfiguration": null,
     "volumes": []
     }
4
Is your health check failing or Fargate throwing an error?Haran
Yes. It is showing healthstatus as UNKNOWN always.Arjun
I do not see any issue with the command you are passing unless some issue with Container health. One more info, did you mark your container as 'essential=true"Haran
i didn't add that essential=true @Haran. But, what that will do?.Arjun
i added essential=true and still its showing sameArjun

4 Answers

9
votes

The Documentation mentions the following:

When registering a task definition in the AWS Management Console, use a comma separated list of commands which will automatically converted to a string after the task definition is created. An example input for a health check could be:

CMD-SHELL, curl -f http://localhost/ || exit 1

When registering a task definition using the AWS Management Console JSON panel, the AWS CLI, or the APIs, you should enclose the list of commands in brackets. An example input for a health check could be:

[ "CMD-SHELL", "curl -f http://localhost/ || exit 1" ]

enter image description here

Did you verify your health check command? I mean, http://127.0.0.0 is valid, right? Check your container returns success response when you hit http://127.0.0.0 (without port).

Below is the example Task Definition. This is to start the tomcat server in a container and checking the health (localhost:8080)

  1. Modify the task definition as per needs (like Role Arn )
  2. Create an ECS Service and map the task definition.
  3. Create the configured log group.
  4. Start ECS service and your task should show as Healthy.
{
  "ipcMode": null,
  "executionRoleArn": "arn:aws:iam::accountid:role/taskExecutionRole",
  "containerDefinitions": [
      {
          "dnsSearchDomains": null,
          "logConfiguration": {
              "logDriver": "awslogs",
              "secretOptions": null,
              "options": {
                  "awslogs-group": "/test/test-task",
                  "awslogs-region": "us-east-2",
                  "awslogs-stream-prefix": "test"
              }
          },
          "entryPoint": null,
          "portMappings": [
              {
                  "hostPort": 8080,
                  "protocol": "tcp",
                  "containerPort": 8080
              }
          ],
          "command": null,
          "linuxParameters": null,
          "cpu": 0,
          "environment": [],
          "resourceRequirements": null,
          "ulimits": null,
          "dnsServers": null,
          "mountPoints": [],
          "workingDirectory": null,
          "secrets": null,
          "dockerSecurityOptions": null,
          "memory": null,
          "memoryReservation": null,
          "volumesFrom": [],
          "stopTimeout": null,
          "image": "tomcat",
          "startTimeout": null,
          "dependsOn": null,
          "disableNetworking": false,
          "interactive": null,
          "healthCheck": {
              "retries": 3,
              "command": [
                  "CMD-SHELL",
                  "curl -f http://localhost:8080/ || exit 1"
              ],
              "timeout": 5,
              "interval": 30,
              "startPeriod": null
          },
          "essential": true,
          "links": null,
          "hostname": null,
          "extraHosts": null,
          "pseudoTerminal": null,
          "user": null,
          "readonlyRootFilesystem": null,
          "dockerLabels": null,
          "systemControls": null,
          "privileged": null,
          "name": "tomcat"
      }
  ],
  "memory": "1024",
  "taskRoleArn": "arn:aws:iam::accountid:role/taskExecutionRole",
  "family": "test-task",
  "pidMode": null,
  "requiresCompatibilities": [
      "FARGATE"
  ],
  "networkMode": "awsvpc",
  "cpu": "512",
  "proxyConfiguration": null,
  "volumes": []
}
7
votes

The docker image you are using, does it have curl installed part of package?.

Based on your screenshot, it looks like you are using httpd:2.4 docker image directly. If so, then curl is not part of the package.

You need to create your own docker image from above httpd:2.4 as base. Below is sample Dockerfile content to get curl part of image.

Example -

FROM httpd:2.4
RUN apt-get update; \
    apt-get install -y --no-install-recommends curl;

then build the image and push it your dockerhub account or private docker repo.

docker build -t my-apache2 .
docker run -dit --name my-running-app -p 80:80 my-apache2

Now with above image, you should be able to get healthcheck command working.

https://hub.docker.com/_/httpd

https://github.com/docker-library/httpd/blob/master/2.4/Dockerfile

1
votes

I don't know why but change http://localhost by http://127.0.0.1 (not only 127.0.0.1) use to fix the problem.

I followed what was suggested here and it fixed my health check issues.

1
votes

Was facing the same issue and found a solution for my usecase:

Three containers in one task definition, which are

  1. Nginx sidecar
  2. Two NodeJs Applications

Using ecs-params.yml file to declare healthchecks:

version: 1
  task_definition:
  task_execution_role: ecsTaskExecutionRole
  ecs_network_mode: awsvpc
  task_size:
    mem_limit: 2GB
    cpu_limit: 1024
  services:
    nginx-sidecar:
      healthcheck:
        test: curl -f http://localhost || exit 0
        interval: 10s
        timeout: 3s
        retries: 3
        start_period: 5s

    <service 2>:
      healthcheck:
          test: curl -f http://localhost:3023 || exit 0
          interval: 10s
          timeout: 3s
          retries: 3
          start_period: 5s

    <service 3>:
      healthcheck:
        test:  ["CMD", "curl", "-f", "http://localhost:3019/health"]
        interval: 10s
        timeout: 3s
        retries: 3
        start_period: 5s

Make sure that curl is available in your docker file and you are able to call it locally too

My Dockerfile:

FROM node:14.17-alpine

RUN apk add --update curl

You can include either of these commands for healthcheck in ecs-params.yml:

 test: curl -f http://localhost || exit 0

 test:  ["CMD", "curl", "-f", "http://localhost"]

Both are valid in my usecase. Hope this helps since none of the other answers were working for me.