In short, I want to deploy my Nginx and Node.js Docker images to AWS ECS, and I'm using Terraform to build the infrastructure. However, the task running on the server keeps failing, and I get 503 Service Temporarily Unavailable when accessing my domain bb-diner-api-https.shaungc.com.
(You can see my entire project repo here, but I'll embed links below and walk you through specific related files.)
After terraform apply, it reports 15 resources created, and I can see the service and task running in the ECS web console. However, the task always fails after a while, because the health check always fails.
For the nodejs container, I get exit code 137, which means the container was killed by a signal (137 = 128 + 9, i.e. SIGKILL). This suggests nodejs is not the root cause: nginx failed too many health checks, so the task was stopped and nodejs was terminated along with it. For nginx, View logs in CloudWatch shows no messages at all (I did set up awslogs in the task definition).
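To make the reading of 137 concrete: by convention, an exit status above 128 means the process was terminated by signal number (status - 128). The sketch below decodes a status that way. describe_exit_code is a hypothetical helper name, not part of any AWS or Docker tooling, and it assumes a Linux host where signal 9 is SIGKILL.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Explain a container exit status using the 128 + N signal convention.

    Hypothetical helper for illustration only.
    """
    if code > 128:
        signum = code - 128
        try:
            # Look up the symbolic name of the signal on this platform.
            name = signal.Signals(signum).name
        except ValueError:
            # Not a signal number known to this platform.
            name = f"signal {signum}"
        return f"killed by {name}"
    if code == 0:
        return "exited normally"
    return f"exited with error status {code}"
```

For 137 this reports SIGKILL. That only says something killed the container; on its own it does not distinguish between the task being stopped after failed health checks and, for example, an out-of-memory kill.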
My health check setting
Task definition container health check
Basically, I prepared a route in nginx just for the health check. In the task definition's container_definition (JSON format), I have a health check on the nginx container like this:

    "command": ["CMD-SHELL","curl -f http://localhost/health-check || exit 1"]

and in my nginx.conf I have:
    ...
    server {
        listen 80;
        ...
        location /health-check {
            # access_log off;
            return 200 "I'm healthy!"; # refer to https://serverfault.com/questions/518220/nginx-solution-for-aws-amazon-elb-health-checks-return-200-without-if
        }
    }
So I really don't know why the task is failing health checks.
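One way to narrow this down is to test the route separately from the check command. The container health check runs inside the nginx container, so it depends both on nginx answering and on curl being available there. Below is a minimal Python sketch of roughly what `curl -f <url> || exit 1` decides: 0 for a successful response, 1 for an HTTP error or a connection failure. health_check is a hypothetical helper name, and unlike curl it follows redirects.

```python
import http.client
import urllib.error
import urllib.request


def health_check(url: str, timeout: float = 5.0) -> int:
    """Return 0 if the URL answers without an HTTP error, else 1.

    Hypothetical helper that approximates `curl -f <url> || exit 1`.
    """
    # Ignore proxy environment variables so the request goes straight to the host.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout) as resp:
            return 0 if resp.status < 400 else 1
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # HTTP 4xx/5xx raise HTTPError (a URLError subclass);
        # refused connections and timeouts end up here as well.
        return 1
```

Calling health_check("http://localhost/health-check") against a locally run copy of the nginx image (with port 80 published) separates two cases. If it returns 1, nginx is not serving the route. If it returns 0, the route works, and the check command itself deserves a look, for example whether curl is actually installed in the image, since a missing binary also makes the CMD-SHELL command exit non-zero.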
Target Group health check for load balancer
I also created an Application Load Balancer so that I can point my Route 53 domain name at it. I noticed there is another place that performs health checks: the target group of the Application Load Balancer. The check fails here too, and my instance's status is draining.
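The target group also records a reason for each target's state, which can be more informative than the status shown in the console. The sketch below lists targets that are not healthy. It assumes a boto3 elbv2 client is passed in (its describe_target_health call returns TargetHealthDescriptions); unhealthy_targets is a hypothetical helper name, and the target group ARN is a placeholder to fill in.

```python
from typing import Any, List, Tuple


def unhealthy_targets(elbv2_client: Any, target_group_arn: str) -> List[Tuple[str, str, str]]:
    """Return (target id, state, reason) for every target that is not healthy.

    Hypothetical helper; elbv2_client is assumed to be boto3.client("elbv2").
    """
    resp = elbv2_client.describe_target_health(TargetGroupArn=target_group_arn)
    results = []
    for desc in resp["TargetHealthDescriptions"]:
        health = desc["TargetHealth"]
        if health["State"] != "healthy":
            # Reason is absent for healthy targets, so fall back to an empty string.
            results.append((desc["Target"]["Id"], health["State"], health.get("Reason", "")))
    return results


# Assumed usage (requires boto3 and AWS credentials; the ARN is a placeholder):
#   import boto3
#   print(unhealthy_targets(boto3.client("elbv2"), "<target-group-arn>"))
```

A reason that points at failed health checks or a timeout would suggest the load balancer cannot get a good response from the target on the configured port and path, whereas a deregistration reason on a draining target only reflects that the task is already being stopped.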
Security Group
I think I opened all the possible ports.
So Why Does the Health Check Fail & What Else Is Missing?
Many articles point to the Nginx configuration, the port, or inbound restrictions on AWS (security group/target group) as the common causes, and I have looked at all of them. I let nginx listen on port 80, set the container port to 80, and allowed a wide range of inbound ports in the security group. What else could I be missing?