I've read the ECS Monitoring documentation, but I haven't found a way to alert on an ECS task hitting its memory limit using CloudWatch Events or Metrics. My situation: an ECS container exceeds the default task hard limit of 512 MB and restarts. A CloudWatch Event fires on the ECS Task state change, e.g. from RUNNING to STOPPED, but in the event detail the "stoppedReason"
only says "Task failed ELB health checks in ...", even though I know for certain that the actual cause was the memory limit being exceeded and Docker killing the container. Here is the Event Rule pattern:
{
  "source": [
    "aws.ecs"
  ],
  "detail-type": [
    "ECS Task State Change"
  ],
  "detail": {
    "lastStatus": [
      "STOPPED"
    ]
  }
}
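For completeness, this is roughly how I wired that rule to an SNS topic for notification (a boto3 sketch; the rule name, topic ARN, and region are placeholders, not my real resources):

import json
import boto3

events = boto3.client("events", region_name="us-east-1")

# The same pattern as above.
pattern = {
    "source": ["aws.ecs"],
    "detail-type": ["ECS Task State Change"],
    "detail": {"lastStatus": ["STOPPED"]},
}

# Create (or update) the rule.
events.put_rule(
    Name="ecs-task-stopped",
    EventPattern=json.dumps(pattern),
    State="ENABLED",
)

# Deliver matched events to an SNS topic that notifies me.
events.put_targets(
    Rule="ecs-task-stopped",
    Targets=[{"Id": "sns-notify", "Arn": "arn:aws:sns:us-east-1:123456789012:ecs-alerts"}],
)

The rule fires as expected, it's just that the stoppedReason in the delivered event doesn't reflect the memory kill.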
The CloudWatch MemoryUtilization metric with the ServiceName dimension doesn't help much either, because the minimum alarm period is 1 minute, while the container kill-restart cycle completes faster than that, so there isn't enough time to catch the spike. I assume the same applies to the ClusterName dimension (i.e. to the entire cluster).
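For reference, this is roughly the alarm I experimented with (boto3 again; cluster, service, threshold, and topic names are placeholders). It illustrates the 60-second floor on the period:

import boto3

cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")

cloudwatch.put_metric_alarm(
    AlarmName="ecs-service-memory-high",
    Namespace="AWS/ECS",
    MetricName="MemoryUtilization",
    Dimensions=[
        {"Name": "ClusterName", "Value": "my-cluster"},
        {"Name": "ServiceName", "Value": "my-service"},
    ],
    Statistic="Maximum",
    Period=60,                     # 60 seconds is the minimum for this metric
    EvaluationPeriods=1,
    Threshold=90.0,                # percent of the task's memory limit
    ComparisonOperator="GreaterThanThreshold",
    AlarmActions=["arn:aws:sns:us-east-1:123456789012:ecs-alerts"],
)

Even with Statistic="Maximum" over a single 60-second period, the spike from the killed container often doesn't show up in the datapoints.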
How can I get a notification when a Task (Container, Container Instance) hits its hard memory limit?