I create an autoscaling policy and a CloudWatch alarm in Terraform v0.13, and both resources are created without errors. During load testing, the service scales up successfully, but it fails to scale down once CPU utilization has stayed at or below the alarm threshold for the evaluation period. The scaling activity reports this error:

"historySummary": "Failed to execute AutoScaling action: No step adjustment found for metric value [5.99763732496649, 2.7634547331059975] and breach delta -4.00236267503351"

Listed below are the terraform resources:

Autoscaling Policy -

resource "aws_appautoscaling_policy" "frontend_down" {
  name               = "${var.name}_frontend_scale_down"
  service_namespace  = "ecs"
  resource_id        = "service/${aws_ecs_cluster.main.name}/${aws_ecs_service.frontend.name}"
  scalable_dimension = "ecs:service:DesiredCount"

  step_scaling_policy_configuration {
    adjustment_type         = "ChangeInCapacity"
    cooldown                = 30
    metric_aggregation_type = "Maximum"

    step_adjustment {
      metric_interval_lower_bound = 0
      scaling_adjustment          = -1
    }
  }

  depends_on = [aws_appautoscaling_target.frontend_target]
}

CloudWatch Alarm -

resource "aws_cloudwatch_metric_alarm" "frontend_service_cpu_low" {
  alarm_name          = "${var.name}_cpu_utilization_low_fe"
  comparison_operator = "LessThanOrEqualToThreshold"
  evaluation_periods  = "2"
  metric_name         = "CPUUtilization"
  namespace           = "AWS/ECS"
  period              = "60"
  statistic           = "Average"
  threshold           = "10"

  dimensions = {
    ClusterName = var.ecs_cluster_name
    ServiceName = var.ecs_service_name_frontend
  }

  alarm_actions = [var.autoscaling_down_arn_frontend]

  tags = {
    Name        = "${var.name}-autoscaling"
    BillingCode = var.billing_code_tag
    Environment = var.environment_tag
  }
}

1 Answer


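I figured out the cause: in the scale-down policy I was using 'metric_interval_lower_bound' instead of 'metric_interval_upper_bound'. Step intervals are matched against the breach delta, which is the metric value minus the alarm threshold. When scaling down, the metric is below the threshold, so the delta is negative (here 5.998 - 10 = -4.002) and 0 has to be the upper bound of the interval. With 0 as the lower bound, the interval only covers positive deltas, so no step adjustment matches. When scaling up, the delta is positive, so 0 is used as the lower bound.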
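
As a sketch of the fix described above, the scale-down policy would look roughly like this. Everything is copied from the question except the bound inside step_adjustment; the resource and variable names are the asker's, and this has not been run against a real account.

```hcl
resource "aws_appautoscaling_policy" "frontend_down" {
  name               = "${var.name}_frontend_scale_down"
  service_namespace  = "ecs"
  resource_id        = "service/${aws_ecs_cluster.main.name}/${aws_ecs_service.frontend.name}"
  scalable_dimension = "ecs:service:DesiredCount"

  step_scaling_policy_configuration {
    adjustment_type         = "ChangeInCapacity"
    cooldown                = 30
    metric_aggregation_type = "Maximum"

    # Scale-in: the breach delta (metric value - alarm threshold) is negative,
    # so the step interval must extend downward from 0 rather than upward.
    step_adjustment {
      metric_interval_upper_bound = 0
      scaling_adjustment          = -1
    }
  }

  depends_on = [aws_appautoscaling_target.frontend_target]
}
```

With the alarm threshold at 10 and a metric value of about 5.998, the breach delta of about -4.002 now falls inside the interval that ends at 0, so the -1 adjustment should apply.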