My question is about alerting policies when monitoring AWS metrics with Stackdriver. I tried the setup below, but the alerting policy does not fire. How can I get alerts from a policy that applies to a group? I don't want to monitor instances individually; I want the policy to apply to the group.

  • I finished setting up Stackdriver monitoring for my AWS account using role-based access. Next, I created a group-based alerting policy on the metrics below.
  • load average > 5
  • disk usage > 80%

The targets are several EC2 instances, and they belong to the group.

  1. I finished configuring the policy, then ran a stress test.
  2. I looked at the metrics, and the graph exceeded the threshold.
  3. However, no alert was sent and no incident was opened.

Details are below.

Alert(Policy) Creation

  1. Go to [Alerting/ Policies/ TARGET POLICY].
  2. Click [Add Condition], then select [Metric Threshold].
  3. RESOURCE TYPE is Instance(EC2)
  4. APPLIES TO is Group
  5. Select the group. This group includes the EC2 instances.
  6. CONDITION TRIGGERS IF: Any Member Violates
  7. IF METRIC is [CPU Load Average(past 1m)]
  8. CONDITION is above
  9. THRESHOLD is 5 load
  10. FOR is 1 minute
  11. Enter a name and click [Save Policy].
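
To make the intended behavior of this condition concrete, here is a minimal Python sketch of what "Any Member Violates" with a threshold of 5 held for 1 minute should mean. It is only my reading of the settings above, not Stackdriver's actual evaluation logic. The instance IDs, the load readings, and the assumption of one sample per minute are all hypothetical.

```python
from typing import Dict, List

THRESHOLD = 5.0   # "THRESHOLD is 5 load"
FOR_SAMPLES = 1   # "FOR is 1 minute", assuming one sample per minute


def member_violates(samples: List[float], threshold: float = THRESHOLD,
                    for_samples: int = FOR_SAMPLES) -> bool:
    """True if the most recent `for_samples` readings are all above the threshold."""
    if len(samples) < for_samples:
        return False
    return all(value > threshold for value in samples[-for_samples:])


def violating_members(group: Dict[str, List[float]]) -> List[str]:
    """Return the IDs of group members whose readings violate the condition."""
    return [instance_id for instance_id, samples in group.items()
            if member_violates(samples)]


# Hypothetical instance IDs and 1-minute load readings, for illustration only.
group = {
    "i-aaaa": [0.4, 0.6, 6.2],
    "i-bbbb": [0.3, 0.2, 0.5],
}

# "Any Member Violates": one violating member is enough to open an incident.
print(violating_members(group))
```

With a group policy I would expect an incident as soon as this list is non-empty, which is what I see for GCP instances but not for the EC2 group.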

Test of Stress

  1. SSH to the target instances.
  2. Run a stress test.
  3. Confirm that the load average rose above 5.
  4. However, no alert was sent.
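
The check in step 3 can also be done from a script on the instance. This is a small sketch using Python's standard `os.getloadavg()` (Unix only); the `exceeds_threshold` helper is a hypothetical name that just mirrors the policy's "above 5" comparison.

```python
import os

THRESHOLD = 5.0  # same value as the policy threshold


def exceeds_threshold(load_1m: float, threshold: float = THRESHOLD) -> bool:
    """Mirror the policy's 'above' comparison for a single reading."""
    return load_1m > threshold


if __name__ == "__main__":
    # os.getloadavg() returns the 1-, 5- and 15-minute load averages.
    load_1m, load_5m, load_15m = os.getloadavg()
    print(f"1m load average: {load_1m:.2f}")
    print("above threshold" if exceeds_threshold(load_1m) else "below threshold")
```

This only confirms the local reading; it does not show whether the agent delivered the metric to Stackdriver.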

Confirm in Stackdriver

  1. On the alert settings page, I confirmed that the load average rose above 5.
  2. However, no incident was opened.

Other settings I tried

  • For GCP instances, alerts work correctly with both group and single-instance settings.
  • For AWS instances, alerts work with a single-instance configuration, but not with group settings.

Version info

  • stackdriver
    • stackdriver-agent version: stackdriver-agent.x86_64 5.5.2-366.amzn1
  • aws
    • OS: Amazon Linux
    • VERSION: 2016.03
    • ID_LIKE: rhel fedora

If you need more detail, please ask in the comments.

1 Answer

If the agent wasn't configured correctly and is sending metrics to the wrong project, this could lead to the behavior described: alerts work for single instances but not for a group of instances, because any alert that uses a group filter stops working. It probably works for GCP because monitoring GCE instances requires zero setup.

https://cloud.google.com/monitoring/agent/troubleshooting#verify-project "If you are using an Amazon EC2 VM instance, or if you are using private-key credentials on your Google Compute Engine instance, then the credentials could be invalid or they could be from the wrong project. For AWS accounts, the project used by the agent must be the AWS connector project, typically named "AWS Link..."."

The instructions at https://cloud.google.com/monitoring/agent/troubleshooting#verify-running help verify that the agent is sending metrics correctly.
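
As a rough way to check the "wrong project" case on the instance itself, the sketch below reads the `project_id` field from the agent's private-key credentials file and compares it with the project you expect. The file path is an assumption (a commonly used location for the agent's credentials; yours may differ), and the function names and project ID are hypothetical.

```python
import json
from pathlib import Path
from typing import Optional

# Assumed location of the agent's private-key credentials; adjust if yours differs.
DEFAULT_CREDENTIALS = Path("/etc/google/auth/application_default_credentials.json")


def credentials_project(path: Path = DEFAULT_CREDENTIALS) -> Optional[str]:
    """Return the project_id recorded in a service-account key file, if present."""
    with path.open() as handle:
        return json.load(handle).get("project_id")


def matches_connector_project(path: Path, expected_project: str) -> bool:
    """True if the credentials belong to the expected (AWS connector) project."""
    return credentials_project(path) == expected_project
```

For example, `matches_connector_project(DEFAULT_CREDENTIALS, "my-aws-link-project")` (a made-up project ID) returning `False` would suggest the agent is authenticating against a different project than the AWS connector project.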