I'm trying to set up Google Cloud Composer monitoring via Terraform, and this is my "hello world" code (it works, but it does not fulfill my acceptance criteria):
terraform {
  required_providers {
    google = {
      source  = "hashicorp/google"
      version = "3.5.0"
    }
  }
}

provider "google" {
  credentials = "some_credentials"
  project     = "some_project"
  region      = "some_region"
  zone        = "some_zone"
}

# Email channel that alert notifications are sent to.
resource "google_monitoring_notification_channel" "basic" {
  display_name = "Test name"
  type         = "email"
  labels = {
    email_address = "[email protected]"
  }
}

# Alert when any Cloud Composer task run ends in the "failed" state.
resource "google_monitoring_alert_policy" "cloud_composer_job_fail_monitor" {
  combiner              = "OR"
  display_name          = "Fails testing on cloud composer tasks"
  notification_channels = [google_monitoring_notification_channel.basic.id]

  conditions {
    display_name = "Failures count"
    condition_threshold {
      # Count of Composer task runs in the "failed" state for this project/region.
      filter          = "resource.type=\"cloud_composer_workflow\" AND metric.type=\"composer.googleapis.com/workflow/task/run_count\" AND resource.label.\"project_id\"=\"some_project\" AND metric.label.\"state\"=\"failed\" AND resource.label.\"location\"=\"some_region\""
      duration        = "60s"
      comparison      = "COMPARISON_GT"
      threshold_value = 0

      aggregations {
        alignment_period   = "3600s"
        per_series_aligner = "ALIGN_COUNT"
      }
    }
  }

  documentation {
    content = "Please checkout current incident"
  }
}
Problem: By default, notifications are sent only when an alerting policy is triggered or resolved (per the Google docs).
My question: I want to get an alert notification every 30 minutes (for example) for as long as Cloud Composer jobs keep failing, until I or someone else resolves the incident (or I need to understand why the incident is not resolved automatically once the jobs stop failing).
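From what I can tell, newer versions of the google provider add an alert_strategy block to google_monitoring_alert_policy with a notification_channel_strategy.renotify_interval field, which looks like it could re-send notifications while an incident stays open. Below is a minimal, unverified sketch of what I have in mind; it assumes a provider version much newer than the 3.5.0 pinned above, and the 1800s values are just the 30-minute example from my question:

resource "google_monitoring_alert_policy" "cloud_composer_job_fail_monitor" {
  # ... same combiner, display_name, notification_channels, conditions
  # and documentation as in the policy above ...

  alert_strategy {
    # Re-send notifications to this channel every 30 minutes while the
    # incident remains open.
    notification_channel_strategy {
      notification_channel_names = [google_monitoring_notification_channel.basic.name]
      renotify_interval          = "1800s"
    }

    # Optionally auto-close open incidents after the policy has received
    # no data for this long.
    auto_close = "1800s"
  }
}

On the auto-resolve part, I suspect the 3600s alignment_period with ALIGN_COUNT keeps the aligned "failed" count above zero for up to an hour after the last failure, so the incident would only clear once that window contains no failed runs; I would appreciate confirmation of that as well.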
Can someone help with this issue?
Thank you for your help!
If you use OR as the combiner, you should always be notified when one of the conditions is met. If the policy is not being triggered, you may need to check the logs. - Alex G