8 votes

I have terraformed (Terraform v0.11.10) a private Kubernetes cluster on Google Kubernetes Engine (GKE) using the following .tf:

module "nat" {
  source     = "GoogleCloudPlatform/nat-gateway/google"
  region     = "europe-west1"
  network    = "default"
  subnetwork = "default"
}

resource "google_container_node_pool" "cluster_1_np" {
  name               = "cluster-1-np"
  region             = "europe-west1"
  cluster            = "${google_container_cluster.cluster_1.name}"
  initial_node_count = 1

  lifecycle {
    ignore_changes = ["node_count"]
  }

  autoscaling {
    min_node_count = 1
    max_node_count = 50
  }

  management {
    auto_repair  = true
    auto_upgrade = true
  }

  node_config {
    oauth_scopes = [
      "https://www.googleapis.com/auth/compute",
      "https://www.googleapis.com/auth/devstorage.read_only",
      "https://www.googleapis.com/auth/logging.write",
      "https://www.googleapis.com/auth/monitoring",
      "https://www.googleapis.com/auth/pubsub",
    ]

    tags = ["${module.nat.routing_tag_regional}"]
  }
}

resource "google_container_cluster" "cluster_1" {
  provider                 = "google-beta"
  name                     = "cluster-1"
  region                   = "europe-west1"
  remove_default_node_pool = true

  private_cluster_config {
    enable_private_endpoint = false
    enable_private_nodes    = true
    master_ipv4_cidr_block  = "172.16.0.0/28"
  }

  ip_allocation_policy {
    create_subnetwork = true
  }

  lifecycle {
    ignore_changes = ["initial_node_count", "network_policy", "node_config", "node_pool"]
  }

  node_pool {
    name = "default-pool"
  }

  addons_config {
    http_load_balancing {
      disabled = false
    }

    horizontal_pod_autoscaling {
      disabled = false
    }
  }

  master_authorized_networks_config {
    cidr_blocks = [
      {
        cidr_block   = "<MY_OFFICE_CIDR>"
        display_name = "Office"
      },
    ]
  }
}

This works great: I get a private cluster (and the NAT works, giving the nodes internet access), and machines in my office can run kubectl commands against it without any bother.

The problem I now face is integrating any web-based Continuous Integration (CI) or Continuous Deployment (CD) service. Private clusters are a new feature of the Google Cloud Platform (GCP), and the documentation is a bit lacking in this area.

My attempts thus far have completely failed; my networking knowledge is simply insufficient. I tried this solution, but it seems the automation machine must be on the same network as the proxy.

I found this similar SO question (almost exactly the same, but his is Cloud Build specific). In the comments on one of the answers to that question, the OP mentions he found a workaround in which he temporarily adds the build machine to the cluster's master authorized networks, but he has not stated exactly how he carries this out.

I attempted to replicate his workaround, but the relevant gcloud commands only seem able to replace the entire list of networks or remove them all; they do not appear to add/remove a single network at a time as his comment suggests.
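
For reference, the closest I have got is the idea of reading the current list back and re-submitting it with one extra entry, something like the untested sketch below (it assumes the describe output exposes the masterAuthorizedNetworksConfig.cidrBlocks field and that jq is available on the machine running it):

# Read back the CIDRs currently authorized on the cluster.
EXISTING=$(gcloud container clusters describe cluster-1 \
  --region europe-west1 --format=json \
  | jq -r '[.masterAuthorizedNetworksConfig.cidrBlocks[].cidrBlock] | join(",")')

# Re-submit the full list with one extra CIDR appended,
# since the update command replaces the whole list.
gcloud container clusters update cluster-1 \
  --region europe-west1 \
  --enable-master-authorized-networks \
  --master-authorized-networks "${EXISTING},<CIDR_TO_ADD>"

But I do not know whether that is the intended approach, or how to do it safely from a CI job whose public IP changes on every build.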

Help from networking wizards would be much appreciated.


1 Answer

4 votes

This is a common problem when interfacing with CI systems like CircleCI or Travis that live in the public cloud. You can use this command to update your master authorized networks:

gcloud container clusters update [CLUSTER_NAME] \
  --enable-master-authorized-networks \
  --master-authorized-networks=<MY_OFFICE_CIDR>,<NEW-CIDR-FROM-CI> \
  --zone=<your-zone>
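
If your CI provider does not publish a stable egress range, one option (an assumption about your setup on my part, not something GKE provides) is to have the build discover its own public IP at run time and pass that in as <NEW-CIDR-FROM-CI>, e.g.:

# Ask an external "what is my IP" service for the runner's public address.
CI_IP=$(curl -s https://checkip.amazonaws.com)

# Then use "${CI_IP}/32" as <NEW-CIDR-FROM-CI> in the update command above.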

To remove the CI system's network again, run the same command but omit that CIDR from the list:

gcloud container clusters update [CLUSTER_NAME] \
  --enable-master-authorized-networks \
  --master-authorized-networks=<MY_OFFICE_CIDR> \
  --zone=<your-zone>
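
In a pipeline you would typically wrap your deploy step between those two calls, along the lines of the rough sketch below (cluster name, zone and manifest path are placeholders; the trap simply restores the office-only list even if the deploy fails):

# Temporarily whitelist the CI runner's address.
gcloud container clusters update [CLUSTER_NAME] \
  --enable-master-authorized-networks \
  --master-authorized-networks=<MY_OFFICE_CIDR>,<NEW-CIDR-FROM-CI> \
  --zone=<your-zone>

# Restore the original list when the script exits, successful deploy or not.
trap 'gcloud container clusters update [CLUSTER_NAME] \
  --enable-master-authorized-networks \
  --master-authorized-networks=<MY_OFFICE_CIDR> \
  --zone=<your-zone>' EXIT

# Fetch credentials and run the actual deployment.
gcloud container clusters get-credentials [CLUSTER_NAME] --zone=<your-zone>
kubectl apply -f <YOUR_MANIFESTS>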

To completely remove all authorized networks (i.e. disable the feature):

gcloud container clusters update [CLUSTER_NAME] \
  --no-enable-master-authorized-networks 

You can also do it from the UI:

[screenshot: authorized networks settings in the GKE console UI]

It's actually documented here.