
I'm observing a strange behaviour of a newly created cluster in GKE. Just after creating it, there is one node. When I create my first namespace, it autoscales up to 2 nodes, although resource usage on the first node is still very low. What could be the cause of this, and how can I prevent it? I created my cluster with the following definition (using the Python API):

            cluster={
                "name": "mycluster",
                "initial_cluster_version": "latest",
                "network_policy": {
                    "enabled": True,
                    "provider": "PROVIDER_UNSPECIFIED"
                },
                "node_pools": [
                    {
                        "name": "default",
                        "autoscaling": {
                            "enabled": True,
                            "max_node_count": 5,
                            "min_node_count": 1
                        },
                        "config": {
                            "image_type": "UBUNTU",
                            "machine_type": "n1-standard-4",
                            "oauth_scopes": [
                                # Allows pulling images from GCR
                                "https://www.googleapis.com/auth/devstorage.read_only",

                                # Needed for monitoring
                                "https://www.googleapis.com/auth/logging.write",
                                "https://www.googleapis.com/auth/monitoring"
                            ]
                        },
                        "initial_node_count": 1
                    }
                ]
            },
Hello. Did you check whether this behavior is the same when the cluster is created via the GCP Dashboard? Are you running anything on your cluster that could invoke the cluster autoscaler? Please take a look at the official documentation, which shows cluster autoscaler events: Cloud.google.com: Kubernetes Engine: Cluster autoscaler visibility – Dawid Kruk
Hi, when creating the cluster via the GCP Dashboard (using a similar definition), the behavior is the same. Once the cluster is created (so it's only running what it needs to), I do nothing more than create a namespace and boom, a new node is added to the pool. Looking at the events, it's not very clear to me what's happening; I've added the logs to my original question. – Alain B.

1 Answer


TL;DR

Your cluster does not scale up because you create a namespace.

Here is the reason:

Limitations and requirements

Your cluster must have at least 2 nodes of type n1-standard-1 or higher. The recommended minimum size cluster to run network policy enforcement is 3 n1-standard-1 instances.

Cloud.google.com: Kubernetes Engine: Network Policy: Limitations and requirements

The fact that you created your GKE cluster with an initial node count of 1 caused calico-typha-XXX to send a request to scale up the cluster to the minimum of 2 nodes.


Assume the following:

  • GKE cluster with release channel of Regular
  • Autoscaling enabled with:
    • initial node count: 1 node
    • minimum: 1 node
    • maximum: 3 nodes
  • Nodes with machine type: n1-standard-1 or higher
  • Network Policy enabled.

When you create a cluster with the above settings, you will get a cluster with 1 node. This changes as soon as calico-typha-XXX-XXX detects that the number of nodes is less than 2 and sends a request to scale up.
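Given the requirement above, a straightforward way to prevent the immediate scale-up is to start the node pool at 2 nodes. Below is a sketch of the node pool from the question with only the node counts adjusted; this is one possible fix, not the only one, and the oauth_scopes are left out for brevity:

```python
# Sketch: the node pool from the question, started at the 2-node minimum
# that network policy enforcement requires, so calico-typha never has to
# trigger a scale-up right after cluster creation.
node_pool = {
    "name": "default",
    "autoscaling": {
        "enabled": True,
        "max_node_count": 5,
        "min_node_count": 2,  # keep the autoscaler from going below 2 nodes
    },
    "config": {
        "image_type": "UBUNTU",
        "machine_type": "n1-standard-4",
    },
    "initial_node_count": 2,  # start with 2 nodes instead of 1
}
```

With initial_node_count set to 2, the cluster already satisfies the network-policy minimum at creation time.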

You can get more detailed logs about this by issuing the following commands:

  • $ kubectl get pods -A
  • $ kubectl describe pod -n kube-system calico-typha-XXX-XXX

You should see a part of the output similar to this:

  Normal   TriggeredScaleUp  18m   cluster-autoscaler                              pod triggered scale-up: [{https://content.googleapis.com/compute/v1/projects/REDACTED/zones/europe-west3-c/instanceGroups/gke-ubuntu-grp 1->2 (max: 3)}]

You can also look in Kubernetes events log:

  • $ kubectl get events -A

Keep in mind that the -A parameter includes events from all namespaces, which surfaces more valuable information like:

kube-system   3m6s        Normal    TriggeredScaleUp          pod/calico-typha-6b8d44c954-7s9zx                                pod triggered scale-up: [{https://content.googleapis.com/compute/v1/projects/REDACTED/zones/europe-west3-c/instanceGroups/gke-ubuntu-grp 1->2 (max: 3)}]
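If you prefer to inspect such events programmatically rather than scanning the kubectl output by eye, the filtering step can be sketched in Python. The event dicts below are hand-made samples shaped like the output above, and the helper name scale_up_events is my own, not part of any Kubernetes API:

```python
# Sketch: pick out cluster-autoscaler scale-up triggers from a list of
# Kubernetes events, represented here as plain dicts for illustration.

def scale_up_events(events):
    """Return only the events whose reason indicates an autoscaler scale-up."""
    return [e for e in events if e.get("reason") == "TriggeredScaleUp"]

# Sample data shaped like the `kubectl get events -A` output above:
sample = [
    {"namespace": "kube-system",
     "reason": "TriggeredScaleUp",
     "object": "pod/calico-typha-6b8d44c954-7s9zx",
     "message": "pod triggered scale-up: 1->2 (max: 3)"},
    {"namespace": "default",
     "reason": "Scheduled",
     "object": "pod/nginx",
     "message": "Successfully assigned default/nginx"},
]

for event in scale_up_events(sample):
    print(event["object"], "-", event["message"])
```

Against a real cluster you would feed this the events returned by the API server (for example via the official Kubernetes Python client) instead of the sample list.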

Please take a look at the additional documentation: