2 votes

I assume there are no stupid questions, so here is one that I could not find a direct answer to.

The situation

I currently have a Kubernetes cluster running 1.15.x on AKS, deployed and managed through Terraform. Azure recently announced that they will retire Kubernetes 1.15 on AKS, so I need to upgrade the cluster to 1.16 or later. As I understand the situation, upgrading the cluster directly in Azure would have no consequences for the content of the cluster, i.e. nodes, pods, secrets and everything else currently on there, but I cannot find any proper answer to what would happen if I upgrade the cluster through Terraform.
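For reference, the relevant part of the configuration looks roughly like the sketch below. The resource address matches the plan output further down, but everything else here is assumed and trimmed for illustration:

resource "azurerm_kubernetes_cluster" "default" {
  # ... name, resource group, DNS prefix, node pool and other arguments unchanged ...

  kubernetes_version = "1.16.13"   # was "1.15.5" -- the only argument being changed
}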

Potential problems

So what could go wrong? In my mind, the worst outcome would be that the entire cluster is destroyed and a new one is created: no pods, no secrets, nothing. Since there is so little information out there, I am asking here to see if there is anyone with more experience with Terraform and Kubernetes who could potentially help me out.

To summarize:

Terraform versions

Terraform v0.12.17
+ provider.azuread v0.7.0
+ provider.azurerm v1.37.0
+ provider.random v2.2.1

What I'm doing

$ terraform init

// running terraform plan with the new Kubernetes version declared for AKS

$ terraform plan

// The following changes are announced by Terraform:



An execution plan has been generated and is shown below.
Resource actions are indicated with the following symbols:
  ~ update in-place

Terraform will perform the following actions:

  # module.mycluster.azurerm_kubernetes_cluster.default will be updated in-place...

         ...
         ~ kubernetes_version              = "1.15.5" -> "1.16.13"
         ...


Plan: 0 to add, 1 to change, 0 to destroy.

What I want to happen

Terraform tells Azure to upgrade the existing AKS service in place, rather than destroying it and creating a new one. I assume this is what will happen, as Terraform announces that it will "update in-place" instead of adding new and/or destroying existing clusters.
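One extra safety net against the worst case described above would be a prevent_destroy lifecycle rule on the cluster resource (a sketch, not part of my current configuration), so that any plan that tries to destroy the cluster fails outright:

resource "azurerm_kubernetes_cluster" "default" {
  # ... existing arguments unchanged ...

  lifecycle {
    prevent_destroy = true   # any plan that would destroy this resource now errors out instead of applying
  }
}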


2 Answers

3 votes

I'd say this shows that the Terraform method is non-destructive, even if there have at times been oversights in the upgrade process (but still non-destructive in this example): https://github.com/terraform-providers/terraform-provider-azurerm/issues/5541

If you need higher confidence for this change, you could alternatively use the Azure-based upgrade method, refresh the changes back into your state, and tweak the code until a plan generation doesn't show anything intolerable. The two azurerm_kubernetes_cluster arguments dealing with version might be all you need to tweak.
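For illustration, after an Azure-side upgrade those two version-related arguments would be brought back in line with what Azure reports, roughly like this (a sketch; the default_node_pool and orchestrator_version names assume a provider version that exposes them, as mentioned in the other answer):

resource "azurerm_kubernetes_cluster" "default" {
  kubernetes_version = "1.16.13"        # control-plane version, matched to what Azure now reports

  default_node_pool {
    orchestrator_version = "1.16.13"    # node pool version, also matched to Azure
    # ... name, vm_size, node_count and other arguments unchanged ...
  }

  # ... other arguments unchanged ...
}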

0 votes

I found this question today and thought I'd add my experience as well. I made the following changes:

  1. Changed the kubernetes_version under azurerm_kubernetes_cluster from "1.16.15" -> "1.17.16"
  2. Changed the orchestrator_version under default_node_pool from "1.16.15" -> "1.17.16"
  3. Increased the node_count under default_node_pool from 1 -> 2
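In configuration terms, those three changes amount to roughly the following (a sketch; the resource name and surrounding arguments are assumed):

resource "azurerm_kubernetes_cluster" "example" {
  kubernetes_version = "1.17.16"          # 1. was "1.16.15"

  default_node_pool {
    orchestrator_version = "1.17.16"      # 2. was "1.16.15"
    node_count           = 2              # 3. was 1
    # ... name, vm_size and other arguments unchanged ...
  }

  # ... other arguments unchanged ...
}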

A terraform plan showed that it was going to update in-place, and the subsequent terraform apply completed successfully. However, kubectl get nodes showed that an additional node was created but both nodes in the pool were still on the old version. After further inspection in the Azure Portal it turned out that only the k8s cluster version had been upgraded, not the version of the node pool.

I then ran terraform plan again, and it showed that the orchestrator_version under default_node_pool was still going to be updated in-place. Running terraform apply again then proceeded to upgrade the version of the node pool. It did that whole thing where it creates an additional node in the pool (with the new version) and sets its status to NodeSchedulable, while setting the existing node in the pool to NodeNotSchedulable. The NodeNotSchedulable node is then replaced by a new node with the new k8s version and eventually set to NodeSchedulable. It did this for both nodes.

Afterwards all nodes were upgraded without any noticeable downtime.