I followed this guide to set up Cluster Autoscaler on AWS: https://github.com/kubernetes/autoscaler/tree/master/cluster-autoscaler/cloudprovider/aws

---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: cluster-autoscaler
  namespace: kube-system
  labels:
    app: cluster-autoscaler
spec:
  replicas: 1
  selector:
    matchLabels:
      app: cluster-autoscaler
  template:
    metadata:
      labels:
        app: cluster-autoscaler
    spec:
      containers:
        - image: gcr.io/google_containers/cluster-autoscaler:v0.6.0
          name: cluster-autoscaler
          resources:
            limits:
              cpu: 100m
              memory: 300Mi
            requests:
              cpu: 100m
              memory: 300Mi
          command:
            - ./cluster-autoscaler
            - --v=4
            - --stderrthreshold=info
            - --cloud-provider=aws
            - --skip-nodes-with-local-storage=false
            - --nodes=2:4:k8s-worker-asg-1
          env:
            - name: AWS_REGION
              value: us-east-1
          volumeMounts:
            - name: ssl-certs
              mountPath: /etc/ssl/certs/ca-certificates.crt
              readOnly: true
          imagePullPolicy: "Always"
      volumes:
        - name: ssl-certs
          hostPath:
            path: "/etc/ssl/certs/ca-certificates.crt"

I changed k8s-worker-asg-1 to the name of my current ASG, which was created by kops. But when I run kubectl apply -f deployment.yaml and check the pods with kubectl get pods -n=kube-system, I get:

NAME                                                                      READY     STATUS             RESTARTS   AGE
cluster-autoscaler-75ccf5b9c9-lhts8                                       0/1       CrashLoopBackOff   6          8m

I tried to view its logs with kubectl logs cluster-autoscaler-75ccf5b9c9-lhts8 -n=kube-system, which returns:

failed to open log file "/var/log/pods/8edc3073-dc0b-11e7-a6e5-06361ac15b44/cluster-autoscaler_4.log": open /var/log/pods/8edc3073-dc0b-11e7-a6e5-06361ac15b44/cluster-autoscaler_4.log: no such file or directory

I also tried to describe the pod with kubectl describe cluster-autoscaler-75ccf5b9c9-lhts8 -n=kube-system, which returns:

the server doesn't have a resource type "cluster-autoscaler-75ccf5b9c9-lhts8"

So how can I debug this issue? What could be the reason? Does it need storage on AWS? I haven't created any storage on AWS yet.


By the way, I have another question. Suppose I use kops to create a k8s cluster on AWS and then change maxSize and minSize for the nodes instance group:

$ kops edit ig nodes
> maxSize: 4
> minSize: 2
$ kops update cluster ${CLUSTER_FULL_NAME} --yes

At this point the Auto Scaling Group on AWS has already become Min: 2, Max: 4.

Is it still necessary to run the deployment from this guide? https://github.com/kubernetes/autoscaler/tree/master/cluster-autoscaler/cloudprovider/aws

Can't kops change both the ASG and the k8s cluster? Why is another step needed to deploy cluster-autoscaler into the kube-system namespace?

In the describe command, you are missing a "po". kubectl describe cluster-autoscaler-75ccf5b9c9-lhts8 -n=kube-system becomes kubectl describe po cluster-autoscaler-75ccf5b9c9-lhts8 -n=kube-system - whites11
The problem was that the name of the cert was different from the system's! - online
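
The cert mismatch mentioned in the comment above can be checked directly on a worker node. The following is only a minimal sketch: the helper name find_ca_bundle and the list of candidate paths are assumptions for illustration (common locations on Debian-style and RHEL-style images), not something taken from the guide. It prints the first CA bundle file that actually exists, which is the path the ssl-certs hostPath and mountPath in the Deployment would need to match.

```shell
#!/bin/sh
# Hypothetical helper: print the first argument that exists as a regular file.
find_ca_bundle() {
  for candidate in "$@"; do
    if [ -f "$candidate" ]; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 1
}

# Run this on the worker node, for example over SSH.
# The candidate paths are assumptions; extend the list for your distribution.
find_ca_bundle \
  /etc/ssl/certs/ca-certificates.crt \
  /etc/ssl/certs/ca-bundle.crt \
  /etc/pki/tls/certs/ca-bundle.crt \
  || echo "no CA bundle found at the candidate paths" >&2
```

If the printed path differs from /etc/ssl/certs/ca-certificates.crt, updating both the hostPath and the mountPath in the manifest to that path is the likely fix.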

1 Answer

I have tried the official solution from the Kubernetes repositories. You also need to add additional IAM policies so that the nodes can access the AWS Auto Scaling resources. Then modify the script in https://github.com/kubernetes/kops/tree/master/addons/cluster-autoscaler to install Cluster Autoscaler on your K8s cluster. Note that you will likely want to change AWS_REGION and GROUP_NAME, and probably MIN_NODES and MAX_NODES. It worked for me.

spec:
  api:
    loadBalancer:
      type: Public
  authorization:
    rbac: {}
  additionalPolicies:
    node: |
      [
        {
          "Effect": "Allow",
          "Action": [
            "autoscaling:DescribeAutoScalingGroups",
            "autoscaling:DescribeAutoScalingInstances",
            "autoscaling:SetDesiredCapacity",
            "autoscaling:TerminateInstanceInAutoScalingGroup"
          ],
          "Resource": ["*"]
        }
      ]
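
One way to get the additionalPolicies block above into a kops-managed cluster is sketched below. This is a hedged outline, not the exact steps used for this answer: CLUSTER_FULL_NAME is the variable from the question, the fallback value example.k8s.local is a placeholder, and the run wrapper with its DRY_RUN switch is a hypothetical convenience that only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# CLUSTER_FULL_NAME comes from the question; the fallback is a placeholder.
CLUSTER_FULL_NAME="${CLUSTER_FULL_NAME:-example.k8s.local}"

# Hypothetical wrapper: print each command by default, run it when DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# 1. Open the cluster spec in an editor and paste the additionalPolicies
#    block under spec.
run kops edit cluster "$CLUSTER_FULL_NAME"

# 2. Apply the change so the node IAM role receives the new policy.
run kops update cluster "$CLUSTER_FULL_NAME" --yes
```

After the update, the cluster-autoscaler pod should be able to call the Auto Scaling API with the node role; restarting the pod may be needed if it was already crash-looping.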