0
votes

I can create a regular GKE cluster, pull the Docker image I need, and get it running. When I create the GKE cluster with a routing rule through a NAT, however, I no longer have permission to pull the Docker image.

I start the cluster with these settings:

resources:
######## Network ############
- name: gke-nat-network
  type: compute.v1.network
  properties:
    autoCreateSubnetworks: false
######### Subnets ##########
######### For Cluster #########
- name: gke-cluster-subnet
  type: compute.v1.subnetwork
  properties:
    network: $(ref.gke-nat-network.selfLink)
    ipCidrRange: 172.16.0.0/12
    region: us-east1
########## NAT Subnet ##########
- name: nat-subnet
  type: compute.v1.subnetwork
  properties:
    network: $(ref.gke-nat-network.selfLink)
    ipCidrRange: 10.1.1.0/24
    region: us-east1
########## NAT VM ##########
- name: nat-vm
  type: compute.v1.instance
  properties:
    zone: us-east1-b
    canIpForward: true
    tags:
      items:
      - nat-to-internet
    machineType: https://www.googleapis.com/compute/v1/projects/{{ env["project"] }}/zones/us-east1-b/machineTypes/f1-micro
    disks:
    - deviceName: boot
      type: PERSISTENT
      boot: true
      autoDelete: true
      initializeParams:
        sourceImage: https://www.googleapis.com/compute/v1/projects/debian-cloud/global/images/debian-7-wheezy-v20150423
    networkInterfaces:
    - network: projects/{{ env["project"] }}/global/networks/gke-nat-network
      subnetwork: $(ref.nat-subnet.selfLink)
      accessConfigs:
      - name: External NAT
        type: ONE_TO_ONE_NAT
    metadata:
      items:
      - key: startup-script
        value: |
          #!/bin/sh
          # ---------------------------
          # Install tcpdump
          # Start NAT; start capture
          # ---------------------------
          apt-get update
          apt-get install -y tcpdump
          apt-get install -y tcpick
          iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE
          nohup tcpdump -e -l -i eth0 -w /tmp/nat.pcap &
          nohup tcpdump -e -l -i eth0 > /tmp/nat.txt &
          echo 1 | tee /proc/sys/net/ipv4/ip_forward
########## FIREWALL RULES FOR NAT VM ##########
- name: nat-vm-firewall
  type: compute.v1.firewall
  properties:
    allowed:
    - IPProtocol: tcp
      ports: []
    sourceTags:
    - route-through-nat
    network: $(ref.gke-nat-network.selfLink)
- name: nat-vm-ssh
  type: compute.v1.firewall
  properties:
    allowed:
    - IPProtocol: tcp
      ports: [22]
    sourceRanges:
    - 0.0.0.0/0
    network: $(ref.gke-nat-network.selfLink)
########## GKE CLUSTER CREATION ##########
- name: nat-gke-cluster
  type: container.v1.cluster
  metadata:
    dependsOn:
    - gke-nat-network
    - gke-cluster-subnet
  properties:
    cluster:
      name: nat-gke-cluster
      initialNodeCount: 1
      network: gke-nat-network
      subnetwork: gke-cluster-subnet
      nodeConfig:
        machineType: n1-standard-4
        tags:
        - route-through-nat
    zone: us-east1-b
########## GKE MASTER ROUTE ##########
- name: master-route
  type: compute.v1.route
  properties:
    destRange: $(ref.nat-gke-cluster.endpoint)
    network: $(ref.gke-nat-network.selfLink)
    nextHopGateway: projects/{{ env["project"] }}/global/gateways/default-internet-gateway
    priority: 100
    tags:
    - route-through-nat
########## NAT ROUTE ##########
- name: gke-cluster-route-through-nat
  metadata:
    dependsOn:
    - nat-gke-cluster
    - gke-nat-network
  type: compute.v1.route
  properties:
    network: $(ref.gke-nat-network.selfLink)
    destRange: 0.0.0.0/0
    description: "route all other traffic through nat"
    nextHopInstance: $(ref.nat-vm.selfLink)
    tags:
    - route-through-nat
    priority: 800

When I try to pull and start a Docker image, the pod fails with an ImagePullBackOff error in Google Kubernetes Engine.

When I do kubectl describe pod I get:

Failed to pull image : rpc error: code = Unknown desc = unauthorized: authentication required
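One way to confirm the node is missing a storage scope (a diagnostic sketch, run from an SSH session on one of the cluster nodes; the URL below is the standard GCE metadata server path) is to ask the metadata server which OAuth scopes the node's service account was granted:

```shell
# List the OAuth scopes granted to this instance's service account.
curl -s -H "Metadata-Flavor: Google" \
  "http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/scopes"
```

If https://www.googleapis.com/auth/devstorage.read_only (or a broader storage scope) is absent from the list, the node cannot authenticate to Container Registry and image pulls fail exactly like this.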

Edit:

I have found out that the default access scopes for new clusters changed as of GKE 1.10: https://cloud.google.com/kubernetes-engine/docs/how-to/access-scopes

Basically, certain access scopes are no longer granted to these clusters by default, including the one needed to pull images from Google Cloud Storage.

I am still having trouble figuring out how to assign these scopes while using

gcloud deployment-manager deployments create gke-with-nat --config gke-with-nat-route.yml

What is the content of the regcred secret that you created? (Please remove personal information before posting.) To pull an image from a private registry, Kubernetes needs credentials. The imagePullSecrets field in the configuration file specifies that Kubernetes should get the credentials from a Secret named regcred. – Milad Tabrizi
I figured out the issue. As of GKE 1.10, new clusters no longer get the storage-ro scope by default: cloud.google.com/kubernetes-engine/docs/how-to/access-scopes . Now I need to figure out how to specify scopes using gcloud deployment-manager deployments create. – Apothan

1 Answer

1
votes

The container images were not pulling because GKE changed how new clusters handle permissions. Clusters used to be granted the 'storage-ro' scope by default, which allowed them to pull container images from the Container Registry; new clusters no longer get it. See https://cloud.google.com/kubernetes-engine/docs/how-to/access-scopes .

I had to add scopes to the YAML cluster deployment, since I create my deployment using

gcloud deployment-manager deployments create gke-with-nat --config gke-with-nat-route.yml

The new YAML includes these settings under the cluster definition:

nodeConfig:
  serviceAccount: [email protected]
  oauthScopes:
  - https://www.googleapis.com/auth/devstorage.read_only
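After redeploying, you can verify the scopes actually landed on the node pool (a sketch; the cluster name and zone are the ones from the deployment above):

```shell
# Print the OAuth scopes configured on the cluster's default node pool.
gcloud container clusters describe nat-gke-cluster --zone us-east1-b \
  --format="value(nodeConfig.oauthScopes)"
```

Note that scopes are baked into the nodes at creation time, so existing nodes have to be recreated (or the deployment torn down and recreated) for a scope change to take effect.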

If you are creating the cluster with gcloud directly, I think you can use

gcloud container clusters create example-cluster --scopes scope1,scope2
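For example (a sketch; gke-default and storage-ro are documented scope aliases, and example-cluster is a placeholder name):

```shell
# gke-default covers the standard GKE scopes; storage-ro adds
# read-only access to Cloud Storage, which backs Container Registry.
gcloud container clusters create example-cluster \
  --scopes gke-default,storage-ro
```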

If you are using the web UI, I think you can opt back into the legacy access scopes with a checkbox. I am not sure how long that will be supported.