2
votes

I've created a Google Cloud Compute Engine instance template in which I specify a Docker container image to use (hosted privately on Google Container Registry). My app needs the Google Cloud SQL Proxy running, so I followed these steps and added a startup script to my Compute Engine instance template like this:

#! /bin/bash
wget https://dl.google.com/cloudsql/cloud_sql_proxy.linux.amd64 -O /var/lib/google/cloud_sql_proxy
chmod 777 /var/lib/google/cloud_sql_proxy
sudo /var/lib/google/cloud_sql_proxy -instances={instance name} &

The issue I have is that when I create a Compute Engine VM instance based on this template:

gcloud compute instances create {instance name} --source-instance-template {template name}

The instance is created and starts, and I can see that the cloud_sql_proxy script is running, BUT the Docker image is not pulled and the container does not start...

I've tried creating a Compute Engine VM instance without specifying the startup script, and it works correctly: the Docker image is pulled and the container starts running. With that instance running, I connected via SSH and started the cloud_sql_proxy script manually, and everything works (the app connects successfully to the SQL instance on Google Cloud SQL). But I want to have this automated...

What am I missing? Has anybody had this issue?

2
Did you end up solving this? - Dominic Bou-Samra
A few questions: 1. If you are going to create an instance template, why run wget each time? 2. For your application, why not create a starter script that runs the Cloud SQL proxy and then starts your application? A single application repo with segregated config is easy to debug. 3. You are missing chmod +x. 4. If you execute with sudo, why chmod 777? - sam

2 Answers

3
votes

It could be that the startup script somehow blocks the container launcher. The launcher is konlet-startup.service, which depends on the cloud registry target and the startup scripts service in its systemd configuration:

[Unit]
Description=Containers on GCE Setup
Wants=network-online.target google-startup-scripts.service gcr-online.target docker.socket
After=network-online.target google-startup-scripts.service gcr-online.target docker.socket

To debug the issue, you can force gcr-online.target to start first by inserting systemctl start gcr-online.target at the beginning of your startup script.
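To see which unit is stalling, the journal for the two services named above can be inspected over SSH. This is only a sketch, assuming the instance runs systemd with journalctl available; the function name show_startup_logs is hypothetical and not part of any Google tooling:

```shell
# Hypothetical helper: print recent journal entries for the startup-script
# service and the container launcher, then the state of the registry target.
show_startup_logs() {
  sudo journalctl --no-pager -n 50 -u google-startup-scripts.service
  sudo journalctl --no-pager -n 50 -u konlet-startup.service
  systemctl status --no-pager gcr-online.target
}
```

If the startup-script entries never show the script finishing, that would be consistent with the launcher still waiting on it.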

Also, make sure that the startup script exits rather than running indefinitely, since konlet-startup.service waits for google-startup-scripts.service to finish with a successful exit code (exit 0).
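One way to let the script finish while a long-running process keeps going is to launch that process detached, with its standard streams redirected so it does not hold the script's file descriptors open. A minimal sketch of that pattern follows; the helper name start_detached is hypothetical, and whether this alone resolves the problem in the question is an assumption:

```shell
# Hypothetical helper: run a command in the background, immune to hangups,
# with stdin/stdout/stderr detached from the calling script.
start_detached() {
  nohup "$@" </dev/null >/dev/null 2>&1 &
  echo "$!"   # PID of the background process
}
```

In a startup script, this could be called as start_detached /var/lib/google/cloud_sql_proxy -instances={instance name}, followed by an explicit exit 0 so the script reports success as soon as the proxy has been launched.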

0
votes

For anyone in devops hell (running endless terraform applies as you debug an issue), here is my startup script for a Container-Optimized OS instance that starts cloud_sql_proxy:

sudo wget https://dl.google.com/cloudsql/cloud_sql_proxy.linux.amd64 -O /var/lib/docker/cloud_sql_proxy
sudo chmod +x /var/lib/docker/cloud_sql_proxy
sudo nohup /var/lib/docker/cloud_sql_proxy -instances=${var.db_instance_uri} </dev/null &>/dev/null &

This starts the proxy, then lets the startup script exit 0 so that konlet can start your container.