5
votes

So, I am trying to automate our GAE Datastore backups using cron.yaml. Furthermore, I would like to use Google Cloud Storage as the destination for our backups. I have created a bucket and set up the ACL. Manual backups work from the Datastore Admin console. I can even get the cron to work. But, we push the same codebase to 3 different environments: dev, staging, production. So, I would like to separate the backups in different buckets based on the application name.

I would like staging datastore to go to myapp_staging_bk bucket, dev in myapp_dev_bk bucket, and live to myapp_live_bk.

cron.yaml:

    cron:
- description: My Daily Backup
  url: /_ah/datastore_admin/backup.create?name=BackupToCloud&kind=LogTitle&kind=EventLog&filesystem=gs&gs_bucket_name=whitsend
  schedule: every 12 hours
  target: ah-builtin-python-bundle

All of this would be super easy if I could figure out a way to pull the application name in the above url. Something like this:

url: /_ah/datastore_admin/backup.create?name=BackupToCloud&kind=LogTitle&kind=EventLog&filesystem=gs&{myapp}_bk=whitsend
  schedule: every 12 hours

where {myapp} would be the name of the app that's in app.yaml.

https://developers.google.com/appengine/articles/scheduled_backups doesn't say anything about this type of a setup.

I know I could pull this off with our CI server, but I would like to avoid this.

Does anyone have any suggestions?

2

2 Answers

3
votes

Modify the cron handler to call your own code, then either call the code to start the backup from your own code, or URLFetch it from your own code, after filling in the bucket name parameter based on your App ID.

0
votes

To precise what Nick said, you can use the taskqueue API in your code. That's what cron jobs do under the hood.

e.g. if you use python :

task = taskqueue.add(
    url='/_ah/datastore_admin/backup.create',
    target='ah-builtin-python-bundle',
    params={
        'name': 'my_backup',
        'kind': ['kind1','kind2','kind3'],
        'filesystem':'gs',
        'gs_bucket_name':'[MY_GCS_BUCKET]',
    })

response.write(
    'Task {} enqueued, ETA {}.'.format(task.name, task.eta))