5
votes

What's the easiest way to authenticate into Google BigQuery when on a Google Compute Engine instance?

2 Answers

3
votes

First of all, make sure that your instance has the scope to access BigQuery: scopes can only be set at instance creation time.
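
To double-check from inside a running instance, you can ask the metadata server which scopes the default service account actually has. A quick Python sketch (the question is tagged Python, after all); the endpoint is the standard v1 metadata path:

import urllib2

# List the OAuth scopes granted to the instance's default service account.
request = urllib2.Request(
    'http://metadata/computeMetadata/v1/instance/service-accounts/default/scopes',
    headers={'Metadata-Flavor': 'Google'})
print urllib2.urlopen(request).read()  # should include .../auth/bigquery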

In a bash script, get an OAuth token by calling:

ACCESSTOKEN=$(curl -s "http://metadata/computeMetadata/v1/instance/service-accounts/default/token" -H "Metadata-Flavor: Google" | jq -r ".access_token")
echo "retrieved access token $ACCESSTOKEN"

Now let's say you want a list of the datasets in a project:

CURL_URL="https://www.googleapis.com/bigquery/v2/projects/YOURPROJECTID/datasets"
CURL_OPTIONS="-s --header 'Content-Type: application/json' --header 'Authorization: OAuth $ACCESSTOKEN' --header 'x-goog-project-id:YOURPROJECTID' --header 'x-goog-api-version:1'"
CURL_COMMAND="curl --request GET $CURL_URL $CURL_OPTIONS"    
CURL_RESPONSE=`eval $CURL_COMMAND`

The JSON response is now in the variable CURL_RESPONSE.

PS: I realize now that this question is tagged as Python, but the same principles apply.
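
For completeness, here is a rough Python equivalent of the two steps above, using only the standard library and the same raw REST calls (the get_access_token and list_datasets helper names are just for illustration):

import json
import urllib2

def get_access_token():
  # Short-lived OAuth token for the instance's default service account.
  request = urllib2.Request(
      'http://metadata/computeMetadata/v1/instance/service-accounts/default/token',
      headers={'Metadata-Flavor': 'Google'})
  return json.loads(urllib2.urlopen(request).read())['access_token']

def list_datasets(project_id):
  # Same REST call as the curl command above.
  request = urllib2.Request(
      'https://www.googleapis.com/bigquery/v2/projects/%s/datasets' % project_id,
      headers={'Authorization': 'OAuth %s' % get_access_token()})
  return json.loads(urllib2.urlopen(request).read())

print list_datasets('YOURPROJECTID')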

3
votes

In Python:

AppAssertionCredentials is a Python class that allows a Compute Engine instance to identify itself to Google and other OAuth 2.0 servers without requiring an OAuth flow.

https://developers.google.com/api-client-library/python/

The project id can be read from the metadata server, so it doesn't need to be set as a variable.

https://cloud.google.com/compute/docs/metadata

The following code gets credentials using AppAssertionCredentials, reads the project id from the metadata server, and instantiates a BigqueryClient with this data:

import bigquery_client
import urllib2
from oauth2client import gce

def GetMetadata(path):
  # urllib2.urlopen doesn't accept headers; build a Request first.
  request = urllib2.Request(
      'http://metadata/computeMetadata/v1/%s' % path,
      headers={'Metadata-Flavor': 'Google'})
  return urllib2.urlopen(request).read()

credentials = gce.AppAssertionCredentials(
    scope='https://www.googleapis.com/auth/bigquery')

client = bigquery_client.BigqueryClient(
    credentials=credentials,
    api='https://www.googleapis.com',
    api_version='v2',
    project_id=GetMetadata('project/project-id'))
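
If you'd rather not depend on the bq tool's bigquery_client module, the same credentials also work with the generic Google API client. A minimal sketch, assuming google-api-python-client and httplib2 are installed:

import httplib2
from apiclient import discovery
from oauth2client import gce

# Same instance credentials as above; no OAuth flow needed.
credentials = gce.AppAssertionCredentials(
    scope='https://www.googleapis.com/auth/bigquery')
http = credentials.authorize(httplib2.Http())
bigquery = discovery.build('bigquery', 'v2', http=http)

# Equivalent to listing datasets via curl in the other answer.
print bigquery.datasets().list(projectId='YOURPROJECTID').execute()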

For this to work, you need to give the GCE instance access to the BigQuery API when creating it:

gcloud compute instances create <your_instance_name> --scopes storage-ro,bigquery
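
(storage-ro and bigquery are gcloud scope aliases for https://www.googleapis.com/auth/devstorage.read_only and https://www.googleapis.com/auth/bigquery; the full scope URIs work as well.)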