What's the easiest way to authenticate into Google BigQuery when on a Google Compute Engine instance?
2 Answers
First of all, make sure that your instance has the scope to access BigQuery; you can only set this when the instance is created.
In a bash script, get an OAuth token from the metadata server:
ACCESSTOKEN=$(curl -s "http://metadata/computeMetadata/v1/instance/service-accounts/default/token" -H "Metadata-Flavor: Google" | jq -r ".access_token")
echo "retrieved access token $ACCESSTOKEN"
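The token endpoint returns a small JSON document containing the access token and its lifetime. A minimal sketch of parsing it in Python, using a hypothetical sample body (on a real instance the body comes from the metadata URL above):

```python
import json

# Illustrative response body from the metadata server's token endpoint.
# The token value and expiry here are made up for the example.
sample_body = '{"access_token": "ya29.EXAMPLE", "expires_in": 3600, "token_type": "Bearer"}'

token_info = json.loads(sample_body)
access_token = token_info['access_token']   # goes in the Authorization header
expires_in = token_info['expires_in']       # seconds until the token expires

print(access_token)
print(expires_in)
```

The `expires_in` field is worth keeping around: tokens from the metadata server are short-lived, so long-running jobs should re-fetch rather than cache one indefinitely.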
Now let's say you want a list of the datasets in a project:
CURL_URL="https://www.googleapis.com/bigquery/v2/projects/YOURPROJECTID/datasets"
CURL_RESPONSE=$(curl -s --request GET "$CURL_URL" --header "Content-Type: application/json" --header "Authorization: Bearer $ACCESSTOKEN")
The JSON response is now in the variable CURL_RESPONSE.
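The datasets.list response is JSON with a `datasets` array, each entry carrying a `datasetReference`. A short Python sketch of pulling the dataset IDs out of such a response; the sample below is abbreviated and the dataset names are hypothetical:

```python
import json

# Abbreviated, illustrative datasets.list response; in the bash flow
# above, the real body lives in $CURL_RESPONSE.
sample_response = '''
{
  "kind": "bigquery#datasetList",
  "datasets": [
    {"id": "YOURPROJECTID:dataset_one",
     "datasetReference": {"datasetId": "dataset_one", "projectId": "YOURPROJECTID"}},
    {"id": "YOURPROJECTID:dataset_two",
     "datasetReference": {"datasetId": "dataset_two", "projectId": "YOURPROJECTID"}}
  ]
}
'''

parsed = json.loads(sample_response)
# .get() with a default handles projects that have no datasets yet,
# where the "datasets" key is simply absent.
dataset_ids = [d['datasetReference']['datasetId']
               for d in parsed.get('datasets', [])]
print(dataset_ids)
```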
PS: I realize now that this question is tagged as Python, but the same principles apply.
In Python:
AppAssertionCredentials is a Python class that allows a Compute Engine instance to identify itself to Google and other OAuth 2.0 servers without requiring a flow.
https://developers.google.com/api-client-library/python/
The project id can be read from the metadata server, so it doesn't need to be set as a variable.
https://cloud.google.com/compute/docs/metadata
The following code gets a token using AppAssertionCredentials, the project id from the metadata server, and instantiates a BigqueryClient with this data:
import bigquery_client
import urllib2
from oauth2client import gce

def GetMetadata(path):
    # The metadata server requires the Metadata-Flavor header; build a
    # Request object, since urllib2.urlopen has no headers argument.
    request = urllib2.Request(
        'http://metadata/computeMetadata/v1/%s' % path,
        headers={'Metadata-Flavor': 'Google'})
    return urllib2.urlopen(request).read()

credentials = gce.AppAssertionCredentials(
    scope='https://www.googleapis.com/auth/bigquery')

client = bigquery_client.BigqueryClient(
    credentials=credentials,
    api='https://www.googleapis.com',
    api_version='v2',
    project_id=GetMetadata('project/project-id'))
For this to work, you need to give the GCE instance access to the BigQuery API when creating it:
gcloud compute instances create <your_instance_name> --scopes storage-ro,bigquery
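You can check which scopes a running instance actually has by reading instance/service-accounts/default/scopes from the metadata server, which returns one scope URL per line. A sketch of testing for the BigQuery scope; the scope list below is a hypothetical sample of what that endpoint might return:

```python
# Illustrative output of:
#   curl -H "Metadata-Flavor: Google" \
#     http://metadata/computeMetadata/v1/instance/service-accounts/default/scopes
# (one scope URL per line; the exact list depends on how the
#  instance was created)
sample_scopes = """https://www.googleapis.com/auth/devstorage.read_only
https://www.googleapis.com/auth/bigquery
"""

# split() breaks on whitespace/newlines, giving one scope per entry.
scopes = sample_scopes.split()
has_bigquery = 'https://www.googleapis.com/auth/bigquery' in scopes
print(has_bigquery)
```

If the BigQuery scope is missing, token requests will still succeed but BigQuery API calls will come back with an authorization error, so this is a useful first thing to check when debugging.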