26
votes

I'm trying to build a project that uploads a JSON file from Google Cloud Storage to BigQuery (automating something that is currently done manually).

I'd like to use a service account for this, since my script is going to run on a daily basis.

After reading everything I could find about using service accounts, I'm still struggling to authenticate.

I wonder if someone could check and point out what I missed?

Here is what I've done so far:

  1. Created a JSON key file for the service account
  2. Installed the client library: pip install --upgrade google-cloud-bigquery
  3. Installed the Google Cloud SDK according to: https://cloud.google.com/sdk/docs/
  4. Ran export GOOGLE_APPLICATION_CREDENTIALS=<path_to_service_account_file> with the key path specified correctly

Now I'm trying to run the following Python script:

from google.cloud import bigquery
bigquery_client = bigquery.Client()

I get this error:

google.auth.exceptions.DefaultCredentialsError: Could not automatically determine credentials. Please set GOOGLE_APPLICATION_CREDENTIALS or explicitly create credential and re-run the application. For more information, please see https://developers.google.com/accounts/docs/application-default-credentials.

I'm quite new to both Python and the Google Cloud API, so I have possibly missed something.

Could someone point out what is wrong in my steps above, or point me to clear beginner instructions for setting up and running a simple script with BigQuery using a service account?

4 Answers

43
votes

I usually set this variable in the Python script itself, something like:

import os
from google.cloud.bigquery.client import Client

# Must be set before the client is created, because the client reads it on construction.
os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = 'path_to_json_file'
bq_client = Client()
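
An alternative that avoids changing the environment is to hand the key path to the client directly. This is only a minimal sketch, assuming google-cloud-bigquery is installed: make_bigquery_client is a hypothetical wrapper and 'path_to_json_file' is a placeholder, while from_service_account_json is the factory method the client class provides.

```python
import os


def make_bigquery_client(key_path):
    """Build a BigQuery client from an explicit service account key file.

    Hypothetical helper for illustration; key_path is the JSON key file path.
    """
    if not os.path.isfile(key_path):
        # Fail early with a clear message instead of a credentials error later.
        raise FileNotFoundError(f"Service account key not found: {key_path}")
    # Imported lazily so the path check above runs even if the library is missing.
    from google.cloud import bigquery
    return bigquery.Client.from_service_account_json(key_path)
```

Usage would look like bq_client = make_bigquery_client('path_to_json_file'); no environment variable is involved, so it does not matter which shell session started the script.
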
19
votes

If you have implemented more fine-grained control over service account permissions and you have an app that needs to use several accounts (say, one for Pub/Sub and one for Storage), then you would have to set the GOOGLE_APPLICATION_CREDENTIALS environment variable again before creating each client.

Instead, you can load your credentials separately and pass them to their appropriate clients like so:

import json

from google.cloud import storage
from google.oauth2 import service_account

project_id = 'my-test-project'

with open('/path/to/my/service_keys/storage_service.json') as source:
    info = json.load(source)

storage_credentials = service_account.Credentials.from_service_account_info(info)

storage_client = storage.Client(project=project_id, credentials=storage_credentials)

Just make sure in the IAM console that the account has the permissions needed for the operations you want to perform. Luckily, the error messages in that case are really informative.
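
To know which account to look up in the IAM console, it can help to read the identity out of the key file. A minimal sketch, assuming the usual service account key layout with type, project_id and client_email fields; describe_key is a hypothetical helper and the sample values are made up:

```python
def describe_key(info):
    """Return the (client_email, project_id) pair from a parsed key file.

    `info` is the dict produced by json.load() on the key file.
    """
    if info.get("type") != "service_account":
        raise ValueError("Not a service account key file")
    return info["client_email"], info["project_id"]


# Made-up example values, for illustration only.
sample = {
    "type": "service_account",
    "project_id": "my-test-project",
    "client_email": "storage-sa@my-test-project.iam.gserviceaccount.com",
}
email, project = describe_key(sample)
print(f"Check the roles granted to {email} in project {project}")
```

With the real file, the info dict loaded in the snippet above can be passed straight to describe_key.
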

9
votes

Are you running the script in the same command line session as the one you set your environment variable in using export? If not, you might want to look into setting it for your user or system (see this question for more info).
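
One quick way to check this is to inspect the variable from inside the Python process itself. A minimal sketch using only the standard library; the helper name check_adc_env is hypothetical, and it only verifies that the variable is set and points to a readable JSON file, not that the key itself is valid:

```python
import json
import os


def check_adc_env(environ=None):
    """Return (ok, message) describing GOOGLE_APPLICATION_CREDENTIALS.

    Hypothetical helper for illustration; it does not contact Google.
    """
    environ = os.environ if environ is None else environ
    path = environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path:
        return False, "GOOGLE_APPLICATION_CREDENTIALS is not set in this process"
    if not os.path.isfile(path):
        return False, f"No file found at {path!r}"
    try:
        with open(path) as source:
            json.load(source)
    except (OSError, ValueError) as exc:
        return False, f"Could not read {path!r} as JSON: {exc}"
    return True, f"Variable is set and points to a JSON file: {path!r}"


if __name__ == "__main__":
    ok, message = check_adc_env()
    print(message)
```

If this reports that the variable is not set, the export was most likely done in a different shell session than the one running the script.
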

Another option that might make things even easier, and that takes care of this automatically, is the gcloud CLI tool. The second option under "How the Application Default Credentials work" here explains how to use it to manage the credentials for you (gcloud auth login and gcloud auth application-default login).

4
votes

This is an old question; however, I want to add that you must create a new service account and not use an old one. A recent Google Cloud Next presentation on security stated that there is no guarantee that the default service account will exist in the future, and that it could be removed at any time (or its available permissions changed), so none of your applications should depend on it. I've also found that there are potential authentication issues when using the default service account, and creating a new one is more likely to give you the control you need to authenticate successfully.

Refer to the following YouTube presentation from 11mins 10s in:

https://youtu.be/ZQHoC0cR6Qw?t=670