13
votes

I would like to develop an app engine application that directly stream data into a BigQuery table.

According to Google's documentation there is a simple way to stream data into bigquery:

Here is the sample code snippet on how streaming insert should be coded:

body = {"rows":[
{"json": {"column_name":7.7,}}
]}

response = bigquery.tabledata().insertAll(
   projectId=PROJECT_ID,
   datasetId=DATASET_ID,
   tableId=TABLE_ID,
   body=body).execute()

Although I've downloaded the client api I didn't find any reference to a "bigquery" module/object referenced in the above Google's example.

Where is the the bigquery object (from snippet) should be located?

Can anyone show a more complete way to use this snippet (with the right imports)?

I've Been searching for that a lot and found documentation confusing and partial.

2

2 Answers

11
votes

Minimal working (as long as you fill in the right ids for your project) example:

import httplib2
from apiclient import discovery
from oauth2client import appengine

_SCOPE = 'https://www.googleapis.com/auth/bigquery'

# Change the following 3 values:
PROJECT_ID = 'your_project'
DATASET_ID = 'your_dataset'
TABLE_ID = 'TestTable'


body = {"rows":[
    {"json": {"Col1":7,}}
]}

credentials = appengine.AppAssertionCredentials(scope=_SCOPE)
http = credentials.authorize(httplib2.Http())

bigquery = discovery.build('bigquery', 'v2', http=http)
response = bigquery.tabledata().insertAll(
   projectId=PROJECT_ID,
   datasetId=DATASET_ID,
   tableId=TABLE_ID,
   body=body).execute()

print response

As Jordan says: "Note that this uses the appengine robot to authenticate with BigQuery, so you'll to add the robot account to the ACL of the dataset. Note that if you also want to use the robot to run queries, not just stream, you need the robot to be a member of the project 'team' so that it is authorized to run jobs."

3
votes

Here is a working code example from an appengine app that streams records to a BigQuery table. It is open source at code.google.com:

http://code.google.com/p/bigquery-e2e/source/browse/sensors/cloud/src/main.py#124

To find out where the bigquery object comes from, see http://code.google.com/p/bigquery-e2e/source/browse/sensors/cloud/src/config.py

Note that this uses the appengine robot to authenticate with BigQuery, so you'll to add the robot account to the ACL of the dataset.

Note that if you also want to use the robot to run queries, not just stream, you need to robot to be a member of the project 'team' so that it is authorized to run jobs.