
I searched for a Python API for interacting with Google BigQuery and found two packages that provide similar APIs: the Google BigQuery client (part of the Google API Client package, googleapiclient) and the gcloud package (gcloud).

Here is the documentation for using these two APIs with BigQuery:

Google API Client: googleapiclient

https://developers.google.com/resources/api-libraries/documentation/bigquery/v2/python/latest/index.html
https://cloud.google.com/bigquery/docs/reference/v2/

Google Cloud package: gcloud

http://googlecloudplatform.github.io/gcloud-python/stable/bigquery-usage.html

Both packages are from Google and provide similar functionality for interacting with BigQuery. I have the following questions:

  1. Both packages seem to cover a wide range of Google Cloud Platform functionality. As I understand it, gcloud also provides a command-line tool and local environment setup. In general, what are the differences between these two packages?

  2. As Python modules, how does their usage differ?

  3. Is there any relation between these two packages?

  4. Which is more suitable for accessing BigQuery?

  5. What kinds of jobs is each one suited for?

1 Answer


The googleapiclient client is generated directly from the raw API definition (a JSON file, hosted here).

Because it is automatically generated, it is not what any sane Python programmer would write if they were designing a Python client for BigQuery by hand. That said, it is the lowest-level representation of the API.

The gcloud client, on the other hand, was what a group of more-or-less sane folks at Google came up with when they tried to figure out what a client should look like for BigQuery. It is really quite nice, and lets you focus on what's important rather than converting results from the strange f/v format used in the BigQuery API into something useful.
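To give a feel for the f/v format mentioned above, here is a minimal sketch of the conversion you end up writing yourself when working against the raw API. It assumes a response shaped like the BigQuery v2 query results (a `schema.fields` list plus `rows`, where each row is `{"f": [{"v": ...}, ...]}`) and handles only flat, non-repeated fields. The helper name `rows_to_dicts`, the `CASTS` table, and the sample data are made up for illustration; they are not part of either package.

```python
# Hypothetical helper: turn raw f/v rows into plain Python dicts.
# Assumption: the raw API returns cell values as strings (or None),
# so a small cast table is used for the numeric types.
CASTS = {"INTEGER": int, "FLOAT": float}


def rows_to_dicts(response):
    """Convert a raw query response into a list of {column: value} dicts."""
    fields = response["schema"]["fields"]
    result = []
    for row in response.get("rows", []):
        record = {}
        # Cells in row["f"] are positional and line up with schema fields.
        for field, cell in zip(fields, row["f"]):
            value = cell["v"]
            cast = CASTS.get(field["type"])
            if value is not None and cast is not None:
                value = cast(value)
            record[field["name"]] = value
        result.append(record)
    return result


# Illustrative sample data only, not real query output.
sample_response = {
    "schema": {
        "fields": [
            {"name": "word", "type": "STRING"},
            {"name": "word_count", "type": "INTEGER"},
        ]
    },
    "rows": [
        {"f": [{"v": "hamlet"}, {"v": "42"}]},
        {"f": [{"v": "king"}, {"v": None}]},
    ],
}

for record in rows_to_dicts(sample_response):
    print(record)
```

This kind of bookkeeping (plus nested and repeated fields, which the sketch ignores) is the sort of detail the gcloud client takes care of for you.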

Additionally, the documentation for the gcloud API was written by a doc writer. The documentation for the googleapiclient was, like the code, automatically generated from a definition of the API.

My advice, having used both (and having, mostly unsuccessfully, helped design the BigQuery API to try to make the generated client behave reasonably), is to use the gcloud client. It will handle a bunch of low-level details for you and generally make your life easier.