
I'm completely new to BigQuery and to the Python API for BigQuery.

My task is to pull data from a Google public dataset using BigQuery and the Python API, and then load that data into a local MySQL/PostgreSQL database (the one installed on my PC).

The way I see it (after two hours of research) is: pull the data into a local CSV file (using this kind of sample code), then load the data from the file into the database.
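Roughly, I imagine something like the sketch below, using the google-cloud-bigquery and psycopg2 client libraries (the query, table name and connection details are just placeholders I made up, and I haven't verified any of this):

    import csv

    from google.cloud import bigquery   # pip install google-cloud-bigquery
    import psycopg2                     # pip install psycopg2-binary

    # Relies on GOOGLE_APPLICATION_CREDENTIALS pointing at a service-account key.
    bq = bigquery.Client()

    # Placeholder query against one of the public datasets.
    query = """
        SELECT name, SUM(number) AS total
        FROM `bigquery-public-data.usa_names.usa_1910_2013`
        GROUP BY name
        ORDER BY total DESC
        LIMIT 1000
    """
    rows = bq.query(query).result()   # run the query and wait for it to finish

    # Step 1: dump the result set to a local CSV file.
    with open("names.csv", "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow([field.name for field in rows.schema])   # header row
        for row in rows:
            writer.writerow(row.values())

    # Step 2: bulk-load the CSV into a local PostgreSQL table created beforehand,
    # e.g. CREATE TABLE names_total (name text, total bigint);
    conn = psycopg2.connect("dbname=mydb user=me password=secret host=localhost")
    with conn, conn.cursor() as cur, open("names.csv") as f:
        cur.copy_expert("COPY names_total FROM STDIN WITH CSV HEADER", f)
    conn.close()

(COPY is PostgreSQL-specific; for MySQL the equivalent bulk path would be LOAD DATA LOCAL INFILE.)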

Is this the right way to perform the task?

If not, could you please provide some advice/instruction?

Thanks in advance.

That's the right way to do it. – Pentium10
Can it be done omitting the CSV file step? Sort of on-the-go? – Dennis
You could run the query and process the output: developers.google.com/bigquery/docs/reference/v2/jobs/… But exporting into CSV would speed up the import into your SQL database. – Pentium10
I've sort of already done it. Now I'm going to polish it off and I'll post the solution here, because there is no such solution on the internet at the moment. – Dennis
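For reference, the "on-the-go" variant discussed in the comments, with no CSV file in between, could look roughly like this; it assumes the same google-cloud-bigquery and psycopg2 libraries and the same placeholder table, query and connection details:

    from google.cloud import bigquery
    import psycopg2
    from psycopg2.extras import execute_values

    bq = bigquery.Client()
    rows = bq.query("""
        SELECT name, SUM(number) AS total
        FROM `bigquery-public-data.usa_names.usa_1910_2013`
        GROUP BY name
    """).result()

    conn = psycopg2.connect("dbname=mydb user=me password=secret host=localhost")
    with conn, conn.cursor() as cur:
        # Batch the inserts so a large result set is not one round trip per row.
        execute_values(
            cur,
            "INSERT INTO names_total (name, total) VALUES %s",
            (row.values() for row in rows),
            page_size=1000,
        )
    conn.close()

This skips the intermediate file, but for large result sets the CSV-plus-bulk-load route is usually faster, which is what the comment above points out.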

1 Answer


The API method

The steps you listed above are correct. In summary, a developer would:

  1. Use the BigQuery API's "Jobs: insert" method to insert a data extraction job; this results in the data being exported to a Google Cloud Storage bucket.

  2. Use the Google Cloud Storage API's "Objects: get" method to download the file locally.

  3. For extra credit, I would query the original BQ table using the "Tables: get" method to get the original BigQuery table schema, and use this to build the schema for the MySQL import step (a rough sketch of all three steps follows).
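A rough sketch of these three steps, using the current google-cloud-bigquery and google-cloud-storage client libraries (which wrap the REST methods named above); the table, bucket and file names are only placeholders:

    from google.cloud import bigquery, storage

    bq = bigquery.Client()
    table_id = "bigquery-public-data.usa_names.usa_1910_2013"   # placeholder table

    # Step 1: an extract job exports the table to a Cloud Storage bucket as CSV.
    extract_job = bq.extract_table(table_id, "gs://my-bucket/usa_names.csv")
    extract_job.result()   # block until the export job finishes

    # Step 2: download the exported CSV from the bucket to the local machine.
    gcs = storage.Client()
    gcs.bucket("my-bucket").blob("usa_names.csv").download_to_filename("usa_names.csv")

    # Step 3: read the original table's schema to help build the MySQL DDL.
    table = bq.get_table(table_id)
    for field in table.schema:
        print(field.name, field.field_type)   # e.g. "name STRING", "number INTEGER"

Note that exports larger than 1 GB need a wildcard destination URI (e.g. gs://my-bucket/usa_names-*.csv), which produces multiple files.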

The Web UI method

Why not just run whatever query you need using the Google BigQuery Web UI, then export the data into a Google Cloud Storage bucket via the UI? Why do you need to use Python?