14
votes

Is there an easy way to directly download all the data contained in a certain dataset on Google BigQuery? I'm currently downloading "as csv", making one query after another, but that doesn't allow me to get more than 15k rows, and the rows I need to download are over 5M. Thank you

4
developers.google.com/bigquery/bigquery-browser-tool#exportdata states you have to export it as a table to Google Cloud Storage if it's >16k rows - x29a

4 Answers

8
votes

You can run BigQuery extraction jobs using the Web UI, the command line tool, or the BigQuery API. The data is extracted to a Google Cloud Storage bucket.

For example, using the command line tool:

First install and auth using these instructions: https://developers.google.com/bigquery/bq-command-line-tool-quickstart

Then make sure you have an available Google Cloud Storage bucket (see Google Cloud Console for this purpose).

Then, run the following command:

bq extract my_dataset.my_table gs://mybucket/myfilename.csv

More on extracting data via API here: https://developers.google.com/bigquery/exporting-data-from-bigquery
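For tables past the single-file export limit, the same command also needs GZIP compression and a wildcard in the destination URI (see the next answer). A stdlib-only sketch that assembles such a command — the table, bucket, and prefix names are placeholders, and while `--compression=GZIP` is a real `bq extract` flag, check `bq extract --help` for your version:

```python
import shlex

def bq_extract_cmd(table, bucket, prefix, compress=True):
    """Build a `bq extract` command line for a possibly multi-shard CSV export."""
    # A `*` in the destination URI lets BigQuery split large exports into shards.
    uri = f"gs://{bucket}/{prefix}_*.csv" + (".gz" if compress else "")
    cmd = ["bq", "extract"]
    if compress:
        cmd.append("--compression=GZIP")
    cmd += [table, uri]
    return " ".join(shlex.quote(part) for part in cmd)

print(bq_extract_cmd("my_dataset.my_table", "mybucket", "myfilename"))
```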

7
votes

Detailed step-by-step to download large query output

  1. enable billing

    You have to give your credit card number to Google to export the output, and you might have to pay.

    But the free quota (1TB of processed data) should suffice for many hobby projects.

  2. create a project

  3. associate billing to a project

  4. do your query

  5. create a new dataset

  6. click "Show options" and enable "Allow Large Results" if the output is very large

  7. export the query result to a table in the dataset

  8. create a bucket on Cloud Storage.

  9. export the table to the created bucket on Cloud Storage.

    • make sure to select GZIP compression

    • use a name like <bucket>/prefix.gz.

      If the output is very large, the file name must have an asterisk * and the output will be split into multiple files.

  10. download the table from cloud storage to your computer.

    It does not seem possible to download multiple files from the web interface if the large file got split up, but you can install gsutil and run:

    gsutil -m cp -r 'gs://<bucket>/prefix_*' .
    

    See also: Download files and folders from Google Storage bucket to a local folder

    There is a gsutil package in Ubuntu 16.04, but it is unrelated.

    You must install and set it up as documented at: https://cloud.google.com/storage/docs/gsutil

  11. unzip locally:

    for f in *.gz; do gunzip "$f"; done
    
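The decompression loop above can also be done in Python with only the standard library, which is handy on systems without a shell. A minimal sketch, assuming the shards sit in one directory and end in `.gz`:

```python
import glob
import gzip
import os
import shutil

def gunzip_all(directory="."):
    """Decompress every .gz shard in `directory`, like the shell loop above."""
    for path in glob.glob(os.path.join(directory, "*.gz")):
        # Strip the .gz suffix for the output file name.
        with gzip.open(path, "rb") as src, open(path[:-3], "wb") as dst:
            shutil.copyfileobj(src, dst)
```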

Here is a sample project I needed this for which motivated this answer.

1
votes

Yes, the steps suggested by Michael Manoochehri are correct and an easy way to export data from Google BigQuery.
I have written a bash script so that you do not have to repeat these steps every time; just use it. Here is the GitHub URL: https://github.com/rajnish4dba/GoogleBigQuery_Scripts

Scope:
1. export data based on your BigQuery SQL.
2. export data based on your table name.
3. transfer the export file to an SFTP server.
Try it and let me know your feedback.
For help, use ExportDataFromBigQuery.sh -h

1
votes

For Python you can use the following code; it will download the data as a pandas DataFrame.

from google.cloud import bigquery

def read_from_bqtable(bq_projectname, bq_query):
    client = bigquery.Client(project=bq_projectname)
    bq_data = client.query(bq_query).to_dataframe()  # run the query and load the result into a DataFrame
    return bq_data

bigQueryTableData_df = read_from_bqtable('gcp-project-id', 'SELECT * FROM `gcp-project-id.dataset-name.table-name` ')
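Note that to_dataframe() holds the whole result in memory (and needs pandas installed); for results in the millions of rows, as in the question, it can be safer to stream rows to disk instead. A stdlib-only sketch of the writing side — here `rows` stands in for the row iterator the client returns (e.g. from `client.query(sql).result()`), and the column names are placeholders:

```python
import csv

def rows_to_csv(rows, fieldnames, path):
    """Write an iterable of dict-like rows to a CSV file one row at a time."""
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        for row in rows:
            # Works for plain dicts and for BigQuery Row objects, which support [] access.
            writer.writerow({name: row[name] for name in fieldnames})

# usage sketch (client as in the answer above):
# rows_to_csv(client.query(bq_query).result(), ["col_a", "col_b"], "out.csv")
```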