1
votes

In traditional data modeling, I create hourly and daily rollup table to reduce data storage and improve query response time. However, the attempt to create similar rollup table easily run into "Response too large to return" error. What is the recommended method to create rollup table with BigQuery? I need to reduce data to reduce cost of storage and query.

Thx!

2

2 Answers

0
votes

It sounds like you are appending all of your data to a single table, then want to create smaller tables to query over ... is that correct?

One option would be to load your data in the hourly slices, then create the daily and 'all' tables by performing table copy operations with write_disposition=WRITE_APPEND. Alternately, you can use multiple tables in your queries. For example select foo from table20130101,table20130102,table20130102. (Note this does not do a join, it does a UNION ALL. It is a quirk of the bigquery query syntax).

If it will be difficult to change the layout of your tables, there isn't currently support for larger query result sizes, but it is something that is one of our most requested features and we have it a high priority.

Also, creating smaller tables won't necessarily improve query performance, since bigquery processes queries in parallel to the extent possible. It won't reduce storage costs, unless you're only going to store part of the table. It will, of course, reduce the costs of a query, since running queries against larger tables is more expensive.

If you describe your scenario a bit more I may be able to offer more concrete advice.

1
votes

A recently announced BigQuery feature allows large results!

Now you can specify a flag and a destination table. Results of arbitrary size will be stored in the designated table.

https://developers.google.com/bigquery/docs/queries#largequeryresults