
I was going through Google's BigQuery documentation on the official website, and I am a little confused about the streaming insert quota policy. The following points are mentioned on the page:

1. Maximum row size: 1 MB
2. HTTP request size limit: 10 MB
3. Maximum rows per second: 100,000 rows per second, per table. Exceeding this amount will cause quota_exceeded errors.
4. Maximum rows per request: 500
5. Maximum bytes per second: 100 MB per second, per table. Exceeding this amount will cause quota_exceeded errors.

I am confused by the 3rd and 4th points. We can set data using new TableDataInsertAllRequest().setRows(rowList), and I assumed rowList.size() could be up to 100,000 (rough sketch of my usage below). For inserting we can use tabledata().insertAll(...).execute().

Can anyone explain the 3rd and 4th points in detail? Thanks in advance.
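Here is roughly how I am building the request with the Java API client. The Bigquery client construction is omitted, and the project/dataset/table IDs and row contents are placeholders, not real values:

    import java.util.ArrayList;
    import java.util.List;
    import java.util.Map;

    import com.google.api.services.bigquery.Bigquery;
    import com.google.api.services.bigquery.model.TableDataInsertAllRequest;
    import com.google.api.services.bigquery.model.TableDataInsertAllResponse;

    public class StreamingInsert {

        // bigquery is an already authenticated client (construction omitted)
        static TableDataInsertAllResponse streamRows(Bigquery bigquery,
                List<Map<String, Object>> jsonRows) throws java.io.IOException {

            // Wrap each JSON row in a TableDataInsertAllRequest.Rows entry
            List<TableDataInsertAllRequest.Rows> rowList = new ArrayList<>();
            for (Map<String, Object> json : jsonRows) {
                rowList.add(new TableDataInsertAllRequest.Rows().setJson(json));
            }

            TableDataInsertAllRequest request =
                    new TableDataInsertAllRequest().setRows(rowList);

            // "my-project", "my_dataset", "my_table" are placeholders
            return bigquery.tabledata()
                    .insertAll("my-project", "my_dataset", "my_table", request)
                    .execute();
        }
    }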

Presumably you can insert up to 500 rows per request, and make up to 200 requests per second (200 × 500 = 100,000). – Andy Turner
@AndyTurner Thanks :) – vvp

1 Answer


Suppose you use a lot of parallel workers to send streaming inserts, for example thousands of servers at the same time.

If you sum all the rows being streamed by your machines, together they might amount to more than 100,000 rows per second. Each individual request carries at most 500 rows, but together a large cluster can stream more than 100,000 rows per second. If you reach that limit, you may need to contact support to have it raised.

So you need to understand that one payload must be small and fit within 500 rows. If you want to stream more, you need to do the streaming in parallel (rough sketch below). To get started it's good to have a message queue system like Beanstalkd, and you can watch over your jobs using the Beanstalkd admin console.
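As a rough sketch of that idea: split the rows into chunks of at most 500 and stream each chunk from a parallel worker. The insertBatch stub and the worker count here are placeholders, and a plain Java thread pool stands in for a real queue like Beanstalkd:

    import java.util.List;
    import java.util.Map;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.TimeUnit;

    public class ParallelStreamer {

        private static final int MAX_ROWS_PER_REQUEST = 500; // quota point 4

        // Stub standing in for the single insertAll call shown in the question
        static void insertBatch(List<Map<String, Object>> chunk) {
            System.out.println("streaming " + chunk.size() + " rows");
        }

        static void streamAll(List<Map<String, Object>> allRows)
                throws InterruptedException {
            // Worker count is a tuning knob: all workers combined must stay
            // under 100,000 rows per second per table (quota point 3)
            ExecutorService pool = Executors.newFixedThreadPool(8);

            // Split the input into chunks of at most 500 rows each
            for (int i = 0; i < allRows.size(); i += MAX_ROWS_PER_REQUEST) {
                List<Map<String, Object>> chunk = allRows.subList(
                        i, Math.min(i + MAX_ROWS_PER_REQUEST, allRows.size()));
                pool.submit(() -> insertBatch(chunk));
            }

            pool.shutdown();
            pool.awaitTermination(10, TimeUnit.MINUTES);
        }
    }

With a real queue, you would push the chunks as jobs and have each worker pull one, send the insertAll request, and retry if it gets a quota_exceeded response.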