1
votes

I have a table that has data inserted in a specific order. Am I guaranteed that by using say 0-of-10 partition decorator, the query is run over the first 10% of the table in the order in which the data was inserted, 1-of-10 partition decorator would query over the next 10% of the data in the order they were inserted etc.?

Also does count(*) on partition decorators result in a table scan (and hence cost non-zero amount) or does it only use metadata (and hence is free)?

1

1 Answers

1
votes

In BigQuery insert doesn't guarantee you an order, so you should not be confused. Inserts are async jobs that run in parallel on multiple ingesting nodes so there is no atomic order defined. Also not just in BigQuery but other database engines don't assure you data retrieval in the order of inserts, so this is a flaw in your thinking. If you need atomic order than you should check out other dbs like Redis which have data types specific for these kind of problems.

  1. SELECT title FROM [publicdata:samples.wikipedia@0] in this example @0 means something like "show me how the table looked 7 days ago" (with 7 days being the oldest available snapshot. So it's just a snapshot of the table, and not something that can be related to order of data.

When you retrieve the data you don't have any assurance the data is returned in insert order.

  1. Very simple counts are only using meta data, and the costs are free. But some counts take full table scans, and produce costs, mostly because of the where conditions.