1
votes

I'm trying to use the JSON API (v2) of BigQuery. My code shows the same behaviour that I get on the documentation page for tabledata.list.

My table has about 11,000 rows. On the documentation page I fill in the following parameters:

  • ProjectId = X
  • DatasetId = Y
  • TableId = Z
  • MaxResults = 10000 #I want to paginate my results

This returns 10,000 rows and a pageToken. So I make the same request again, this time setting the page token so that I get the next page of results.

But that returns the same 10,000 rows as before. I expected this to paginate as described on this page:

All collection.list methods return paginated results under certain circumstances. The number of results per page is controlled by the maxResults property

A page is a subset of the total number of rows. If your results are more than one page of data, the result data will have a nextPageToken property. To retrieve the next page of results, make another list call and include the token value as a URL parameter named pageToken.

Where do I go wrong?

EDIT:

My colleague pointed out that on the other documentation pages the result contains a nextPageToken, whereas this response contains a pageToken. The difference would be that pageToken refers to the current page, while nextPageToken refers to the next page.

However, the documentation states that the response should contain a nextPageToken (except when there is no more data), and here there is more data: len(table) > len(result).


2 Answers

2
votes

The same page mentions that the TableData.List() call behaves differently:

The bigquery.tabledata.list method, which is used to page through table data, uses a row offset value or a page token.

So for TableData.List() you must use the row offset value to paginate, and to access previous pages you can reuse the hashes from your session. It is built this way because, with large volumes of data, the next set of rows cannot be pre-cached from the worker pool.

You can help improve the documentation by using the "Feedback on this document" link at the top right of each page; feel free to use it to suggest improvements.

You can also submit issues at https://code.google.com/p/google-bigquery/issues/list
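As a rough illustration of paging by row offset, here is a minimal Python sketch. It assumes the offset is sent as the `startIndex` request parameter and that the response carries `rows` and `totalRows` fields. `fetch_page` and `make_fake_fetch` are hypothetical helpers, not part of any client library: the fake one slices an in-memory list so the loop can be run without credentials, and in real code it would wrap your HTTP call.

```python
from typing import Callable, Dict, Iterator, List


def iter_rows_by_offset(
    fetch_page: Callable[[int, int], Dict], page_size: int
) -> Iterator[Dict]:
    """Yield every row by advancing a row offset one page at a time.

    fetch_page(start_index, max_results) is a hypothetical callable that
    performs one tabledata.list request and returns the parsed JSON body.
    """
    start_index = 0
    while True:
        response = fetch_page(start_index, page_size)
        rows = response.get("rows", [])
        if not rows:
            break  # an empty page means there is nothing left to read
        yield from rows
        start_index += len(rows)
        # Assumption: totalRows is returned as a string-encoded integer.
        total_rows = int(response.get("totalRows", 0))
        if total_rows and start_index >= total_rows:
            break


def make_fake_fetch(table: List[Dict]) -> Callable[[int, int], Dict]:
    """Stand-in for the real request: serves slices of an in-memory table."""

    def fetch_page(start_index: int, max_results: int) -> Dict:
        chunk = table[start_index:start_index + max_results]
        return {"totalRows": str(len(table)), "rows": chunk}

    return fetch_page


if __name__ == "__main__":
    table = [{"f": [{"v": str(i)}]} for i in range(11000)]
    rows = list(iter_rows_by_offset(make_fake_fetch(table), 10000))
    print(len(rows))
```

With an 11,000-row table and a page size of 10,000, the loop makes two requests: one starting at offset 0 and one starting at offset 10,000.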

1
votes

Unfortunately, the field returned for TableData.List() that contains the logical "next page token" is literally named "pageToken", rather than "nextPageToken".

Other APIs, like Datasets.List(), return a field literally named "nextPageToken" which contains the logical "next page token".

It's a case of inconsistent naming, but hopefully this helps clear up some confusion.
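To make that concrete, here is a minimal Python sketch of a paging loop that reads the token from the `pageToken` field of each response and passes it back as the `pageToken` parameter of the next request. `fetch_page` and `make_fake_token_fetch` are hypothetical helpers rather than client-library functions. The fake one encodes the next offset as the token purely for demonstration; real tokens are opaque strings.

```python
from typing import Callable, Dict, Iterator, List, Optional


def iter_rows_by_token(
    fetch_page: Callable[[Optional[str], int], Dict], page_size: int
) -> Iterator[Dict]:
    """Yield every row, following the token returned with each page.

    fetch_page(page_token, max_results) is a hypothetical callable that
    performs one tabledata.list request and returns the parsed JSON body.
    """
    page_token = None  # no token on the first request
    while True:
        response = fetch_page(page_token, page_size)
        yield from response.get("rows", [])
        # For tabledata.list the "next page" token is in "pageToken",
        # not "nextPageToken" as in other list methods.
        page_token = response.get("pageToken")
        if not page_token:
            break  # no token means this was the last page


def make_fake_token_fetch(
    table: List[Dict],
) -> Callable[[Optional[str], int], Dict]:
    """Stand-in for the real request; the token is just the next offset."""

    def fetch_page(page_token: Optional[str], max_results: int) -> Dict:
        start = int(page_token) if page_token else 0
        end = start + max_results
        response: Dict = {"rows": table[start:end]}
        if end < len(table):
            response["pageToken"] = str(end)
        return response

    return fetch_page


if __name__ == "__main__":
    table = [{"f": [{"v": str(i)}]} for i in range(11000)]
    rows = list(iter_rows_by_token(make_fake_token_fetch(table), 10000))
    print(len(rows))
```

If the second request returns the same rows as the first, it is worth checking that the token from the response is actually being sent as the `pageToken` URL parameter of the follow-up call.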