
I'm following the Python Client Libraries documentation for the Google BigQuery API - https://googlecloudplatform.github.io/google-cloud-python/stable/bigquery/usage.html#jobs > Querying data (asynchronous).

When it comes to "Retrieve the results", executing the code:

rows, total_count, token = query.fetch_data()  # API request

always raises ValueError: too many values to unpack (expected 3). (By the way, I think there's a typo in the doc: it should be results.fetch_data() instead!)

However, the following code works fine:

results = job.results()
rows = results.fetch_data()
tbl = [x for x in rows]

All the rows of the table are returned (as a list of tuples) in tbl in a single shot, >225K rows!

Can anyone explain why I get the error, or is there something wrong in the doc?

How can I still retrieve the results in batches (iterating through them page by page)?

Thanks a lot in advance!

2 Answers


A while ago I opened this issue asking for the docs to be updated, but as you can see from the answers there, the change still requires an official release.

In the meantime, please refer to the code base itself for a better docstring (in this case, specifically the Iterator class):

"""Iterators for paging through API responses.
These iterators simplify the process of paging through API responses
where the response is a list of results with a ``nextPageToken``.
To make an iterator work, you'll need to provide a way to convert a JSON
item returned from the API into the object of your choice (via
``item_to_value``). You also may need to specify a custom ``items_key`` so
that a given response (containing a page of results) can be parsed into an
iterable page of the actual objects you want. You then can use this to get
**all** the results from a resource::
    >>> def item_to_value(iterator, item):
    ...     my_item = MyItemClass(iterator.client, other_arg=True)
    ...     my_item._set_properties(item)
    ...     return my_item
    ...
    >>> iterator = Iterator(..., items_key='blocks',
    ...                     item_to_value=item_to_value)
    >>> list(iterator)  # Convert to a list (consumes all values).
Or you can walk your way through items and call off the search early if
you find what you're looking for (resulting in possibly fewer
requests)::
    >>> for my_item in Iterator(...):
    ...     print(my_item.name)
    ...     if not my_item.is_valid:
    ...         break
At any point, you may check the number of items consumed by referencing the
``num_results`` property of the iterator::
    >>> my_iterator = Iterator(...)
    >>> for my_item in my_iterator:
    ...     if my_iterator.num_results >= 10:
    ...         break
When iterating, not every new item will send a request to the server.
To iterate based on each page of items (where a page corresponds to
a request)::
    >>> iterator = Iterator(...)
    >>> for page in iterator.pages:
    ...     print('=' * 20)
    ...     print('    Page number: %d' % (iterator.page_number,))
    ...     print('  Items in page: %d' % (page.num_items,))
    ...     print('     First item: %r' % (next(page),))
    ...     print('Items remaining: %d' % (page.remaining,))
    ...     print('Next page token: %s' % (iterator.next_page_token,))
    ====================
        Page number: 1
      Items in page: 1
         First item: <MyItemClass at 0x7f1d3cccf690>
    Items remaining: 0
    Next page token: eav1OzQB0OM8rLdGXOEsyQWSG
    ====================
        Page number: 2
      Items in page: 19
         First item: <MyItemClass at 0x7f1d3cccffd0>
    Items remaining: 18
    Next page token: None
To consume an entire page::
    >>> list(page)
    [
        <MyItemClass at 0x7fd64a098ad0>,
        <MyItemClass at 0x7fd64a098ed0>,
        <MyItemClass at 0x7fd64a098e90>,
    ]
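The original ValueError comes from exactly this change: fetch_data() returns an iterator over rows rather than a (rows, total_count, token) tuple, so unpacking it into three names fails whenever there are more than three rows. A minimal pure-Python illustration (fetch_data here is a hypothetical stand-in for the API call, not the real client method):

```python
def fetch_data():
    # Hypothetical stand-in for the real API call: in newer client
    # releases fetch_data() returns an iterator over rows, not a
    # (rows, total_count, token) tuple.
    return iter([("row-%d" % i,) for i in range(10)])

# Unpacking an iterator of 10 rows into 3 names raises the reported error:
try:
    rows, total_count, token = fetch_data()
    error = None
except ValueError as exc:
    error = str(exc)  # "too many values to unpack (expected 3)"

# Iterating over the returned object works fine, as in the question:
tbl = [x for x in fetch_data()]
```

This is why `tbl = [x for x in rows]` succeeds while the three-way unpacking does not.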

Yes, you are correct about the document. There is a typo - it should be results.fetch_data(), and the paging loop looks like this:

results = job.results()
rows, total_count, token = results.fetch_data()  # API request
while True:
    do_something_with(rows)
    if token is None:
        break
    rows, total_count, token = results.fetch_data(page_token=token)  # API request here

For big datasets we run an hourly query to fetch the data as part of our daily job.
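The token loop above can be sketched end to end with a stubbed fetch_data (PAGES and the stub are hypothetical stand-ins for the API, not real client code) to show how paging terminates once the returned token is None:

```python
# Hypothetical canned responses: page_token -> (rows, total_count, next_token)
PAGES = {
    None:   (["a", "b"], 5, "tok1"),
    "tok1": (["c", "d"], 5, "tok2"),
    "tok2": (["e"],      5, None),   # last page: no next token
}

def fetch_data(page_token=None):
    # Stub mimicking the documented (rows, total_count, token) return shape.
    return PAGES[page_token]

all_rows = []
rows, total_count, token = fetch_data()  # first "API request"
while True:
    all_rows.extend(rows)                # stand-in for do_something_with(rows)
    if token is None:
        break                            # no more pages
    rows, total_count, token = fetch_data(page_token=token)  # next page
```

Each trip through the loop issues one request, so memory use stays bounded by the page size rather than the full result set.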