0
votes

I am importing financial data using JSON from the web into excel, but as the source uses pagination (giving 50 results per page I need to implement pagination in order to import all the results.

The data source is JSON:

    https://localbitcoins.com//sell-bitcoins-online/VES/.json?page=1 
    or https://localbitcoins.com//sell-bitcoins-online/VES/.json?page=2

?page=1, ?page=2, ?page=3

I use the following code to implement pagination, but receive an error:

= (page as number) as table =>
let
    Source = Json.Document(Web.Contents("https://localbitcoins.com//sell-bitcoins-online/VES/.json?page="  & Number.ToText(page) )),
    Data1 = Source{1}[Data],
    RemoveBottom = Table.RemoveLastN(Data1,3)
in
    RemoveBottom

When I envoke a parameter (1 for page 1) to test it I get the following error and I can't seem to find out why?

An error occurred in the ‘GetData’ query. Expression.
Error: We cannot convert a value of type Record to type List.
Details:
    Value=Record
    Type=Type

For the record, I try to include page handling using ListGenerate:

= List.Generate( ()=>
[Result= try GetData(1) otherwise null, page = 1],
        each [Result] <> null,
        each [Result = try GetData([page]+1) otherwise null, Page = [Page]+1],
        each [Result])

What is the default way to implement pagination using Power Query in MS Excel?

1

1 Answers

1
votes

I realise you asked this nearly a month ago and may have since found an answer, but will respond anyway in case it helps someone else.

This line Data1 = Source{1}[Data] doesn't make sense to me, since I think Source will be a record and you can't use {1} positional lookup syntax with records.

The code below returns 7 pages for me. You may want to check if it's getting all the pages you need/expect.

let
    getPageOfData = (pageNumber as number) =>
        let
            options = [
                Query = [page = Number.ToText(pageNumber)]
            ],
            url = "https://localbitcoins.com/sell-bitcoins-online/VES/.json",
            response = Web.Contents(url, options),
            deserialised = Json.Document(response)
        in deserialised,
    responses = List.Generate(
        () => [page = 1, response = getPageOfData(page), lastPage = null],
        each [lastPage] = null or [page] <= [lastPage],
        each [
            page = [page] + 1,
            response = getPageOfData(page),
            lastPage = if [lastPage] = null then if Record.HasFields(response[pagination], "next") then null else page else [lastPage]
        ],
        each [response]
    )
in
    responses

In List.Generate, my selector only picks the [response] field to keep things simple. You could drill deeper into the data either within selector itself (e.g. each [response][data][ad_list]) or create a new step/expression and use List.Transform to do so.

After a certain amount of drilling down and transforming, you might see some data like:

Output

but that depends on what you need the data to look like (and which columns you're interested in).


By the way, I used getPageOfData in the query above, but this particular API was including the URL for the next page in its responses. So pages 2 and thereafter could have just requested the URL in the response (rather than calling getPageOfData).