My problem:
Through New Query -> From Other Sources -> From Web, I entered a static URL that let me load approximately 60k "IDs" from a webpage in JSON format.
- I believe each of these IDs corresponds to an item.
- They are all loaded and organised in a column, with one ID per line, inside a Query tab.
- So far, no problem.
Now I need to import information from a dynamic URL that depends on the ID.
So I need to import from a URL of this form:
http://www.example.com/xxx/xxxx/ID
- This imports the following for each ID:
  - name of the corresponding item,
  - average price,
  - supply,
  - demand,
  - etc.
After some research I concluded that I have to use the "Advanced Editor" inside the query editor to reference the ID query tab.
- However, I have no idea how to join the static part of the URL with the ID, or how to repeat that over the 60k lines.
I tried this:
let
    Source = Json.Document(Web.Contents("https://example.com/xx/xxxx/" & ID)),
    name1 = Source[name]
in
    name1
This returns an error.
I think it's because I can't concatenate a string and a column.
Question: How do I reference the value of the cell I'm interested in and append it to my string?
Question: Is what I'm doing viable?
Question: How is Excel going to handle loading 60k queries?
- Each query is only a few words to import.
Question: Is it possible to load information from 60k different URLs with one query?
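A minimal sketch of how a per-row value can be joined to a fixed string, offered as an illustration rather than a definitive answer. It assumes the IDs are numeric and sit in a column named ID; the small inline table is a hypothetical stand-in for the real ID query, and no web request is made:

```
let
    // Hypothetical stand-in for the ID query: a one-column table of numeric IDs
    IDs = Table.FromColumns({{205, 651, 320165}}, {"ID"}),
    // Inside "each", [ID] is the value of the ID column in the current row;
    // Text.From converts the number to text so that & can concatenate it
    WithUrl = Table.AddColumn(IDs, "URL", each "http://www.example.com/xxx/xxxx/" & Text.From([ID]), type text)
in
    WithUrl
```

The key point is that [ID] only resolves to a single cell value inside a row-by-row step such as Table.AddColumn; at the top level of a query there is no current row to refer to.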
EDIT: Thank you very much for your answer, Alexis, it was very helpful. To avoid copying what you posted, I did it without the function (tell me what you think of it):
let
    // Load the full list of IDs from the static URL
    Source = Json.Document(Web.Contents("https://example.com/all-ID.json")),
    items1 = Source[items],
    #"Converted to Table" = Table.FromList(items1, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
    #"Renamed Columns" = Table.RenameColumns(#"Converted to Table",{{"Column1", "ID"}}),
    // Build one URL per row by appending the row's ID to the static part
    #"Inserted Merged Column" = Table.AddColumn(#"Renamed Columns", "URL", each Text.Combine({"http://example.com/api/item/", Text.From([ID], "fr-FR")}), type text),
    // Fetch and parse the JSON for each row's URL (one web request per row)
    #"Added Custom" = Table.AddColumn(#"Inserted Merged Column", "Item", each Json.Document(Web.Contents([URL]))),
    #"Expanded Item" = Table.ExpandRecordColumn(#"Added Custom", "Item", {"name"}, {"Item.name"})
in
    #"Expanded Item"
Now the problem is that it takes ages to load all the information I need from all the URLs.
As it turns out, it's possible to query multiple IDs at once using this format: http://example.com/api/item/ID1,ID2,ID3,ID4,...,IDN
I presume that loading from a single URL containing all of the IDs at once would not work, because the URL would contain far too many characters to handle.
So to speed things up, what I'm trying to do now is concatenate every N rows into one cell, for example with N=3:
205
651
320165
63156
4645
31
6351
561
561
31
35
would become:
205, 651, 320165
63156, 4645, 31
6351, 561, 561
31, 35
The "Group By" functionality doesn't seem to be what I'm looking for, and I'm not sure how to automate that through Power Query.
EDIT 2
So after a lot of testing I found a solution, even though it might not be the most elegant or optimal:
- I created an index column with a step of 1
- I created another custom column that assigns every block of N rows the same number, increasing by N from one block to the next
- I used "Group By" -> "All Rows" on that column to create a "Count" column holding the rows of each block
- I created a custom column with the formula [Count][ID], which gives the list of IDs in each block
- Finally I extracted the values from that column with a "," separator
Here's the code for N = 10000:
let
    Source = Json.Document(Web.Contents("https://example.com/items.json")),
    items1 = Source[items],
    #"Converted to Table" = Table.FromList(items1, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
    #"Renamed Columns" = Table.RenameColumns(#"Converted to Table",{{"Column1", "ID"}}),
    #"Changed Type" = Table.TransformColumnTypes(#"Renamed Columns",{{"ID", Int64.Type}}),
    #"Added Index" = Table.AddIndexColumn(#"Changed Type", "Index", 0, 1),
    // Block key: 0 for rows 0..9999, 10000 for rows 10000..19999, and so on
    // (the original if/else produced the same value in both branches)
    #"Added Conditional Column" = Table.AddColumn(#"Added Index", "Custom", each Number.IntegerDivide([Index], 10000) * 10000),
    #"Reordered Columns" = Table.ReorderColumns(#"Added Conditional Column",{"Index", "ID", "Custom"}),
    // Group by block key, keeping all rows of each block as a nested table
    #"Grouped Rows" = Table.Group(#"Reordered Columns", {"Custom"}, {{"Count", each _, type table}}),
    // Pull the ID column out of each nested table as a list
    #"Added Custom" = Table.AddColumn(#"Grouped Rows", "Custom.1", each [Count][ID]),
    // Join each list of IDs into one comma-separated text value
    #"Extracted Values" = Table.TransformColumns(#"Added Custom", {"Custom.1", each Text.Combine(List.Transform(_, Text.From), ","), type text})
in
    #"Extracted Values"
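A possibly shorter way to get the same grouping, given as a sketch under stated assumptions rather than a tested replacement for the query above: List.Split cuts a list into pages of a given size, so the index and "Group By" steps may not be needed. The inline list of IDs is the sample from the N=3 example, and the step names are made up for illustration:

```
let
    // Sample IDs from the N=3 example; in the real query this would be Source[items]
    IDs = {205, 651, 320165, 63156, 4645, 31, 6351, 561, 561, 31, 35},
    // Split the list into sub-lists of at most 3 items each
    Chunks = List.Split(IDs, 3),
    // Convert each ID to text and join each sub-list with commas
    Joined = List.Transform(Chunks, each Text.Combine(List.Transform(_, Text.From), ",")),
    // Put the joined strings back into a one-column table
    Result = Table.FromColumns({Joined}, {"IDs"})
in
    Result
```

With the sample list this gives four rows, the last one holding only the two leftover IDs. Changing the 3 to 10000 would match the batch size used above, although whether a URL with that many IDs is accepted depends on the server.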