2
votes

As part of a tool I am creating for my team I am connecting to an internal web service via PowerQuery.

The web service returns nested JSON, and I am having trouble parsing it into the format I need. Specifically, I cannot extract the contents of the records in a column into a comma-separated list.

The data

[Image: screenshot of the loaded data in the query editor]

As you can see, the data contains details related to a specific "race" (race_id). What I want to focus on is the information in the driver_codes column, which is a List of Records. The number of records varies from 0 to 4, and each record is structured as id: 50000 (where 50000 could be any five-digit number). So it could be:

id: 10000
id: 20000
id: 30000

As requested, an example snippet of the raw JSON:

<race>
    <race_id>ABC123445</race_id>
    <begin_time>2018-03-23T00:00:00Z</begin_time>
    <vehicle_id>gokart_11</vehicle_id>
    <driver_code>
        <id>90200</id>
    </driver_code> 
    <driver_code>
        <id>90500</id>
    </driver_code>
</race>

I want it to be structured as:

10000,20000,30000

The problem

When I choose "Extract values" on the column with the list, then I get the following message:

Expression.Error: We cannot convert a value of type Record to type Text.

If I instead choose "Expand to new rows", then duplicate rows are created for each unique driver code. I now have several rows per unique race_id, but what I wanted was one row per unique race_id and a concatenated list of driver codes.

What I have tried

I have tried grouping the data by the race_id, but the operations allowed when grouping data do not include concatenating rows.

I have also tried unpivoting the column, but that leaves me with the same problem: I still get multiple rows.

I have googled (and Stack Overflowed) this issue extensively without luck. It might be that I am using the wrong keywords, however, so I apologize if a duplicate exists.

UPDATE: What I have tried based on the answers so far

I tried Alexis Olson's excellent and very detailed method, but I end up with the following error:

Expression.Error: We cannot convert the value "id" to type Number.
Details:
    Value=id
    Type=Type

The error comes from using either of these lines of M code (one with a List.Transform and one without):

= Table.Group(#"Renamed Columns", {"race_id", "begin_time", "vehicle_id"},
    {{"DriverCodes", each Text.Combine([driver_code][id], ","), type text}})

= Table.Group(#"Renamed Columns", {"race_id", "begin_time", "vehicle_id"},
    {{"DriverCodes", each Text.Combine(List.Transform([driver_code][id], each Number.ToText(_)), ","), type text}})

NB: If I write only [id] instead of [driver_code][id], I get a different error saying that the column [id] does not exist.

I'm having difficulty reproducing your error. Can you provide a sample JSON string you're trying to do this with? – Alexis Olson
Added. Please see my edit. Does that make it clearer for you? – Grobsrop
Sure. That's XML, not JSON though. – Alexis Olson

2 Answers

4
votes

Here's the JSON equivalent to the XML example you gave:

{"race": {
    "race_id": "ABC123445",
    "begin_time": "2018-03-23T00:00:00Z",
    "vehicle_id": "gokart_11",
    "driver_code": [
      { "id": "90200" },
      { "id": "90500" }
    ]}}

If you load this into the query editor, convert it to a table, and expand out the Value record, you'll have a table that looks like this:

[Image: Start Table]
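For reference, the steps just described correspond roughly to the M below. This is only a sketch: the JSON is inlined as text so the query is self-contained (a real query would read from the web service instead), and the step names are the ones the UI typically generates, which may differ in your query.

```powerquery
let
    // Sample JSON from above, inlined as text (assumption: your real source is a web call).
    JsonText = "{""race"": {""race_id"": ""ABC123445"", ""begin_time"": ""2018-03-23T00:00:00Z"", ""vehicle_id"": ""gokart_11"", ""driver_code"": [{""id"": ""90200""}, {""id"": ""90500""}]}}",
    Source = Json.Document(JsonText),
    // One row per top-level field, with columns Name and Value.
    #"Converted to Table" = Record.ToTable(Source),
    // Expand the Value record into one column per field; driver_code stays a List.
    #"Expanded Value" = Table.ExpandRecordColumn(
        #"Converted to Table", "Value",
        {"race_id", "begin_time", "vehicle_id", "driver_code"})
in
    #"Expanded Value"
```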

At this point, choose Expand to New Rows, and then expand the id column so that your table looks like this:

[Image: Intermediate Table]
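In M, these two expansions look roughly like the sketch below. The start table is built by hand with #table so the snippet stands alone; StartTable is a hypothetical name, and the other step names are what the UI usually produces.

```powerquery
let
    // Hand-built stand-in for the start table: driver_code holds a list of records.
    StartTable = #table(
        {"Name", "race_id", "begin_time", "vehicle_id", "driver_code"},
        {{"race", "ABC123445", "2018-03-23T00:00:00Z", "gokart_11",
          {[id = "90200"], [id = "90500"]}}}),
    // "Expand to New Rows": one row per record in the driver_code list.
    #"Expanded driver_code" = Table.ExpandListColumn(StartTable, "driver_code"),
    // Expand each record so that its id field becomes a column.
    #"Expanded driver_code1" = Table.ExpandRecordColumn(
        #"Expanded driver_code", "driver_code", {"id"})
in
    #"Expanded driver_code1"
```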

Now you can apply the trick @mccard suggested. Group by the first four columns and aggregate over the last one (id) using, say, Max.

[Image: Group By]

This last step produces M code like this:

= Table.Group(#"Expanded driver_code1",
              {"Name", "race_id", "begin_time", "vehicle_id"},
              {{"id", each List.Max([id]), type text}})

Instead of this, you want to replace List.Max with Text.Combine as follows:

= Table.Group(#"Changed Type",
              {"Name", "race_id", "begin_time", "vehicle_id"},
              {{"id", each Text.Combine([id], ","), type text}})

Note that if your id column is not of type text, this will throw an error. To fix this, insert a step before you group rows, using Transform tab > Data Type: Text to convert the type. Another option is to use List.Transform inside your Text.Combine, like this:

Text.Combine(List.Transform([id], each Number.ToText(_)), ",")
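As a sketch of the first option, the type-conversion step and the grouping might look like this in M. The Expanded table is a hypothetical, cut-down stand-in in which id was loaded as a number.

```powerquery
let
    // Hypothetical expanded table where id came through as a number.
    Expanded = #table(
        type table [race_id = text, id = number],
        {{"ABC123445", 90200}, {"ABC123445", 90500}}),
    // M equivalent of Transform > Data Type: Text.
    #"Changed Type" = Table.TransformColumnTypes(Expanded, {{"id", type text}}),
    // With id as text, Text.Combine can join the values of each group.
    Grouped = Table.Group(
        #"Changed Type", {"race_id"},
        {{"id", each Text.Combine([id], ","), type text}})
in
    Grouped
```

If you are not sure whether id is text or a number, Text.From(_) in place of Number.ToText(_) accepts both.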

Either way, you should end up with this:

[Image: Final Table]
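As an aside, it should also be possible to skip the expand-and-group round trip and transform the list column in place. The sketch below is an alternative I am suggesting, not the method described above; JoinIds is a hypothetical helper name, and it assumes every record has an id field.

```powerquery
let
    // Hand-built stand-in for the start table: driver_code holds a list of records.
    StartTable = #table(
        {"Name", "race_id", "begin_time", "vehicle_id", "driver_code"},
        {{"race", "ABC123445", "2018-03-23T00:00:00Z", "gokart_11",
          {[id = "90200"], [id = "90500"]}}}),
    // Hypothetical helper: turns a list of [id = ...] records into "a,b,c".
    // A null cell is passed through unchanged; an empty list gives "".
    JoinIds = (codes as nullable list) as nullable text =>
        if codes = null then null
        else Text.Combine(List.Transform(codes, each Text.From([id])), ","),
    // Replace each list in driver_code with its comma-separated text.
    Result = Table.TransformColumns(StartTable, {{"driver_code", JoinIds, type text}})
in
    Result
```

This keeps one row per race throughout, so no grouping step is needed afterwards.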

1
votes

One approach is to use the Advanced Editor and change the grouping operation directly in the code.

First, create the grouping using one of the operations available in the menu. For instance, create a column "Sum" using the Sum operation. The step will produce an error, but it gives us the starting code to work from.

Then, open the Advanced Editor and find the code corresponding to the operation. It should be something like:

{{"Sum", each List.Sum([driver_codes]), type text}}

Change it to:

{{"driver_codes", each Text.Combine([driver_codes], ","), type text}}
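To illustrate, here is a minimal, self-contained sketch of the edited grouping step. The Expanded table and its values are made up for the example, and it assumes driver_codes already holds text values (one row per race and driver code).

```powerquery
let
    // Hypothetical expanded table: one row per (race, driver code).
    Expanded = #table(
        type table [race_id = text, driver_codes = text],
        {{"ABC123445", "90200"}, {"ABC123445", "90500"}, {"XYZ000001", "10000"}}),
    // Inside the aggregation, _ is the sub-table for one race_id, so
    // [driver_codes] is the list of that group's values.
    Grouped = Table.Group(
        Expanded, {"race_id"},
        {{"driver_codes", each Text.Combine([driver_codes], ","), type text}})
in
    Grouped
```

With this sample data, ABC123445 ends up with "90200,90500" and XYZ000001 with "10000".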