0
votes

I am writing a bigquery code to split a JSON data set into a more structured table.

The JSON_data_set looks something like this

Row | createdon | result
1   | 24022020  | {"searchResult": {"searchAccounts": [{"chainName": "xyxvjw", "address": {"name": "xyxvjw - ythji", "combined_city": "uptown", "combined_address": "1 downtown, uptown, 09728", "city": "uptown"}, "products": ["pin", "needle", "cloth"]}},{"chainName": "pwiewhds", "address": {"name": "pwiewhds - oujsus", "combined_city": "over the river", "combined_address": "100 under bridge, over the river, 19920", "city": "over the river"}, "products": ["tape", "stapler"]}}],"searchID": "3abci832832o0"}}
2   | 25020202  | {"searchResult": {"searchAccounts": [{"chainName": "xyxvjw2029", "address": {"name": "xyxvjw2029 - ythji", "combined_city": "uptown", "combined_address": "1 downtown, uptown, 09728", "city": "uptown"}, "products": ["pin", "needle", "cloth"]}},{"chainName": "pwiewhds8972", "address": {"name": "pwiewhds8972 - oujsus", "combined_city": "over the river", "combined_address": "100 under bridge, over the river, 19920", "city": "over the river"}, "products": ["tape", "stapler"]}}],"searchID": "3abci832832o0"}}

There are many subsequent account details in each row in the result column. Able to unnest the data using the below code to get column data such as chain name & address. However, when I try to call the broken down field columns, it gives me the error Cannot access field _field_1 on a value with type ARRAY-STRUCT-STRING, STRING>>

How can I separate the columns created from json data into individual columns and rows without being tied to the json row column?

CREATE TEMP FUNCTION json2array(json STRING)
RETURNS ARRAY<STRING>
LANGUAGE js AS """
   if (json !== null) {
    return JSON.parse(json).map(x=>JSON.stringify(x));
   }
"""; 

SELECT * EXCEPT(chains),
  ARRAY(SELECT AS STRUCT JSON_EXTRACT_SCALAR(x, '$.chainName'), JSON_EXTRACT_SCALAR(x, '$.address.combined_address') FROM UNNEST(chains) x WHERE JSON_EXTRACT_SCALAR(x, '$.chainName') IS NOT NULL) chain_names
FROM (
  SELECT *,
     json2array(
      JSON_EXTRACT(result, '$.searchResult.searchAccounts')
     ) chains
  FROM json_data_set
)
1

1 Answers

1
votes

Just needed to write the query a different way to achieve individual columns

CREATE TEMP FUNCTION json2array(json STRING)
RETURNS ARRAY<STRING>
LANGUAGE js AS """
   if (json !== null) {
    return JSON.parse(json).map(x=>JSON.stringify(x));
   }
"""; 

WITH chain_name AS (
  SELECT 
     *,
     json2array(
      JSON_EXTRACT(result, '$.searchResult.searchMerchants')
     ) chains
  FROM json_data_set
)
SELECT AS STRUCT 
   JSON_EXTRACT_SCALAR(x, '$.chainName') chainName, 
   JSON_EXTRACT_SCALAR(x, '$.address.combined_address') combined_address
FROM chain_name, UNNEST(chains) x
WHERE JSON_EXTRACT_SCALAR(x, '$.chainName') IS NOT NULL