1
votes

I've got a table in Google BigQuery which has repeated records in it, I've followed the guide at https://cloud.google.com/bigquery/docs/nested-repeated to create the table successfully and I've populated the table with some test data using

INSERT INTO `<project>.<dataset>.<table>` (<list of fields, ending with repeated record name>)
VALUES
    (
        "string1", false, 200.0, "string2", 0.2, 2.345, false, "2020-01-02 12:34:56",
        [
            ("repeated field str1", CAST(2.01 AS FLOAT64), CAST(201 as NUMERIC), false),
            ("repeated field str2", CAST(4.01 AS FLOAT64), CAST(702 as NUMERIC), true)
        ]
    );

(etc) And the table is successfully populated, also I can query the data with

select * from <dataset>.<table>

and all fields, repeated and non-repeated, are returned. I can also successfully query the non-repeated fields from the table, as long as no repeated fields are specified in the query. However when I want to include specific repeated fields in the query (and I'm following the guide at https://cloud.google.com/bigquery/docs/legacy-nested-repeated) for example

SELECT normalfield1, normalfield2, normalfield3, 
repeatedData.field1, repeatedData.field2, repeatedData.field3
FROM `profile_dataset.profile_betdatamultiples`;

I get error

Cannot access field <field name> on a value with type ARRAY<STRUCT<fieldname1 STRING, fieldname2 FLOAT64, fieldname3 NUMERIC, ...>> at [8:14]"

(annoyingly GCP truncates the error message so I can't see all of it)

Are there any suggestions for how to proceed here?

Thanks!

2

2 Answers

2
votes

Below is for BigQuery Standard SQL

#standardSQL
SELECT normalfield1, normalfield2, normalfield3, 
  data.field1, data.field2, data.field3
FROM `project.profile_dataset.profile_betdatamultiples`,
UNNEST(repeatedData) data   

If to apply to sample data in your question

enter image description here

output is

enter image description here

1
votes

I recreated the table with this code:

CREATE TABLE `temp.experiment` AS
SELECT "string1" s1, false b, 200.0 i1, "string2" s2, 0.2 f1, 2.345 f2, false b2, TIMESTAMP("2020-01-02 12:34:56") t1,
  [
    STRUCT ("repeated field str1" AS s1, CAST(2.01 AS FLOAT64) AS f2, CAST(201 as NUMERIC) AS n1, false AS b), 
    STRUCT ("repeated field str2", CAST(4.01 AS FLOAT64), CAST(702 as NUMERIC), true)
  ] AS b1

Now I can query particular nested rows like this:

SELECT s1, b, s2
 ,  b1[OFFSET(0)].s1 AS arr_s1, b1[OFFSET(0)].f2, b1[OFFSET(0)].n1
FROM  `temp.experiment`

You might want to UNNEST instead of [OFFSET(0)], but the question doesn't say what results you are expecting.