I'm fairly new to BigQuery (3rd day of using it with no training), I'm just trying to get my head around nested fields etc.
I've looked at the following resources and used the personsdata example on the google bigquery docs link
https://cloud.google.com/bigquery/docs/data
I'd like to run the below query:
select *
from [dataset.tableid]
where fullname = 'John Doe'
If I run this, I get the following error:
Error: Cannot output multiple independently repeated fields at the same time. Found children_age and citiesLived_place
From reading the above articles this isn't possible because you need to flatten the results, which from what I can understand just duplicates all the none repeated variables i.e.
Fullname | age | gender | Children.name | children.age
John Doe | 22 | Male | John | 5
John Doe | 22 | Male | Jane | 7
One of the above articles suggests that you can still use the where statements by using the flatten function in bigquery:
select fullname,
age,
gender,
citiesLived.place
FROM (FLATTEN([dataset.tableId], children))
WHERE
(citiesLived.yearLived > 1995) AND
(children.age > 3)
GROUP BY fullName, age, gender, citiesLived.place
If I change this to:
select *
FROM (FLATTEN([dataset.tableId], children))
WHERE fullname = 'John Doe'
Then this works fine and gives me what I need however if I change to this:
select *
FROM (FLATTEN([dataset.tableId], citieslived))
WHERE fullname = 'John Doe'
Then I get the following error:
Error: Cannot output multiple independently repeated fields at the same time. Found children_age and citiesLived_yearsLived
Can someone explain why this will work flattening based on "Children" but not "CitiesLived" and how to know what variables to use within flatten with more complex datasets with multiple nested variables?
Thank you in advance