0
votes

I am having so much trouble with querying from Big Query's nested schema. I have the following fields.

enter image description here

I want to flatten the table and get something like this.

user | question_id | user_choices

123 | 1 | 1

123 | 1 | 2

123 | 1 | 3

123 | 1 | 4

From other resources, I got to a point where I can query from one of the records in the repeated columns. Such as the following:

SELECT user, dat.question_id FROM tablename, UNNEST(data) dat

It gives me this result. enter image description here

But when I do this, I get another repeated columns again.

SELECT user, dat.question_id, dat.user_choices FROM tablename, UNNEST(data) dat

enter image description here

Can anyone help me how to UNNEST this table properly so I can have flattened schema for all data items?

Thanks!

1

1 Answers

5
votes

Below is for BigQuery Standard SQL

#standardSQL
SELECT user, question_id, choice 
FROM `project.dataset.table`, 
  UNNEST(data) question, 
  UNNEST(user_choices) choice

You can test, play with above using dummy data from your question like below

#standardSQL
WITH `project.dataset.table` AS (
  SELECT 1 user, 
    [STRUCT<question_id INT64, user_choices ARRAY<INT64>>
      (1,[1,2,3]),
      (2,[2,5]),
      (3,[1,3])
    ] data UNION ALL
  SELECT 2 user, 
    [STRUCT<question_id INT64, user_choices ARRAY<INT64>>
      (1,[2,3]),
      (2,[4,5]),
      (3,[2,6])
    ] data
)
SELECT user, question_id, choice 
FROM `project.dataset.table`, 
  UNNEST(data) question, 
  UNNEST(user_choices) choice
ORDER BY user, question_id, choice    

with result

Row user    question_id choice   
1   1       1           1    
2   1       1           2    
3   1       1           3    
4   1       2           2    
5   1       2           5    
6   1       3           1    
7   1       3           3    
8   2       1           2    
9   2       1           3    
10  2       2           4    
11  2       2           5    
12  2       3           2    
13  2       3           6