I'm trying to do some data transformation inside of Big Query, with SQL.
Let's say I have three tables:
Customer - data about the customer, like age, etc Subscriptions - data about what subscriptions the user have Engagements - data about how the customers have interacted with digital products.
I'd like to collect this inside of one table, using nested fields.
I can join all these tables, but I'd like to aggregate them to arrays.
So, instead of three tables I get this:
id:123,
name:David,
age:30,
subscritions: [{
name:sub1
price:10
},
{
name:sub2
price:20
}],
engagment: [{
event:visited_product_x
time:2020-06-10
},
{
event:visited_product_y
time:2020-06-10
}]
Of course I've used array_agg in SELECT. And that works great, when adding only one table. However, when adding another one, I get duplicate rows, which I don't want. So, I guess I shouldn't use array_agg in SELECT, but rather somewhere else.
But whats the best way of solving this?