I have the following schema in MySQL (simplified for this question; in reality it contains more tables than shown here):
User: id, email, first_name, last_name, gender, birthday
and 30 more such columns
Account: id, user_id, total_visits, total_credits, total_redemptions, total_debits, points, initial_credit, initial_debit
and 20 more such columns
Checkin: id, user_id, location_id, approved, amount, number, checkin_date, status, qr_code, barcode, points_earned
and 30 more such columns.
Here:
- id: primary key (integer)
- table_id: foreign key; for example, user_id in Account points to the id column in the User table.
To import this, the advice in the docs is:
In BigQuery, you typically want to denormalize the data structure in order to enable super-fast querying. While JOINs on small datasets are possible with BigQuery, they're not as performant as a denormalized structure. Some type of normalization is possible with the nested/repeated functionality.
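As far as I understand, the nested/repeated approach the docs mention would look something like this in BigQuery Standard SQL DDL (the dataset name is a placeholder and the column lists are trimmed):

```sql
-- Sketch of a denormalized BigQuery schema using nested/repeated fields.
-- `mydataset` is a placeholder; column lists are trimmed for brevity.
CREATE TABLE mydataset.users (
  id INT64,
  email STRING,
  first_name STRING,
  last_name STRING,
  -- ... remaining User columns ...
  account STRUCT<
    total_visits INT64,
    total_credits INT64,
    points INT64
    -- ... remaining Account columns ...
  >,
  checkins ARRAY<STRUCT<
    location_id INT64,
    approved BOOL,
    checkin_date DATE,
    points_earned INT64
    -- ... remaining Checkin columns ...
  >>
);
```

Here Account (1:1 with User) becomes a single nested STRUCT and Checkin (1:many) becomes a repeated ARRAY of STRUCTs, so the user_id foreign keys disappear entirely. Is that the right reading?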
If I understand this correctly, does that mean:
- there will be only one table, User, with 100+ columns (holding the data from all these tables: Account, Checkin, etc.), or
- there will be one User table and one Events table; the User table will have pretty much the same schema it currently has in MySQL, while the Events table will store the actual Checkin and Account data, or
- some other type of schema?
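For context on why I'm unsure: if the Checkin rows were nested into a repeated field (say, checkins) on a denormalized users table, my understanding is that a query would use UNNEST instead of a JOIN, something like this (table and field names are hypothetical):

```sql
-- Hypothetical query against a denormalized table where each user row
-- carries a repeated `checkins` field (an ARRAY of STRUCTs).
SELECT
  u.id,
  u.email,
  c.location_id,
  c.points_earned
FROM mydataset.users AS u,
     UNNEST(u.checkins) AS c   -- flattens the repeated checkins per user
WHERE c.approved = TRUE;
```

Is that the intended query pattern, or does one of the other schema options above fit better?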
Also, where can I find more resources that go in depth on denormalizing MySQL tables for BigQuery?