I'm looking into moving my analytics system over to BigQuery, and I was wondering if someone could give me some pointers on the best way to structure my data.
I am logging analytics events that happen on a website. Each event type has to be stored in its own table, since the different event types don't share the same fields.
I will be partitioning the tables by day and running a query at midnight every day over that day's data to generate total counts, which are then stored elsewhere so the analytics platform can query them faster.
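To illustrate, this is roughly the kind of schema and nightly rollup I have in mind (table and column names like event_timestamp are just placeholders):

CREATE TABLE company_123.user_click_website_x (
  event_timestamp TIMESTAMP,
  user_id STRING,
  page_url STRING
)
PARTITION BY DATE(event_timestamp);

-- Midnight rollup over the previous day's partition only
SELECT
  DATE(event_timestamp) AS day,
  COUNT(*) AS total_clicks
FROM company_123.user_click_website_x
WHERE DATE(event_timestamp) = DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY)
GROUP BY day;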
I create a dataset for each company, named company_123, company_124, etc.
Within each dataset are the analytics tables for all the websites that company manages, with table names such as user_click_event_website_id, user_page_load_website_id, etc.
I need to be able to query both 'all user clicks on website X' and 'all user clicks across all their websites'.
My question is: in terms of cost and query time, would it be better to (a) split my tables by website_id and run

select * from user_click_website_x (one website)
select * from user_click_* (all websites)

or (b) put everything in one table with a website_id column and run

select * from user_click where website_id = 'website_x' (one website)
select * from user_click (all websites)
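To make the two options concrete, here is roughly how I'd write the nightly count in each case (again, names like event_timestamp and website_x are placeholders, and I'm assuming option (a) would use the _TABLE_SUFFIX pseudo-column to target a single website):

-- Option (a): one table per website, queried via a wildcard
SELECT COUNT(*) AS total_clicks
FROM `company_123.user_click_*`
WHERE _TABLE_SUFFIX = 'website_x'  -- drop this filter for all websites
  AND DATE(event_timestamp) = DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY);

-- Option (b): one table for all websites, filtered on a website_id column
SELECT COUNT(*) AS total_clicks
FROM `company_123.user_click`
WHERE website_id = 'website_x'  -- drop this filter for all websites
  AND DATE(event_timestamp) = DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY);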
Thanks