0
votes

The BigQuery Quotas and Limits documentation states: maximum number of tables referenced per query = 1,000, and maximum columns = 10,000.

But what is the maximum number of joins a query can handle? In the same vein, what is the optimal number of joins beyond which BQ's performance starts to degrade, even when more compute power is added?

Any benchmarks?

The target-state schema on BQ has lots of normalized tables that will not be materialized, as the data needs to be as current as possible. Behind the scenes, the query requests generated by a BI tool like Tableau will have a high number of table joins, in the range of 30 to 120 tables depending on the queries fired. Hence I wanted to know if there is an upper limit on table joins at which BQ starts to show performance degradation. – Mickaroo
The scenario you described seems to imply several smaller dimension tables joining with a few fact tables. If that is the case, each join slows the query down a bit (for sure), but I don't see why there would be a cliff where performance drops dramatically at some point. – Yun Zhang

1 Answer

0
votes

It really depends on your use case: the size of your data, the types of joins, etc. BQ is able to handle a reasonable number of joins, but you can push the limits.

If you are doing ad-hoc queries, it will be hard (but not impossible) to overload a query with joins.

If you are doing ETL/ELT, it is often helpful to materialize intermediate tables to reduce the number of joins per query.
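As a rough sketch of what materializing an intermediate table could look like (all dataset, table, and column names here are hypothetical, just to illustrate the pattern):

```sql
-- Hypothetical example: materialize a pre-aggregated intermediate table once,
-- so downstream queries join against one small table instead of re-joining
-- the raw tables every time.
CREATE OR REPLACE TABLE staging.order_totals AS
SELECT
  o.customer_id,
  SUM(li.quantity * li.unit_price) AS total_spent
FROM raw.orders AS o
JOIN raw.line_items AS li
  ON li.order_id = o.order_id
GROUP BY o.customer_id;
```

Downstream queries can then join `staging.order_totals` directly, trading some data freshness for fewer joins per query.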

If you are using BQ as a backend for a BI tool, you will probably want to pre-join and materialize tables to avoid joins at run-time.

Either way, I find it good practice to use CTEs to reduce data before joining whenever possible.
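For example, a query might filter each input down in a CTE first, then join the smaller intermediate results (again, table and column names here are made up for illustration):

```sql
-- Hypothetical example: reduce each side in a CTE before joining,
-- so the join operates on much smaller inputs.
WITH recent_orders AS (
  SELECT order_id, customer_id
  FROM raw.orders
  WHERE order_date >= DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY)
),
active_customers AS (
  SELECT customer_id, region
  FROM raw.customers
  WHERE status = 'active'
)
SELECT c.region, COUNT(*) AS order_count
FROM recent_orders AS r
JOIN active_customers AS c
  ON c.customer_id = r.customer_id
GROUP BY c.region;
```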