0
votes

I'm currently trying to create my own analytics panel with the data I get off of BigQuery. I have all my data coming in correctly and am able to query each table individually or every table in a specific dataset but not query every dataset within a project.

To put it simply, I want to query every single table I have within BigQuery at once. The tables within BigQuery are being populated by Firebase Analytics and are likely to change without notice, add or remove one.

I'm aware of the method where you JOIN each table within a query but the values are hard coded. I need a way where I can provide a wildcard and automatically query every table.

Any help at all would be great, thanks!

2

2 Answers

6
votes

Unfortunately, you can't write a single query that can query all tables in all datasets without knowing the dataset names beforehand.

However, if you can programatically construct your query, you can use BigQuery's datasets.list API to get all the dataset names, and then construct a query that will get all tables within those datasets using table wildcards as described above.

1
votes

Have you seen the documentation on wildcard tables? The example it gives is:

#standardSQL
SELECT
  max,
  ROUND((max-32)*5/9,1) celsius,
  mo,
  da,
  year
FROM
  `bigquery-public-data.noaa_gsod.gsod19*`
WHERE
  max != 9999.9 # code for missing data
  AND _TABLE_SUFFIX BETWEEN '29'
  AND '40'
ORDER BY
  max DESC;

Wildcard tables apply at the table level, though--not the dataset level--so you would still need to perform a union between the tables from all of your datasets, e.g.:

SELECT *
FROM (
  SELECT * FROM `first-dataset.*` UNION ALL
  SELECT * FROM `second-dataset.*` UNION ALL
  SELECT * FROM ...
);