0
votes

I would like to optimize the queries I have in a BigQuery project.

Unhopefully, the code wasn't archived into a source control solution like Git.

Is there a way to construct a diagram of dependencies between all the tables and views? Or at least, given a view or a table, list views and scheduled queries having a reference to it?

Thank you.

1

1 Answers

1
votes

List of BigQuery views containing a specific table name in the view_definition can be queried from INFORMATION_SCHEMA.VIEWS:

SELECT *
FROM region-us.INFORMATION_SCHEMA.VIEWS
WHERE REGEXP_CONTAINS(view_definition, 'your_dataset_name.your_table_name')

-- Update: added dataset-wise if region-wise is not available (kudos to Michel)
SELECT *
FROM your_dataset_name.INFORMATION_SCHEMA.VIEWS
WHERE REGEXP_CONTAINS(view_definition, 'your_dataset_name.your_table_name')

List of BigQuery scheduled queries with their definitions can be downloaded with bq or API and then can be searched for your_dataset_name.your_table_name (for example with jq):

bq ls --transfer_config --transfer_location=US --format=prettyjson | jq '.[] | select(.params.query|test("your_dataset_name.your_table_name"))'