0
votes

I'm using BigQuery. I have a query-scheduler to generate a table (RESULT TABLE) that depends on another table (SOURCE TABLE). The case is, this source table doesn't always have data, there's a possibility that this source table is empty.

I want to Schedule the Query to make the RESULT TABLE only if there's data in SOURCE TABLE.

The example would be:

IF COUNT(1) FROM data.source_table > 0 THEN RUN:

SELECT *
FROM data.source_table
LEFT JOIN data.other_source_table

ELSE [Don't Run]

Thanks in Advance

2
If the answer helped you, consider accepting :) - rmesteves

2 Answers

1
votes

The syntax is

IF condition THEN [sql_statement_list]
[ELSEIF condition THEN sql_statement_list]
[ELSEIF condition THEN sql_statement_list]...
[ELSE  sql_statement_list]
END IF;

So for your case it's

IF COUNT(1) FROM data.source_table > 0 
THEN 

SELECT *
FROM data.source_table
LEFT JOIN data.other_source_table;

END IF;

For more details, you can read https://cloud.google.com/bigquery/docs/reference/standard-sql/scripting#if

1
votes

At the moment you can't set a destination table when using BigQuery Scripting. It means that solutions based on IF statement will not work for your case. Besides that, it seems that when you set a destination table, BigQuery creates the table before your query's execution, which means that independently of the results, the table will be created.

The query below is only SQL. In other words, it doesn't contains scripting. If you use it to create a scheduled query and set a destination table, you will see that even when the sub query is not run an empty table will be created.

SELECT
  *
FROM
  UNNEST( 
    (SELECT
      (
        CASE (SELECT COUNT(1) FROM data.source_table) > 0
        WHEN TRUE 
        THEN ( 
             SELECT ARRAY( 
                           SELECT AS STRUCT *
                           FROM data.source_table
                           LEFT JOIN data.other_source_table)
        )
        END
      )
   )
)

As a workaround, you could keep your existing scheduled query and create another scheduled query just like below to run some minutes after the first one:

IF (SELECT count(1) FROM `dataset.destination_table`) = 0
THEN DROP TABLE `dataset.destination_table`;
END IF

To summarize, your solution would be:

  1. Run a scheduled query that will create a destination table,
  2. A few minutes later, run a scheduled query that will check if the created table is empty. If so, the table will be deleted.

I hope it helps