When attempting to extract data from a BigQuery table requiring a partition filter, the extract job fails.
Here is a simple example creating a table and running an extract job.
package com.example;
import com.google.cloud.bigquery.*;
public class BigQueryExtractTest {
private static final String PROJECT_ID = "my-project-id";
private static final String DATASET_ID = "test_dataset";
private static final String GCS_LOCATION = "gs://my-bucket/path/to/files/part-*";
public static void main(String[] args) throws Exception {
// create BigQuery client
BigQuery bigQuery = BigQueryOptions.newBuilder().setProjectId(PROJECT_ID).build().getService();
// create dataset and table that requires partition filter
bigQuery.create(DatasetInfo.of(DATASET_ID));
bigQuery.query(QueryJobConfiguration.of(
String.format("CREATE TABLE %s.table1 (\n", DATASET_ID) +
"stringColumn STRING,\n" +
"timeColumn TIMESTAMP\n" +
") PARTITION BY DATE(timeColumn)\n" +
"OPTIONS(\n" +
"require_partition_filter=true\n" +
")"));
// extract table
Job job = bigQuery.getTable(TableId.of(DATASET_ID, "table1"))
.extract("NEWLINE_DELIMITED_JSON", GCS_LOCATION)
.waitFor();
// throw exception on error
if (job != null && job.getStatus().getError() != null) {
throw new Exception(job.getStatus().getError().toString());
}
}
}
The code snippet above produces the following error
Exception in thread "main" java.lang.Exception: BigQueryError{reason=invalidQuery, location=query, message=Cannot query over table 'my-project-id.test_dataset.table1' without a filter that can be used for partition elimination}
at com.example.BigQueryExtractTest.main(BigQueryExtractTest.java:34)
The google-cloud-bigquery maven dependency used for this example is shown below.
<dependency>
<groupId>com.google.cloud</groupId>
<artifactId>google-cloud-bigquery</artifactId>
<version>1.23.0</version>
</dependency>
The example also threw the exception using the dependency version 0.34.0-beta
How can a partition filter be specified when running an extract job?
require partition filtersetting be toggled on/off? - JonSimbq update --time_partitioning_type=DAY --norequire_partition_filter <table name>. I think you need to specify--time_partitioning_fieldinstead if you're using column-based partitioning though. - Elliott Brossard