
My objective is to read data from a BigQuery table and write it to an Avro file on Cloud Storage using Java. It would be great if someone could provide a code snippet/ideas for reading BigQuery table data and writing it in Avro format using Cloud Dataflow.

1 Answer


It is possible to export data from BigQuery to GCS in Avro format as a one-time export. This can be done through the client libraries, including Java. Here is a snippet (the full example can be found on GitHub):

// `table` is a com.google.cloud.bigquery.Table obtained from the BigQuery client;
// `format` is the export format ("AVRO" here) and `gcsUrl` the destination URI.
Job job = table.extract(format, gcsUrl);
// Wait for the extract job to complete, with bounded retries
try {
  Job completedJob =
      job.waitFor(
          RetryOption.initialRetryDelay(Duration.ofSeconds(1)),
          RetryOption.totalTimeout(Duration.ofMinutes(3)));
  if (completedJob != null && completedJob.getStatus().getError() == null) {
    // Job completed successfully
  } else {
    // Handle error case (a null result means the job no longer exists)
  }
} catch (InterruptedException e) {
  // Handle interrupted wait
}

The format variable can be "CSV", "JSON" or "AVRO", and the gcsUrl variable should contain the bucket and path to the file, e.g. gs://my_bucket/filename.
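To show how the snippet fits together, here is a minimal end-to-end sketch. It assumes the google-cloud-bigquery client library is on the classpath and that credentials and the project are picked up from the environment; the dataset, table, and bucket names are placeholders you would replace with your own. Note that `RetryOption` and `Duration` come from the client's dependencies (`com.google.cloud.RetryOption` and the threeten `Duration`).

```java
import com.google.cloud.RetryOption;
import com.google.cloud.bigquery.BigQuery;
import com.google.cloud.bigquery.BigQueryOptions;
import com.google.cloud.bigquery.Job;
import com.google.cloud.bigquery.Table;
import com.google.cloud.bigquery.TableId;
import org.threeten.bp.Duration;

public class ExportTableToAvro {
  public static void main(String[] args) throws InterruptedException {
    // Placeholder names: replace with your own dataset, table, and bucket
    String datasetName = "my_dataset";
    String tableName = "my_table";
    String gcsUrl = "gs://my_bucket/my_table.avro";

    // The client reads the project ID and credentials from the environment
    BigQuery bigquery = BigQueryOptions.getDefaultInstance().getService();
    Table table = bigquery.getTable(TableId.of(datasetName, tableName));

    // Start an extract job in AVRO format and wait for it to finish
    Job job = table.extract("AVRO", gcsUrl);
    Job completedJob =
        job.waitFor(
            RetryOption.initialRetryDelay(Duration.ofSeconds(1)),
            RetryOption.totalTimeout(Duration.ofMinutes(3)));

    if (completedJob == null) {
      System.out.println("Job no longer exists");
    } else if (completedJob.getStatus().getError() != null) {
      System.out.println("Export failed: " + completedJob.getStatus().getError());
    } else {
      System.out.println("Export completed: " + gcsUrl);
    }
  }
}
```

If you specifically need Cloud Dataflow (for transformations rather than a plain export), the Apache Beam equivalent would read with BigQueryIO and write with AvroIO, but for a straight table-to-Avro copy the extract job above is simpler and free of Dataflow costs.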