
I found this Bigtable with Dataflow example https://github.com/GoogleCloudPlatform/cloud-bigtable-examples/blob/master/java/dataflow-connector-examples/src/main/java/com/google/cloud/bigtable/dataflow/example/HelloWorldWrite.java

However, it uses

beam-runners-google-cloud-dataflow-java 2.4.0

and in 2.9.0, org.apache.beam.runners.dataflow.options.DataflowPipelineOptions is no longer there.

Is there an up-to-date example of writing to Bigtable from Dataflow?

I found https://beam.apache.org/releases/javadoc/2.0.0/org/apache/beam/sdk/io/gcp/bigtable/BigtableIO.html - is that on the right track?

Why not modify the example to support the latest SDK? In the process you will learn a few good details about how to write Dataflow programs. The changes required are minor. – John Hanley
I'll update the examples. – Solomon Duskis

1 Answer


I have used the Bigtable connector example you link to, following the instructions here, with Dataflow Java SDK 2.9.0, and it works fine. The only extra step needed is to change the Beam SDK version in the pom.xml file:

Replace <beam.version>2.4.0</beam.version> with <beam.version>2.9.0</beam.version>.
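
After the change, the relevant property in pom.xml looks like this (only the version value changes; everything else in the file stays as it is):

<properties>
  <beam.version>2.9.0</beam.version>
  <!-- other properties unchanged -->
</properties>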

The Dataflow job will start (you'll see Dataflow SDK version: 2.9.0 in the standard output). Once it succeeds, you can verify in the HBase shell that the correct rows were written:

hbase(main):001:0> scan 'Dataflow_test'
ROW                                                                              COLUMN+CELL
 Hello                                                                           column=cf:qualifier, timestamp=1548151071821, value=value_21.60451762361535
 World                                                                           column=cf:qualifier, timestamp=1548151064955, value=value_21.60451762361535
2 row(s) in 1.4230 seconds
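
As for BigtableIO: yes, that is also on the right track. It is the Beam-native connector and should work with 2.9.0 as well (it lives in the beam-sdks-java-io-google-cloud-platform artifact). Here is a minimal sketch of a write pipeline using it; the project, instance and table IDs are placeholders, and the single SetCell mutation just mirrors the cf:qualifier cell the example writes, so treat it as a starting point rather than a drop-in replacement:

import com.google.bigtable.v2.Mutation;
import com.google.protobuf.ByteString;
import java.util.Collections;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.gcp.bigtable.BigtableIO;
import org.apache.beam.sdk.options.PipelineOptions;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.transforms.Create;
import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.transforms.ParDo;
import org.apache.beam.sdk.values.KV;

public class BigtableIOWriteSketch {
  public static void main(String[] args) {
    // Pass --runner=DataflowRunner plus the usual --project/--tempLocation flags.
    PipelineOptions options = PipelineOptionsFactory.fromArgs(args).withValidation().create();
    Pipeline p = Pipeline.create(options);

    p.apply(Create.of("Hello", "World"))
        // BigtableIO.write() expects KV<ByteString, Iterable<Mutation>>:
        // the row key plus the mutations to apply to that row.
        .apply(ParDo.of(new DoFn<String, KV<ByteString, Iterable<Mutation>>>() {
          @ProcessElement
          public void processElement(ProcessContext c) {
            Mutation setCell = Mutation.newBuilder()
                .setSetCell(Mutation.SetCell.newBuilder()
                    .setFamilyName("cf")
                    .setColumnQualifier(ByteString.copyFromUtf8("qualifier"))
                    .setValue(ByteString.copyFromUtf8("value_" + Math.random())))
                .build();
            c.output(KV.of(ByteString.copyFromUtf8(c.element()),
                Collections.singletonList(setCell)));
          }
        }))
        .apply(BigtableIO.write()
            .withProjectId("my-project")     // placeholder
            .withInstanceId("my-instance")   // placeholder
            .withTableId("Dataflow_test"));

    p.run().waitUntilFinish();
  }
}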