I'm using the Apache Beam SDK, version 0.2.0-incubating-SNAPSHOT, and trying to write data to Bigtable with the Dataflow runner. Unfortunately, I get a NullPointerException when I execute my Dataflow pipeline, which uses BigtableIO.Write as its sink. I have already checked my BigtableOptions, and the parameters are what I need.
Basically, I create the BigtableOptions builder, and at some point in my pipeline a step writes the PCollection<KV<ByteString, Iterable<Mutation>>> to the target Bigtable table:
    final BigtableOptions.Builder optionsBuilder =
        new BigtableOptions.Builder()
            .setProjectId(System.getProperty("PROJECT_ID"))
            .setInstanceId(System.getProperty("BT_INSTANCE_ID"));

    // do intermediary steps and create the PCollection to write to Bigtable
    // modifiedHits is a PCollection<KV<ByteString, Iterable<Mutation>>>
    modifiedHits.apply("writting to big table", BigtableIO.write()
        .withBigtableOptions(optionsBuilder)
        .withTableId(System.getProperty("BT_TABLENAME")));

    p.run();
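One thing that can be ruled out cheaply: System.getProperty returns null when a property is not set, so the project, instance and table IDs above could silently be null at construction time. Below is a minimal sketch of a fail-fast check. The class PipelineProperties and the helper requireProperty are hypothetical names introduced here for illustration and are not part of Beam. This is only a sanity check and may well be unrelated to the exception described next.

```java
public class PipelineProperties {

    /**
     * Hypothetical helper (not a Beam API): returns the named system property,
     * or throws with a descriptive message if it is unset or empty.
     */
    public static String requireProperty(String name) {
        String value = System.getProperty(name);
        if (value == null || value.isEmpty()) {
            throw new IllegalStateException(
                "Missing required system property: " + name
                    + " (pass it with -D" + name + "=...)");
        }
        return value;
    }

    public static void main(String[] args) {
        // The property names are the ones used in the snippet above.
        String[] names = {"PROJECT_ID", "BT_INSTANCE_ID", "BT_TABLENAME"};
        for (String name : names) {
            try {
                System.out.println(name + " = " + requireProperty(name));
            } catch (IllegalStateException e) {
                System.out.println(e.getMessage());
            }
        }
    }
}
```

With such a helper, setProjectId(requireProperty("PROJECT_ID")) would fail before the pipeline is submitted rather than on the workers.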
When I execute the pipeline, I get a NullPointerException that points to the public void processElement(ProcessContext c) method of the BigtableIO class:
(6e0ccd8407eed08b): java.lang.NullPointerException at org.apache.beam.sdk.io.gcp.bigtable.BigtableIO$Write$BigtableWriterFn.processElement(BigtableIO.java:532)
I checked that this method processes every element before it is written to Bigtable, but I'm not sure why I get this exception every time I execute the pipeline. According to the code below, the method uses the bigtableWriter attribute to process each c.element(), but I can't even set a breakpoint to find out exactly what is null. Any advice or suggestions on how to solve this issue?
    @ProcessElement
    public void processElement(ProcessContext c) throws Exception {
      checkForFailures();
      Futures.addCallback(
          bigtableWriter.writeRecord(c.element()), new WriteExceptionCallback(c.element()));
      ++recordsWritten;
    }
Thanks.