1
votes

I want to write a PCollection to multiple BigQuery tables, with different tables based on the contents of the PCollection and different schemas, the contents of which arrive via a side input.

I noted this in the docs for DynamicDestinations:

An instance of DynamicDestinations can also use side inputs using sideInput(PCollectionView). The side inputs must be present in getSideInputs(). Side inputs are accessed in the global window, so they must be globally windowed.

How is this be practically implemented with v2.0.0 of the Apache Beam BigQueryIO API?

1

1 Answers

2
votes

Supposing you have a side input prepared like

// Must be globally windowed to work with BigQueryIO
PCollectionView<MyAuxData> myView = ...

then you would access it in your DynamicDestinations like this:

new DynamicDestinations<MyElement, MyDestination>() {

  @Override
  protected List<PCollectionView<?>> getSideInputs() {
    return ImmutableLIst.of(myView);
  }

  @Override
  public TableSchema getSchema(MyDestination dest) {
    MyAuxData = sideInput(myView);
    ...
  }

  ...
}

and so on.