I'm using the Python SDK for Apache Beam and am trying to read from BigQuery, passing the source table as a ValueProvider (the documentation states that this is allowed).

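import apache_beam as beam
from apache_beam.io import ReadFromBigQuery
from apache_beam.options.pipeline_options import PipelineOptions, SetupOptions
from apache_beam.options.value_provider import ValueProvider


def run(bq_source_table: ValueProvider,
        pipeline_options=None):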

    pipeline_options.view_as(SetupOptions).setup_file = "./setup.py"

    with beam.Pipeline(options=pipeline_options) as pipeline:
        (
            pipeline
            | "Read from BigQuery" >> ReadFromBigQuery(table=bq_source_table)
        )

The options are declared as follows:

class CPipelineOptions(PipelineOptions):

    @classmethod
    def _add_argparse_args(cls, parser):
        parser.add_value_provider_argument(
            "--bq_source_table",
            help="The BigQuery source table name.\n"
                 '"<project>:<dataset>.<table>".'
        )

Executing the pipeline yields the error below:

AttributeError: 'StaticValueProvider' object has no attribute 'projectId' [while running 'Read from BigQuery/Read/SDFBoundedSourceReader/ParDo(SDFBoundedSourceDoFn)/SplitAndSizeRestriction']
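
One plausible reading of this message is that the source reads `.projectId` directly off the provider object instead of first calling `.get()` and parsing the result. The sketch below reproduces that shape in plain Python. `StaticValueProvider`, `TableReference` and `parse_table_spec` here are hypothetical stand-ins written for illustration, not Beam's own classes, and the table name is made up.

```python
from dataclasses import dataclass


class StaticValueProvider:
    """Hypothetical stand-in: wraps a value that is only exposed via .get()."""

    def __init__(self, value):
        self.value = value

    def get(self):
        return self.value


@dataclass
class TableReference:
    """Hypothetical stand-in for a parsed BigQuery table reference."""
    projectId: str
    datasetId: str
    tableId: str


def parse_table_spec(spec: str) -> TableReference:
    """Split '<project>:<dataset>.<table>' into its three parts."""
    project, _, rest = spec.partition(":")
    dataset, _, table = rest.partition(".")
    return TableReference(project, dataset, table)


# Made-up table spec, used only for this illustration.
provider = StaticValueProvider("my-project:my_dataset.my_table")

try:
    # Treating the wrapper as if it were already a table reference fails.
    provider.projectId
except AttributeError as exc:
    error_message = str(exc)

# Resolving the provider first, then parsing, gives the expected fields.
resolved = parse_table_spec(provider.get())
```

If this reading is right, the fix belongs inside the source itself (resolve, then parse), which would explain why the `query` path can work while the `table` path does not.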

Any suggestions on how to resolve this? I do not want to use Flex Templates.


EDIT: It's worth mentioning that the query parameter does support a ValueProvider. Could this be a bug?
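
Since the `query` parameter reportedly works with a ValueProvider, one possible workaround (short of Flex Templates) is to read through a query instead of a table. Below is a minimal sketch of the translation step only, turning a `<project>:<dataset>.<table>` spec into a standard SQL query. `table_spec_to_query` is a hypothetical helper, not part of Beam, and it does not handle domain-scoped project IDs that themselves contain a colon.

```python
def table_spec_to_query(table_spec: str) -> str:
    """Turn '<project>:<dataset>.<table>' into a standard SQL SELECT.

    Hypothetical helper for illustration; not part of the Beam SDK.
    """
    project, colon, rest = table_spec.partition(":")
    dataset, dot, table = rest.partition(".")
    if not (colon and dot and project and dataset and table):
        raise ValueError(
            f"Expected '<project>:<dataset>.<table>', got {table_spec!r}"
        )
    # Backtick-quoted names are standard SQL syntax, so the read would
    # need to run with standard SQL enabled rather than legacy SQL.
    return f"SELECT * FROM `{project}.{dataset}.{table}`"
```

In a classic template the table name is only known at run time, so this translation would have to be applied lazily, for example by wrapping the table provider in `NestedValueProvider` from `apache_beam.options.value_provider` (if your SDK version ships it) and passing the result as `query=` together with `use_standard_sql=True`. That wiring is an assumption and has not been tested here.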

1 Answer

My only suggestion would be to use Flex Templates.