2 votes

The easiest way to describe what I'm doing is to point to this tutorial: Import a CSV file into a Cloud Bigtable table. In the section where they start the Dataflow job, however, they use Java:

mvn package exec:exec \
    -DCsvImport \
    -Dbigtable.projectID=YOUR_PROJECT_ID \
    -Dbigtable.instanceID=YOUR_INSTANCE_ID \
    -Dbigtable.table="YOUR_TABLE_ID" \
    -DinputFile="YOUR_FILE" \
    -Dheaders="YOUR_HEADERS"

Is there a way to do this particular step in Python? The closest I could find was the apache_beam.examples.wordcount example here, but ultimately I'd like to see some code where I can add my own customization to the Dataflow job using Python.
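
For context, that wordcount example is launched against Dataflow along these lines (project, bucket, and output paths are placeholders):

    python -m apache_beam.examples.wordcount \
        --runner DataflowRunner \
        --project YOUR_PROJECT_ID \
        --temp_location gs://YOUR_BUCKET/tmp/ \
        --input gs://dataflow-samples/shakespeare/kinglear.txt \
        --output gs://YOUR_BUCKET/output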


3 Answers

3 votes

There is a connector for writing to Cloud Bigtable, which you can use as a starting point for importing CSV files.
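
A minimal sketch of that approach, assuming the Beam Python SDK's apache_beam.io.gcp.bigtableio.WriteToBigTable transform is available and the destination table's column family already exists (all project, instance, table, bucket, header, and column-family names below are placeholders):

    import datetime

    import apache_beam as beam
    from apache_beam.io.gcp.bigtableio import WriteToBigTable
    from apache_beam.options.pipeline_options import PipelineOptions
    from google.cloud.bigtable.row import DirectRow

    # Placeholders -- substitute your own values.
    PROJECT_ID = 'YOUR_PROJECT_ID'
    INSTANCE_ID = 'YOUR_INSTANCE_ID'
    TABLE_ID = 'YOUR_TABLE_ID'
    HEADERS = ['id', 'name', 'score']  # assumed CSV columns; the first is the row key
    COLUMN_FAMILY = 'csv'              # assumed column family; must already exist

    def csv_line_to_row(line):
        """Turn one CSV line into a Bigtable DirectRow (naive split, no quoting)."""
        values = line.split(',')
        row = DirectRow(row_key=values[0].encode('utf-8'))
        for header, value in zip(HEADERS[1:], values[1:]):
            row.set_cell(COLUMN_FAMILY,
                         header.encode('utf-8'),
                         value.encode('utf-8'),
                         timestamp=datetime.datetime.utcnow())
        return row

    options = PipelineOptions()  # pass --runner=DataflowRunner etc. on the command line
    with beam.Pipeline(options=options) as p:
        (p
         | 'ReadCsv' >> beam.io.ReadFromText('gs://YOUR_BUCKET/YOUR_FILE.csv',
                                             skip_header_lines=1)
         | 'ToDirectRow' >> beam.Map(csv_line_to_row)
         | 'WriteToBigtable' >> WriteToBigTable(project_id=PROJECT_ID,
                                                instance_id=INSTANCE_ID,
                                                table_id=TABLE_ID))

This is also where the customization hook is: anything you can express as a Map or ParDo between the read and the write runs inside the Dataflow job.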

0 votes

Google Dataflow does not have a built-in Python connector for Cloud Bigtable.

Here is a link to the Apache Beam connectors for both Java and Python:

Built-in I/O Transforms

-3 votes

I'd suggest doing something like this.

DataFrame.to_gbq(destination_table, project_id, chunksize=10000, verbose=True, reauth=False, if_exists='fail', private_key=None)

You will find all parameters, and explanations of each, in the link below.

https://pandas.pydata.org/pandas-docs/version/0.21/generated/pandas.DataFrame.to_gbq.html#pandas.DataFrame.to_gbq
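
For example, a minimal call could look like this (note that to_gbq writes to BigQuery rather than Bigtable; the dataset and table names are placeholders, and the pandas-gbq package must be installed):

    import pandas as pd

    # A small DataFrame to upload; 'my_dataset.my_table' is a placeholder.
    df = pd.DataFrame({'name': ['a', 'b'], 'score': [1, 2]})
    df.to_gbq('my_dataset.my_table',
              project_id='YOUR_PROJECT_ID',
              if_exists='fail')  # fail rather than overwrite if the table exists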