0
votes

We are planning to load data from Cloud SQL into BigQuery using Cloud Dataflow. When I tried to run the code linked below, I got an error saying "Driver class name is not provided".

https://github.com/GoogleCloudPlatform/DataflowTemplates/blob/master/src/main/java/com/google/cloud/teleport/templates/JdbcToBigQuery.java

Does anybody know how to pass the value that this line of code reads, "options.getDriverClassName()"?

1
Your question is what to set in the DriverClassName option? - Panciz
Do you perform a lot of transformation on the data between JDBC and BigQuery? What is your Cloud SQL database engine? - guillaume blaquiere

1 Answer

1
votes

@Panciz, @guillaume blaquiere

I found the solution to this myself. See the link below from Google:

https://cloud.google.com/dataflow/docs/guides/templates/provided-batch#java-database-connectivity-jdbc-to-bigquery

The parameters listed on that page need to be passed to the template. Since I was running the "JdbcToBigQuery" Dataflow template from IntelliJ, I passed them as Program Arguments, as shown below, and it worked.

--project=<google cloud project name>
--stagingLocation=gs://<location>
--gcpTempLocation=gs://<location>
--serviceAccount=<service account for dataflow>
--runner=DirectRunner
--driverJars=gs://<location>/postgres-socket-factory-1.0.15-jar-with-dependencies.jar
--bigQueryLoadingTemporaryDirectory=gs://<location>
--driverClassName=org.postgresql.Driver
--connectionURL=jdbc:postgresql://google/<your google cloud postgres db name>?cloudSqlInstance=<your google cloud project name>:europe-west1:<your google cloud postgres instance name>&socketFactory=com.google.cloud.sql.postgres.SocketFactory&useSSL=false
--username=<your username>
--password=<your password>
--query="<your sql query>"
--outputTable=<your google cloud project name>:<your google cloud dataset name>.<your google cloud table name>
--connectionProperties=unicode=true&characterEncoding=UTF-8
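
As a rough illustration of the arguments above, here is a small sketch in plain Java that assembles the connection URL in the same shape and checks that an option such as driverClassName is present before the pipeline is launched. The class JdbcToBigQueryArgs and its methods connectionUrl and firstMissing are hypothetical helpers written for this example; they are not part of the Dataflow template or of Apache Beam, and the placeholder values are made up.

```java
public class JdbcToBigQueryArgs {

    // Hypothetical helper: builds a Cloud SQL PostgreSQL JDBC URL in the
    // same shape as the --connectionURL argument shown in the answer.
    public static String connectionUrl(String database, String project,
                                       String region, String instance) {
        return "jdbc:postgresql://google/" + database
                + "?cloudSqlInstance=" + project + ":" + region + ":" + instance
                + "&socketFactory=com.google.cloud.sql.postgres.SocketFactory"
                + "&useSSL=false";
    }

    // Hypothetical helper: returns the first required option that is absent
    // (or has an empty value) in args, or null when all are present.
    // Options are expected in the form --name=value.
    public static String firstMissing(String[] args, String... required) {
        for (String name : required) {
            String prefix = "--" + name + "=";
            boolean found = false;
            for (String arg : args) {
                if (arg.startsWith(prefix) && arg.length() > prefix.length()) {
                    found = true;
                    break;
                }
            }
            if (!found) {
                return name;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Placeholder values, assumed for illustration only.
        String url = connectionUrl("mydb", "my-project", "europe-west1", "my-instance");
        String[] programArgs = {
            "--driverClassName=org.postgresql.Driver",
            "--connectionURL=" + url,
            "--username=myuser"
        };

        String missing = firstMissing(programArgs,
                "driverClassName", "connectionURL", "username", "password");
        if (missing != null) {
            // Here --password was left out on purpose, so this branch runs.
            System.out.println("Missing option: --" + missing);
        } else {
            System.out.println("All required options are present.");
        }
    }
}
```

A check like this only catches a missing Program Argument early; the template itself still reads the values through its own options interface, which is where the "options.getDriverClassName()" call from the question comes from.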