I am trying to run existing Spark (Scala) code on AWS Glue.
This code reads from a relational database with spark.read.format("jdbc"), and so far I have been putting the JDBC driver JAR on the Spark classpath with the spark.driver.extraClassPath option.
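To make it concrete, the read looks roughly like this (the Postgres driver and all connection details below are placeholders; the specific database shouldn't matter for the question):

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("jdbc-read").getOrCreate()

// Read a table over JDBC. The driver class named here has to be
// resolvable by Spark's classloader when the connection is opened.
val df = spark.read
  .format("jdbc")
  .option("url", "jdbc:postgresql://db.example.com:5432/mydb")
  .option("dbtable", "public.some_table")
  .option("driver", "org.postgresql.Driver")
  .option("user", "myuser")
  .option("password", "mypassword") // placeholder
  .load()
```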
This works fine both locally and on EMR, provided I first copy the driver JAR from S3 onto the instances with a bootstrap action.
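On EMR the setup is roughly the following sketch (bucket, key, and driver version are placeholders), with spark.driver.extraClassPath (and spark.executor.extraClassPath) then pointed at the local path via --conf or a spark-defaults classification:

```sh
#!/bin/bash
# EMR bootstrap action (sketch): copy the JDBC driver from S3 onto every
# node before Spark starts, so extraClassPath can reference a local file.
aws s3 cp s3://my-bucket/jars/postgresql-42.7.3.jar /home/hadoop/jars/
```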
But what's the equivalent on Glue? If I upload the driver JAR to S3 and add it under the job's "Dependent jars path" option (the --extra-jars parameter), it doesn't work: the read fails with a "No suitable driver" error, presumably because the JAR has to be visible to Spark's own classloader and not just to the job's.