Objective
I'm trying to write to Oracle's ADWC (basically an Oracle database) from a Spark application running on YARN. The only way to connect to this database is by using an Oracle Wallet file, which is essentially a Java keystore.
Problem
The problem arises when the JDBC driver tries to read the wallet from HDFS. If I include the `hdfs://` prefix, the parser in the JDBC driver throws an error; if I don't, it cannot find the file.
Previous Attempts
- including the directory in the connect string (both with and without the `hdfs://` prefix):

  `jdbc:oracle:thin:@luigi_low?TNS_ADMIN=/user/spark/wallet_LUIGI`

- passing the directory to the driver JVM via `spark.driver.extraJavaOptions`, with `-Doracle.net.tns_admin` and `-Doracle.net.wallet_location` (see the sketch after this list)
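For context, those two `-D` flags are roughly equivalent to setting the same system properties on the driver before opening the connection. A minimal sketch of that equivalence (the `ADMIN`/`password` credentials are placeholders of mine, not from the repo):

```scala
import java.sql.DriverManager

// Roughly the same effect as -Doracle.net.tns_admin / -Doracle.net.wallet_location,
// but set programmatically on the driver. The wallet directory here is the
// HDFS path from above, which is exactly the path the driver can't read.
System.setProperty("oracle.net.tns_admin", "/user/spark/wallet_LUIGI")
System.setProperty("oracle.net.wallet_location", "/user/spark/wallet_LUIGI")

// Placeholder credentials, for illustration only.
val conn = DriverManager.getConnection("jdbc:oracle:thin:@luigi_low", "ADMIN", "password")
```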
All the code is on GitHub; the error messages specifically are here: https://github.com/sblack4/kafka-scala-jdbc/blob/master/ERROR.md
I've got a working example of the same connection here: https://github.com/sblack4/scala-jdbc-adwc
Help me, Stack Overflow, you're my only hope.
If you need any more clarification, don't hesitate :)
Update (SparkFiles attempt)
The code for this attempt is on a separate branch of the same repository: https://github.com/sblack4/kafka-scala-jdbc/tree/sparkfiles
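In outline, the SparkFiles approach ships the wallet directory with the job and then resolves its driver-local path. A condensed sketch of what's on that branch (names may differ slightly from the actual code):

```scala
import org.apache.spark.SparkFiles
import org.apache.spark.sql.SparkSession

// Ship the wallet directory with the application, then point the Oracle
// driver at wherever Spark materialized it on the local filesystem.
val spark = SparkSession.builder().appName("kafka-scala-jdbc").getOrCreate()
spark.sparkContext.addFile("hdfs:///user/spark/wallet_LUIGI", recursive = true)

val localWallet = SparkFiles.get("wallet_LUIGI")
System.setProperty("oracle.net.tns_admin", localWallet)
System.setProperty("oracle.net.wallet_location", localWallet)
```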
This error message mystifies me, as it seems my JDBC library has stopped trying to read the wallet files entirely. It may be unrelated to the previous problem:
Exception in thread "main" java.sql.SQLRecoverableException: IO Error: Invalid connection string format, a valid format is: "host:port:sid"
I've deleted the other JDBC libraries from my classpath through Ambari, since this error could be caused by Spark picking up an older version of the JDBC library.
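As a quick sanity check (not in the repo), something like this prints which jar the driver class was actually loaded from, to rule out a stale `ojdbc` jar that Ambari didn't remove:

```scala
// Where did oracle.jdbc.OracleDriver actually come from, and what version?
val driverClass = Class.forName("oracle.jdbc.OracleDriver")
println(driverClass.getProtectionDomain.getCodeSource.getLocation)
println(Option(driverClass.getPackage.getImplementationVersion).getOrElse("unknown"))
```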
Comments
- So, why don't you use Spark to download the wallet to a local (temp) file (or maybe open an InputStream on HDFS?) and pass that to the driver? - Samson Scharfrichter
- Good idea, but I'm not sure how I'd go about downloading the wallet locally on the driver. That sounds fairly simple if I can then pass that driver-local path to the JDBC in the connection string. - Steven Black
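For future readers, a rough sketch of that suggestion (untested; uses the Hadoop FileSystem API, with the same paths as above):

```scala
import java.nio.file.Files
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

// Copy the wallet directory from HDFS to a local temp dir on the driver,
// then hand that local path to the Oracle JDBC driver.
val fs = FileSystem.get(new Configuration())
val localDir = Files.createTempDirectory("wallet").toFile

// Since localDir already exists, the wallet lands at localDir/wallet_LUIGI.
fs.copyToLocalFile(false, new Path("/user/spark/wallet_LUIGI"),
  new Path(localDir.getAbsolutePath))

val walletPath = s"${localDir.getAbsolutePath}/wallet_LUIGI"
System.setProperty("oracle.net.tns_admin", walletPath)
System.setProperty("oracle.net.wallet_location", walletPath)
```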