2 votes

I am trying to connect to Teradata and DB2 from PySpark.

I am using the following jars:

tdgssconfig-15.10.00.14.jar
teradata-connector-1.4.1.jar
terajdbc4-15.10.00.14.jar
db2jcc4.jar

Connection strings:

df1 = sqlContext.load(source="jdbc", driver="com.teradata.jdbc.TeraDriver", url=db_url, user="db_user", TMODE="TERA", password="db_pwd", dbtable="U114473.EMPLOYEE")

df = sqlContext.read.format('jdbc').options(url='jdbc:db2://10.123.321.9:50000/DB599641', user='******', password='*****', driver='com.ibm.db2.jcc.DB2Driver', dbtable='DSN1.EMPLOYEE').load()

Both give me a "Driver not found" error.

Can we use JDBC drivers with PySpark?

1
When you say "I am using the below jars", do you mean you're starting your session using something like `pyspark --jars /path/to/jar1,/path/to/jar2`? – James Tobin

1 Answer

0 votes

As James Tobin said, use the `pyspark2 --jars /jarpath` option when you start your pyspark session, or when you submit your .py script to Spark.
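A sketch of both launch styles, using the jar names from the question (the `/path/to/jars` directory is an assumption; point it at wherever your jars actually live):

```shell
# Interactive session: put the JDBC driver jars on the driver and executor classpath.
pyspark --jars /path/to/jars/terajdbc4-15.10.00.14.jar,/path/to/jars/tdgssconfig-15.10.00.14.jar,/path/to/jars/db2jcc4.jar

# Batch submit: ship the same comma-separated jar list along with your script.
spark-submit --jars /path/to/jars/terajdbc4-15.10.00.14.jar,/path/to/jars/tdgssconfig-15.10.00.14.jar,/path/to/jars/db2jcc4.jar my_script.py
```

Once the jars are on the classpath, `com.teradata.jdbc.TeraDriver` and `com.ibm.db2.jcc.DB2Driver` become resolvable and the "Driver not found" error should go away. Note the jar list is comma-separated, not colon-separated.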