
I am trying to connect to Vertica from Spark. Below is my code:

val opt = Map("host" -> host, "table" -> table, "db" -> db, "numPartitions" -> partitions, "user" -> user, "password" -> pswd)
val df1 = sqlContext.read.format("com.vertica.spark.datasource.DefaultSource").options(opt).load()
df1.show()

I am getting the below error:

Exception in thread "main" java.lang.ClassNotFoundException: Failed to find data source: com.vertica.spark.datasource.DefaultSource. Please find packages at http://spark-packages.org
    at org.apache.spark.sql.execution.datasources.ResolvedDataSource$.lookupDataSource(ResolvedDataSource.scala:77)
    at org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:102)
    at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:119)
    at com.ConnectToVertica$.main(ConnectToVertica.scala:32)
    at com.ConnectToVertica.main(ConnectToVertica.scala)

I also checked the packages site (http://spark-packages.org) indicated in the error, but did not find any package for Vertica. If I execute the same code with spark-submit, passing the jars for Vertica, it works fine, but running it directly from the IDE gives me this error. I tried with Spark 1.6.2 as well and got the same error.


1 Answer


It looks like you have not added the jar file to your classpath. Download the jar from the URL below, add it to the classpath, and try again.

https://www.qzhou.com.cn/user/bdy/3477137749 

I was facing the same issue some time back, when I was unable to find the HPE Spark connector. Hope this helps.
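The reason spark-submit works but the IDE does not is that `--jars` puts the connector on the driver's classpath, while the IDE only sees the project's own dependencies, so `lookupDataSource` cannot resolve `com.vertica.spark.datasource.DefaultSource`. If the project is built with sbt, a minimal sketch of a `build.sbt` that fixes this is below; the Vertica jar file names are assumptions, so substitute whichever connector and JDBC jars you downloaded:

```scala
// build.sbt (sketch) -- assumes Spark 1.6.2 on Scala 2.10
scalaVersion := "2.10.6"

libraryDependencies += "org.apache.spark" %% "spark-sql" % "1.6.2"

// sbt automatically puts every jar found under lib/ on the classpath
// (unmanaged dependencies), so copying the downloaded jars there is
// enough -- for example (names are placeholders):
//   lib/vertica-spark-connector.jar
//   lib/vertica-jdbc.jar
```

After reimporting the build in the IDE, running the same main class should find the data source, since the connector jar is now on the driver classpath that `lookupDataSource` scans.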