0 votes

I have a strange error: I am trying to write data to Hive. It works well in spark-shell, but when I use spark-submit it throws a "database/table not found in default" error.

Following is the code I am running through spark-submit; I am using a custom build of Spark 2.0.0.

    val sqlContext = new org.apache.spark.sql.SQLContext(sc)
    sqlContext.table("spark_schema.iris_ori")
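
For context, the stack trace below shows the failure happening inside DataFrameWriter.saveAsTable (TreeClassifiersModels.scala:71). A rough, illustrative sketch of that kind of write; the sample DataFrame here is made up, only the saveAsTable pattern matters:

    // Illustrative stand-in for the result DataFrame built in TreeClassifiersModels.scala
    import sqlContext.implicits._
    val resultDF = Seq(("setosa", 1.4, 0.2)).toDF("SPECIES", "PETAL_LENGTH", "PETAL_WIDTH")
    // saveAsTable is the call that hits the catalog lookup shown in the stack trace
    resultDF.write.mode("overwrite").saveAsTable("spark_schema.measures_20160520090502")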

Following is the command I am using:

/home/ec2-user/Spark_Source_Code/spark/bin/spark-submit --class TreeClassifiersModels --master local[*] /home/ec2-user/Spark_Snapshots/Spark_2.6/TreeClassifiersModels/target/scala-2.11/treeclassifiersmodels_2.11-1.0.3.jar /user/ec2-user/Input_Files/defPath/iris_spark SPECIES~LBL+PETAL_LENGTH+PETAL_WIDTH RAN_FOREST 0.7 123 12

Following is the error:

    16/05/20 09:05:18 INFO SparkSqlParser: Parsing command: spark_schema.measures_20160520090502
    Exception in thread "main" org.apache.spark.sql.AnalysisException: Database 'spark_schema' does not exist;
        at org.apache.spark.sql.catalyst.catalog.ExternalCatalog.requireDbExists(ExternalCatalog.scala:37)
        at org.apache.spark.sql.catalyst.catalog.InMemoryCatalog.tableExists(InMemoryCatalog.scala:195)
        at org.apache.spark.sql.catalyst.catalog.SessionCatalog.tableExists(SessionCatalog.scala:360)
        at org.apache.spark.sql.DataFrameWriter.saveAsTable(DataFrameWriter.scala:464)
        at org.apache.spark.sql.DataFrameWriter.saveAsTable(DataFrameWriter.scala:458)
        at TreeClassifiersModels$.main(TreeClassifiersModels.scala:71)
        at TreeClassifiersModels.main(TreeClassifiersModels.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:497)
        at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:726)
        at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:183)
        at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:208)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:122)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

Could you paste the error messages, your hive-site.xml, and the spark-submit command? – user1314742
Hi @user1314742, I have added the command and the error message. – Sam
In the code you provided I cannot see where you use the database iaw_model_summary that causes the problem. – user1314742
Sorry, it is spark_schema only; iaw_model_summary is another DB and it is not working either. It is an alternative to spark_schema: I changed the schema name and tested whether it works or not. – Sam
Where are your databases saved? Are you sure they exist before you reference them? – user1314742

1 Answer

2 votes

The issue comes from a change in Spark 2.0.0: HiveContext was deprecated there, and a plain SQLContext is backed by the in-memory catalog (note the InMemoryCatalog.tableExists frame in your stack trace), which knows nothing about your Hive databases. To read/write Hive tables on Spark 2.0.0 you need a Hive-enabled SparkSession, as follows.

    import org.apache.spark.sql.SparkSession

    val sparkSession = SparkSession.withHiveSupport(sc)
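
Note that the withHiveSupport helper comes from the custom 2.0.0 snapshot used in the question; in released Spark 2.0.x builds the standard way to get a Hive-backed catalog is the builder API. A minimal sketch, assuming a standard build (the app name and table names are just illustrative):

    import org.apache.spark.sql.SparkSession

    // Builds (or reuses) a session whose catalog is backed by the Hive metastore
    val spark = SparkSession.builder()
      .appName("TreeClassifiersModels")
      .enableHiveSupport()
      .getOrCreate()

    // Hive tables are then visible under spark-submit as well
    val iris = spark.table("spark_schema.iris_ori")
    iris.write.mode("overwrite").saveAsTable("spark_schema.measures_20160520090502")

Also make sure hive-site.xml is on the driver's classpath (for example in $SPARK_HOME/conf), otherwise the session will create a local metastore instead of connecting to the one that actually contains spark_schema.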