Could you guide me through upgrading the Spark version on my local machine? I want to run it on Hadoop 2.7 with Hive 1.2.1 (metastore in MySQL).
I was using the old Spark 1.5 and would like to upgrade to 2.0. I have downloaded the binary distribution 'spark-2.0.0-bin-hadoop2.7.tgz' and extracted it.
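For reference, I extracted it under /opt, which is the path that appears in the error further down (the destination is simply where I chose to put it):

```bash
# extract the Spark 2.0 binary distribution into /opt
tar -xzf spark-2.0.0-bin-hadoop2.7.tgz -C /opt/
```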
In spark-env.sh I have set HADOOP_HOME and HADOOP_CONF_DIR, and SPARK_CLASSPATH points to the mysql-connector JAR. In spark-defaults.conf I have added spark.sql.warehouse.dir, spark.sql.hive.metastore.version and spark.sql.hive.metastore.jars.
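Roughly, the entries look like this (the Hadoop path, connector JAR name, warehouse location and the metastore.jars value are placeholders, not necessarily my exact values):

```
# conf/spark-env.sh
export HADOOP_HOME=/usr/local/hadoop
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export SPARK_CLASSPATH=/usr/local/hive/lib/mysql-connector-java-5.1.38.jar

# conf/spark-defaults.conf
spark.sql.warehouse.dir            hdfs://localhost:9000/user/hive/warehouse
spark.sql.hive.metastore.version   1.2.1
spark.sql.hive.metastore.jars      builtin
```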
I have also modified my .bashrc file. When I start Hive I get the message below:
cannot access /opt/spark-2.0.0-bin-hadoop2.7/lib/spark-assembly-*.jar:
No such file or directory
I did not build Spark myself, since this is the binary distribution. My older version has a spark-assembly JAR file, but I could not find the same in the Spark 2.0 jars directory. Do I need to have this JAR file?
I have copied `hive-site.xml` to the conf directory (its metastore-related entries are shown at the end of this question). Also, running a SQL query in pyspark throws the error below:
Database at /home/revathy/metastore_db has an incompatible
format with the current version of the software. The database
was created by or upgraded by version 10.11.
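For reference, this is roughly how I run the query from pyspark (the query itself is just an example; any Hive access triggers the same error):

```python
from pyspark.sql import SparkSession

# In the pyspark shell a SparkSession called `spark` already exists;
# this is the equivalent when run as a standalone script.
spark = (SparkSession.builder
         .appName("hive-metastore-test")
         .enableHiveSupport()
         .getOrCreate())

# Any Hive query triggers metastore initialisation and the Derby error above
spark.sql("SHOW DATABASES").show()
```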
My metastore version is 1.2.1 (and it is specified in spark-defaults.conf). I could not find details on connecting to the Hive metastore from Spark 2.0.
Could someone help? From pyspark I was able to read a file from Hadoop, and Hive itself is working fine (checked from the CLI).
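In case it helps, the metastore-related entries in the hive-site.xml I copied look roughly like this (host, database name and credentials are placeholders):

```xml
<!-- MySQL-backed metastore settings from hive-site.xml (values are placeholders) -->
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://localhost:3306/hive_metastore?createDatabaseIfNotExist=true</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>hiveuser</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>hivepassword</value>
</property>
```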
Please provide a link/details on configuring the Hive metastore (MySQL) with Spark.