
I am trying to run Spark after installing but the command "spark-shell" gives the error:

Could not find or load main class version.

I tried to fix this by setting my JAVA_HOME in various (perhaps contradictory) ways. I also set SCALA_HOME and edited spark-env.sh. What steps may I take to fix this?

Similar to:

This question (I am using Ubuntu 20.04; the above question is for Windows and is about spark-submit, not the spark-shell command) and this question (that error is different from mine, but similar).

Version information: I am working on Ubuntu 20.04.

- Hadoop version: 2.10.0
- Spark version: spark-2.4.5-bin-without-hadoop-scala-2.12
- Scala version: 2.11.12 (previously I tried Scala 2.12, as I thought it was compatible)
- Java version: openjdk 1.8.0_252 (runtime: build 1.8.0_252-8u252-b09-1ubuntu1-b09; OpenJDK 64-Bit Server VM, build 25.252-b09, mixed mode; javac 1.8.0_252)

Details of steps I have taken:

I have installed Hadoop (extracted the program files to usr/hadoop, configured the namenode and datanode, and set the Java path), Java 1.8, and Scala. Hadoop works fine: I can see the namenode in my browser and run Hadoop jobs.

I have installed Spark (extracted program files to usr/Spark).

In spark-env.sh I have set:

export HADOOP_CONF_DIR=/home/sperling/hadoop/hadoop-2.10.0
JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64/jre

In bashrc I have set:

export SCALA_HOME=/usr/share/scala
export HADOOP_HOME=/home/sperling/hadoop/hadoop-2.10.0
export SPARK_HOME=/home/sperling/spark
export PATH=$PATH:/home/sperling/spark/bin

In etc/environment I have set:

JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java
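For comparison, a consistent version of these settings (a sketch only; the install paths are taken from the question above and may need adjusting, and note that JAVA_HOME conventionally points at the JDK/JRE root directory rather than at the java binary itself) would look like:

```shell
# ~/.bashrc — consistent sketch, paths assumed from the question above.
# JAVA_HOME points at the installation root, not at .../bin/java.
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
export SCALA_HOME=/usr/share/scala
export HADOOP_HOME=/home/sperling/hadoop/hadoop-2.10.0
export SPARK_HOME=/home/sperling/spark
export PATH=$PATH:$JAVA_HOME/bin:$SPARK_HOME/bin
```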

I do not know what to try next. It seems that Spark can't find either Java or Scala, yet both show up when I type echo $JAVA_HOME and echo $SCALA_HOME in the terminal.
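One quick sanity check (a sketch under the assumption that Spark's launcher scripts look for bin/java under JAVA_HOME, which is how spark-class resolves the runtime) is to verify that JAVA_HOME actually names a directory containing bin/java:

```shell
#!/bin/sh
# Sketch: verify that JAVA_HOME points at a JDK/JRE root directory,
# i.e. one that contains an executable bin/java. A JAVA_HOME that
# points at the java binary itself will fail this check.
check_java_home() {
  # Succeeds (exit 0) only if "$1/bin/java" exists and is executable.
  [ -x "$1/bin/java" ]
}

if check_java_home "${JAVA_HOME:-}"; then
  echo "JAVA_HOME OK: $JAVA_HOME"
else
  echo "JAVA_HOME is unset or does not contain bin/java: '${JAVA_HOME:-}'"
fi
```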


1 Answer


Spark version: spark-2.4.5-bin-without-hadoop-scala-2.12

This means that Spark is pre-built with (or expecting) Scala 2.12; I don't think you'd be able to run it with Scala 2.11.

From what I can see on the Spark downloads page, Spark 2.4.5 is provided pre-built with either Hadoop 2.7.x or Hadoop 3.2.x:

https://spark.apache.org/downloads.html

I would suggest either trying it with one of the Hadoop versions they recommend, or installing Hadoop 3.2.x.
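For example (a sketch, not a verified recipe for your machine), the spark-2.4.5-bin-hadoop2.7 package is built against Scala 2.11, which matches the installed Scala 2.11.12; fetching and unpacking it might look like:

```shell
# Sketch: fetch a Spark 2.4.5 build whose bundled Hadoop (2.7.x) and
# Scala (2.11) versions match the stack described in the question.
# The unpack location under $HOME is an assumption; adjust as needed.
SPARK_PKG=spark-2.4.5-bin-hadoop2.7
wget "https://archive.apache.org/dist/spark/spark-2.4.5/${SPARK_PKG}.tgz"
tar -xzf "${SPARK_PKG}.tgz" -C "$HOME"
export SPARK_HOME="$HOME/${SPARK_PKG}"
export PATH="$PATH:$SPARK_HOME/bin"
```

After this, point SPARK_HOME (and the PATH entry in .bashrc) at the new directory rather than at the old spark-2.4.5-bin-without-hadoop-scala-2.12 install.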