
I am trying to install Apache Spark on my Windows 10 computer. My first step was to download Java from here; it was installed to C:\Program Files (x86)\Java, and the one folder created during this installation was \jre1.8.0_151

Next I installed the JDK from here, downloading the Windows x86 version. I used the same Java folder as above for the installation. After it was done I had two folders inside the Java folder: jdk1.8.0_151 and jre1.8.0_151

Afterwards, I set the JAVA_HOME variable to point to C:\PROGRA~1(x86)\Java\jdk1.8.0_151, and in Path I added %JAVA_HOME%. I then installed Scala from here, downloading the Scala binaries for Windows, and added it to Path as C:\PROGRA~1(x86)\scala\bin
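To summarize, the values I entered at that point were (the ... stands for whatever was already in Path):

    JAVA_HOME = C:\PROGRA~1(x86)\Java\jdk1.8.0_151
    Path      = ...;%JAVA_HOME%;C:\PROGRA~1(x86)\scala\bin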

Next I installed Spark from here, choosing spark-2.2.1-bin-hadoop2.7.tgz. I extracted this folder to D:\spark-2.2.1-bin-hadoop2.7, added the environment variable SPARK_HOME with the path D:\spark-2.2.1-bin-hadoop2.7\bin, and then updated Path with %SPARK_HOME%\bin
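So the Spark-related variables ended up as:

    SPARK_HOME = D:\spark-2.2.1-bin-hadoop2.7\bin
    Path       = ...;%SPARK_HOME%\bin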

Finally, I tried to verify that everything was installed. I typed java -version and the correct Java version was reported. I then typed scala and the Scala REPL opened for me to type in expressions and such. But when I typed spark-shell I got this error:

'spark-shell' is not recognized as an internal or external command, operable program or batch file.

What am I doing wrong that keeps spark-shell from starting? Please note: I am using cmd for everything.


1 Answer


It looks like you set %SPARK_HOME% to the wrong place. Since it already ends in \bin, when you "then updated path to %SPARK_HOME%\bin" that resulted in adding D:\spark-2.2.1-bin-hadoop2.7\bin\bin, with a double \bin, which is obviously wrong. %SPARK_HOME% should point to the Spark folder without \bin.
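In other words, the variables should look like this (and remember to open a new cmd window afterwards, since an open one does not pick up changes to the environment):

    SPARK_HOME = D:\spark-2.2.1-bin-hadoop2.7
    Path       = ...;%SPARK_HOME%\bin

With that, %SPARK_HOME%\bin resolves to D:\spark-2.2.1-bin-hadoop2.7\bin, which is where spark-shell.cmd actually lives.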

Generally, you can test your environment variables by calling echo %PATH% on the command line, or SET to show all of them.
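For example:

    :: print one variable
    echo %SPARK_HOME%
    :: print the whole search path
    echo %PATH%
    :: list every variable whose name starts with SPARK
    set SPARK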