
From the research I've done so far, this seems like a common problem when using Spark on Windows, and it usually comes down to the PATH being set incorrectly. I've triple-checked the PATH, however, and tried many of the solutions I've come across online, and I still can't figure out what's causing the problem.

  1. Trying to run spark-shell from the command prompt in Windows 7 (64-bit) returns "The system cannot find the path specified."

    "Cannot find the specified path" screenshot

  2. However, I can run that same command from within the directory where spark-shell.exe is located (albeit with some errors), which leads me to believe this is a PATH issue, as most of the other posts about this problem suggest. However...

    Screenshot: spark-shell works when called from its own directory

    Screenshot: the shell appears to be working

  3. From what I can tell, my PATH appears to be set correctly. Most of the solutions I've come across for this issue involve pointing the %JAVA_HOME% system variable at the correct location and adding %JAVA_HOME%\bin to the PATH (along with the directory holding spark-shell.exe), but both my JAVA_HOME variable and my PATH variable already contain the required directories (see the checks after this list).

    Screenshot: PATH settings
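
    For reference, these are the kinds of checks I ran to convince myself the variables were set (a sketch; "where" lists every match it finds on the PATH, in search order):

        :: Print the variables as the current shell sees them
        echo %JAVA_HOME%
        echo %PATH%

        :: Ask Windows which executable each command actually resolves to
        where java
        where spark-shell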


1 Answer


Turns out this issue was being caused by a previously installed version of Spark on my computer. PySpark had already been installed via "pip install pyspark" when I tried installing the standalone Spark client, and with two copies of Spark on the machine, running "spark-shell" created a conflict because both install locations were being referenced.

So even though the PATH was set correctly, "spark-shell" was resolving against both the earlier PySpark install and the standalone Spark install, and that overlap was causing the issue.

I noticed that when I ran "pyspark" from the command line, it returned the "The system cannot find the path specified." error twice, which led me to believe that pyspark/Spark was installed in two locations and could be interfering with PATH resolution when I called "spark-shell".
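
If you want to check for the same kind of duplicate install, the following should reveal it (a sketch; "where" lists every executable match on the PATH, and "pip show" reports the pip-installed package, if any):

    :: More than one result per command suggests a duplicate install
    where spark-shell
    where pyspark

    :: Report the pip-installed pyspark package, if one exists
    pip show pyspark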

I ran "pip uninstall pyspark", and when I then re-tried "spark-shell" from the command line, it worked as expected!
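
For anyone hitting the same thing, the full fix was roughly this (assuming the standalone Spark install is the one you want to keep):

    :: Remove the pip-installed copy so only the standalone install remains
    pip uninstall pyspark

    :: Confirm a single spark-shell now resolves on the PATH, then retry
    where spark-shell
    spark-shell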