I hit the same issue with standalone Spark on Windows.
My version of the fix is as follows:
I had my environment variables set as below:
PYSPARK_SUBMIT_ARGS="pyspark-shell"
PYSPARK_DRIVER_PYTHON=jupyter
PYSPARK_DRIVER_PYTHON_OPTS='notebook'
With these settings in place, I executed an action in PySpark and got the following exception:
Python in worker has different version 3.6 than that in driver 3.5, PySpark cannot run with different minor versions.
Please check environment variables PYSPARK_PYTHON and PYSPARK_DRIVER_PYTHON are correctly set.
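For reference, the action itself was nothing special; here is a minimal sketch of the kind of call that trips this check (any operation that ships work to an executor will do):

from pyspark import SparkContext

sc = SparkContext.getOrCreate()
sc.parallelize(range(10)).count()  # the action fails here if driver and worker minor versions differ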
To check which Python version the Spark worker was using, I ran the following in the Command Prompt:
python --version
Python 3.6.3
This showed Python 3.6.3, so the Spark worker was clearly using the system Python, v3.6.3.
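As a quick cross-check, the same information (plus which interpreter python resolves to on the PATH) is available from Python itself; the path in the comment is just illustrative:

import sys

print(sys.executable)        # e.g. C:\Python36\python.exe (illustrative)
print(sys.version_info[:3])  # (3, 6, 3)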
Since I had set the Spark driver to run Jupyter (PYSPARK_DRIVER_PYTHON=jupyter), I next needed to check which Python version Jupyter was using.
To check this, open the Anaconda Prompt and run:
python --version
Python 3.5.X :: Anaconda, Inc.
So Jupyter was using Python v3.5.x. You can also check this version in any notebook (Help -> About).
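If you prefer to check from code, a cell like this (using the standard platform module) reports the kernel's Python version:

from platform import python_version

print(python_version())  # showed 3.5.x here before the fix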
Now I needed to update Jupyter's Python to v3.6.3 to match the worker. To do that, open the Anaconda Prompt and run:
conda search python
This lists the Python versions available in Anaconda. Install the one you need with:
conda install python=3.6.3
With both Python installations now at the same version, 3.6.3, Spark should no longer complain, and indeed it didn't when I ran an action on the Spark driver: the exception was gone.
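If you want to double-check, here is a sketch that compares the driver's and the workers' versions in one go (run it after restarting the notebook so the new interpreter is picked up):

import sys
from pyspark import SparkContext

sc = SparkContext.getOrCreate()
driver_ver = ".".join(map(str, sys.version_info[:3]))
worker_ver = sc.parallelize([0], 1).map(
    lambda _: ".".join(map(str, sys.version_info[:3]))).first()
print(driver_ver, worker_ver)  # both should now print 3.6.3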
Happy coding ...