
I am running a Jupyter notebook using the pyspark kernel. I am getting the following error. How can I force Jupyter (ideally from within Jupyter) to use the right driver?

Python in worker has different version 2.6 than that in driver 2.7, PySpark cannot run with different minor versions
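As context for the error, the driver's version can be confirmed directly in a notebook cell; the workers' version comes from whatever interpreter `PYSPARK_PYTHON` points at:

```python
import sys

# Print the Python version the notebook kernel (the PySpark driver) is using.
# If this differs from the workers' interpreter in its minor version,
# PySpark raises the error quoted above.
print("%d.%d" % sys.version_info[:2])
```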

Thank you

Hani


1 Answer


It could be a problem with your pyspark kernel.json configuration. For example, my pyspark kernel is located at:

/usr/local/share/jupyter/kernels/pyspark/kernel.json

and contains:

{
 "display_name": "pySpark (Spark 1.6.0)",
 "language": "python",
 "argv": [
  "/usr/local/bin/python2.7",
  "-m",
  "ipykernel",
  "-f",
  "{connection_file}"
 ],
 "env": {
  "PYSPARK_PYTHON": "/usr/local/bin/python2.7",
  "SPARK_HOME": "/usr/lib/spark",
  "PYTHONPATH": "/usr/lib/spark/python/lib/py4j-0.9-src.zip:/usr/lib/spark/python/",
  "PYTHONSTARTUP": "/usr/lib/spark/python/pyspark/shell.py",
  "PYSPARK_SUBMIT_ARGS": "--master yarn-client pyspark-shell"
 }
}

It is very important that both places (argv and PYSPARK_PYTHON) point to the same Python version.
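Since the question asks about fixing this from within Jupyter: as a sketch (the paths below are examples from the kernel.json above, adjust them to your install), the same environment variables can also be set in the notebook itself, as long as it happens before any SparkContext is created:

```python
import os

# Example paths only -- use the interpreter that matches your workers.
# These must be set BEFORE a SparkContext exists; an already-running
# context has launched with the old settings and must be stopped and
# recreated for the change to take effect.
os.environ["PYSPARK_PYTHON"] = "/usr/local/bin/python2.7"
os.environ["PYSPARK_DRIVER_PYTHON"] = "/usr/local/bin/python2.7"
```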

Hope that helps!