4
votes

I am getting this error: `Exception: Java gateway process exited before sending the driver its port number` when I try to instantiate a Spark session in PySpark. Here is the code:

from pyspark import SparkConf
from pyspark.sql import SparkSession

if __name__ == '__main__':
    SPARK_CONFIGURATION = SparkConf().setAppName("OPL").setMaster("local[*]")
    SPARK_SESSION = SparkSession.builder\
        .config(conf=SPARK_CONFIGURATION)\
        .getOrCreate()

    print("Hello world")
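One way to narrow this down (a diagnostic sketch I put together, not something from my original setup) is to confirm that the `java` binary Spark needs to spawn is actually reachable from the Python process, since the exception means the JVM gateway never came up:

```python
import shutil
import subprocess

# Diagnostic: PySpark's launch_gateway spawns a `java` process, so the
# binary must be resolvable from this process's PATH (or via JAVA_HOME).
java_path = shutil.which("java")
print("java resolved to:", java_path)
if java_path is not None:
    # Print the JVM version; a failure here would explain the gateway error.
    subprocess.run([java_path, "-version"])
```

If `java_path` comes back as `None`, the gateway exception is expected, because Spark has no JVM to talk to.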

Here is the traceback:

Neon was unexpected at this time.
Traceback (most recent call last):
  File "C:\Users\IBM_ADMIN\Documents\Eclipse Neon for Liberty on Bluemix\OPL_Interface\src\Test\SparkTest.py", line 12, in <module>
    .config(conf=SPARK_CONFIGURATION)\
  File "C:\Users\IBM_ADMIN\Documents\spark-2.1.0-bin-hadoop2.7\python\pyspark\sql\session.py", line 169, in getOrCreate
    sc = SparkContext.getOrCreate(sparkConf)
  File "C:\Users\IBM_ADMIN\Documents\spark-2.1.0-bin-hadoop2.7\python\pyspark\context.py", line 307, in getOrCreate
    SparkContext(conf=conf or SparkConf())
  File "C:\Users\IBM_ADMIN\Documents\spark-2.1.0-bin-hadoop2.7\python\pyspark\context.py", line 115, in __init__
    SparkContext._ensure_initialized(self, gateway=gateway, conf=conf)
  File "C:\Users\IBM_ADMIN\Documents\spark-2.1.0-bin-hadoop2.7\python\pyspark\context.py", line 256, in _ensure_initialized
    SparkContext._gateway = gateway or launch_gateway(conf)
  File "C:\Users\IBM_ADMIN\Documents\spark-2.1.0-bin-hadoop2.7\python\pyspark\java_gateway.py", line 95, in launch_gateway
    raise Exception("Java gateway process exited before sending the driver its port number")
Exception: Java gateway process exited before sending the driver its port number

I am using PyDev with Eclipse Neon.2 Release (4.6.2). Here is the configuration: Libraries, Environment

Note: I am using the latest Spark release: spark-2.1.0-bin-hadoop2.7

I have checked several other entries (e.g. "Pyspark: Exception: Java gateway process exited before sending the driver its port number" and "Spark + Python - Java gateway process exited before sending the driver its port number?") and tried most of the suggested fixes, but the error persists. It's a blocker for me, as I cannot test my code until I can get a SparkSession. BTW, I'm also working with Spark in Java, and I do not have the same issue there.

Is this a bug in PySpark?

1
I have the same error, though only in ONE notebook. I can run Spark in one notebook, but in the other I get the error, even though both notebooks execute identical code... – Sören

1 Answer

0
votes

My coworker and I were both hitting this same problem; it blocked us and had us pulling our hair out for a while. We tried a bunch of the suggested fixes (no spaces in the Java path, setting/unsetting the PYSPARK_SUBMIT_ARGS environment variable, ...), all to no avail.
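For reference, the PYSPARK_SUBMIT_ARGS workaround looks like the sketch below (the value shown is the commonly suggested one, not something from our setup). It has to be set before the SparkSession is built, because `launch_gateway` reads it when spawning the JVM:

```python
import os

# Commonly suggested workaround (it did not help in our case): pass explicit
# submit arguments to the gateway launcher. The trailing "pyspark-shell"
# token is required, or the launcher rejects the argument string.
os.environ["PYSPARK_SUBMIT_ARGS"] = "--master local[*] pyspark-shell"
print(os.environ["PYSPARK_SUBMIT_ARGS"])
```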

What fixed it for us was switching to Spark 2.3.1; we had been trying with 2.2.1 and 2.3.0.
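Switching versions amounted to pointing the environment at the new distribution, roughly like this (a Unix-style sketch with illustrative paths, not our exact commands; on Windows the equivalent is updating the SPARK_HOME and PATH environment variables):

```shell
# Point SPARK_HOME at the unpacked 2.3.1 distribution and put its
# bin/ directory first on PATH so pyspark picks it up.
export SPARK_HOME="$HOME/spark-2.3.1-bin-hadoop2.7"
export PATH="$SPARK_HOME/bin:$PATH"
echo "$SPARK_HOME"
```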

Hope this helps save some folks a little hair-pulling.