To reproduce, use the simplest Sparkling Water Python example (https://github.com/h2oai/sparkling-water/blob/rel-2.2/py/examples/scripts/H2OContextInitDemo.py):
from pysparkling import *
from pyspark.sql import SparkSession
import h2o
# Initiate SparkSession
spark = SparkSession.builder.appName("App name").getOrCreate()
# Initiate H2OContext
hc = H2OContext.getOrCreate(spark)
# Stop H2O and Spark services
h2o.shutdown(prompt=False)
spark.stop()
I have SPARK_HOME exported and pointing to Spark 2.2.0. I have MASTER="local[4]".
I have installed (among others):
pyspark (2.2.0)
h2o-pysparkling-2.2 (2.2.2)
h2o (3.14.0.7)
When I run this script under Python 2.7, I get:
H2O session _sid_9ee5 closed.
/home/user/.virtualenvs/sacred2/local/lib/python2.7/site-packages/pysparkling/context.py:151: UserWarning: Stopping H2OContext. (Restarting H2O is not yet fully supported...)
warnings.warn("Stopping H2OContext. (Restarting H2O is not yet fully supported...) ")
11-02 17:37:43.710 10.0.1.62:54321 21323 Thread-28 INFO: Orderly shutdown: Shutting down now.
11-02 17:37:43.719 10.0.1.62:54321 21323 Thread-29 INFO: Orderly shutdown: Shutting down now.
ERROR:root:Exception while sending command.
Traceback (most recent call last):
File "/home/user/.virtualenvs/sacred2/local/lib/python2.7/site-packages/py4j/java_gateway.py", line 883, in send_command
response = connection.send_command(command)
File "/home/user/.virtualenvs/sacred2/local/lib/python2.7/site-packages/py4j/java_gateway.py", line 1040, in send_command
"Error while receiving", e, proto.ERROR_ON_RECEIVE)
Py4JNetworkError: Error while receiving
Error in atexit._run_exitfuncs:
Traceback (most recent call last):
File "/usr/lib/python2.7/atexit.py", line 24, in _run_exitfuncs
func(*targs, **kargs)
File "/home/user/.virtualenvs/sacred2/local/lib/python2.7/site-packages/pysparkling/context.py", line 140, in <lambda>
atexit.register(lambda: h2o_context.stop_with_jvm())
File "/home/user/.virtualenvs/sacred2/local/lib/python2.7/site-packages/pysparkling/context.py", line 147, in stop_with_jvm
self.stop()
File "/home/user/.virtualenvs/sacred2/local/lib/python2.7/site-packages/pysparkling/context.py", line 153, in stop
self._jhc.stop(False)
File "/home/user/.virtualenvs/sacred2/local/lib/python2.7/site-packages/py4j/java_gateway.py", line 1133, in __call__
answer, self.gateway_client, self.target_id, self.name)
File "/home/user/.virtualenvs/sacred2/local/lib/python2.7/site-packages/pyspark/sql/utils.py", line 63, in deco
return f(*a, **kw)
File "/home/user/.virtualenvs/sacred2/local/lib/python2.7/site-packages/py4j/protocol.py", line 327, in get_return_value
format(target_id, ".", name))
Py4JError: An error occurred while calling o32.stop
Error in sys.exitfunc:
Traceback (most recent call last):
File "/usr/lib/python2.7/atexit.py", line 24, in _run_exitfuncs
func(*targs, **kargs)
File "/home/user/.virtualenvs/sacred2/local/lib/python2.7/site-packages/pysparkling/context.py", line 140, in <lambda>
atexit.register(lambda: h2o_context.stop_with_jvm())
File "/home/user/.virtualenvs/sacred2/local/lib/python2.7/site-packages/pysparkling/context.py", line 147, in stop_with_jvm
self.stop()
File "/home/user/.virtualenvs/sacred2/local/lib/python2.7/site-packages/pysparkling/context.py", line 153, in stop
self._jhc.stop(False)
File "/home/user/.virtualenvs/sacred2/local/lib/python2.7/site-packages/py4j/java_gateway.py", line 1133, in __call__
answer, self.gateway_client, self.target_id, self.name)
File "/home/user/.virtualenvs/sacred2/local/lib/python2.7/site-packages/pyspark/sql/utils.py", line 63, in deco
return f(*a, **kw)
File "/home/user/.virtualenvs/sacred2/local/lib/python2.7/site-packages/py4j/protocol.py", line 327, in get_return_value
format(target_id, ".", name))
py4j.protocol.Py4JError: An error occurred while calling o32.stop
Why am I getting these tracebacks? The script's return code is 0. Under Python 3 the return code is also 0, but different tracebacks are thrown. How can I clean this up?
Complete log: https://gist.github.com/anonymous/163fba371b2a419c2171f4aff83a1ff7
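My reading of the traceback, which may be wrong: pysparkling registers an exit hook with atexit (context.py line 140), and that hook calls stop() on the Java side when the interpreter exits. I am assuming the JVM connection is already gone by then, which would explain the Py4JNetworkError. The sketch below reproduces only that pattern without Spark or H2O. FakeGateway and FakeContext are hypothetical stand-ins I made up for illustration; they are not part of py4j or pysparkling.

```python
import atexit


class FakeGateway(object):
    """Hypothetical stand-in for the py4j connection to the JVM."""

    def __init__(self):
        self.alive = True

    def call(self, name):
        # Fails the same way a dead connection would: the call cannot complete.
        if not self.alive:
            raise RuntimeError("An error occurred while calling " + name)
        return "ok"


class FakeContext(object):
    """Hypothetical stand-in for H2OContext; only the exit hook is modelled."""

    def __init__(self, gateway):
        self.gateway = gateway
        # Same pattern as the atexit.register(...) line shown in the traceback.
        atexit.register(self.stop_with_jvm)

    def stop_with_jvm(self):
        try:
            return self.gateway.call("stop")
        except RuntimeError as exc:
            # Printed instead of re-raised so the demo itself exits quietly.
            print("exit hook failed: %s" % exc)
            return None


gateway = FakeGateway()
context = FakeContext(gateway)

# Assumption: something in the script (h2o.shutdown() or spark.stop())
# takes the JVM side down before the interpreter exits.
gateway.alive = False

# This is what atexit does on its own at interpreter exit; it is called
# by hand here so the failure is visible immediately.
result = context.stop_with_jvm()
```

If this is what is happening, the hook fires after my explicit shutdown calls and fails because there is nothing left to stop, while the script itself has already finished and returns 0.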