I have installed h2o on my AWS Databricks cluster and successfully started the H2O server with:
import h2o
h2o.init()
When I attempt to import the iris CSV file that is stored in DBFS:
train, valid = h2o.import_file(path="/FileStore/tables/iris.csv").split_frame(ratios=[0.7])
I get an H2OResponseError: Server error water.exceptions.H2ONotFoundArgumentException
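For clarity, the exception is raised by the import_file call itself, before split_frame ever runs; here is a simplified two-step version of the same call:

# the H2OResponseError is thrown here, on the import
iris_hf = h2o.import_file(path="/FileStore/tables/iris.csv")
# never reached, because the import above raises
train, valid = iris_hf.split_frame(ratios=[0.7])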
The CSV file is definitely there: in the same Databricks notebook, I can read it into a Koalas DataFrame and view its contents using the exact same fully qualified path:
import databricks.koalas as ks
df_iris = ks.read_csv("/FileStore/tables/iris.csv")
df_iris.head()
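For reference, the same directory can also be listed from the notebook with Databricks' dbutils (a notebook built-in), which is another way to confirm the file is visible in DBFS:

# dbutils and display are predefined in Databricks notebooks
display(dbutils.fs.ls("/FileStore/tables/"))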
I've also tried calling:
h2o.upload_file("/FileStore/tables/iris.csv")
but that fails too, with H2OValueError: File /FileStore/tables/iris.csv does not exist. I've also tried uploading the file directly from my local computer (C drive), without success.
Dropping the fully qualified path and specifying just the file name gives the same errors (a quick sketch of these variants is below). I've read through the H2O documentation and searched the web, but cannot find anyone who has encountered this problem.
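To summarize, the other variants I've tried, none of which work (the local Windows path below is just a placeholder, not my actual path):

# bare file name instead of the fully qualified DBFS path
h2o.import_file(path="iris.csv")
h2o.upload_file("iris.csv")
# upload from my local machine (placeholder path)
h2o.upload_file("C:/path/to/iris.csv")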
Can someone please help me?
Thanks.