I am trying to run algorithms in H2O because the dataset is quite large and it is a regression problem.
I am competing in a kernel-only competition, the Mercari Price Suggestion Challenge, so the code has to be run and checked in Kaggle Kernels only.
I am using R with 8 GB of RAM.
Initially I was able to run a GLM model and save the output CSV with the following code:
library(glm2)
glm.model2 <- h2o.glm(y = y.dep,
                      x = x.indep,
                      training_frame = train1.h2o,
                      validation_frame = valid1.h2o,
                      family = "gaussian")
GLM runs quickly, in about 12 seconds, without producing an error, but as soon as I try to run either GBM or a basic deep learning model, it produces an error:
library(gbm)
h2o.gbm(y = y.dep,
        x = x.indep,
        training_frame = train1.h2o,
        validation_frame = valid1.h2o,
        ntrees = 2000,
        max_depth = 4,
        learn_rate = 0.01)
library(randomForest)
rforest.model <- h2o.randomForest(y = y.dep,
                                  x = x.indep,
                                  training_frame = train1.h2o,
                                  validation_frame = valid1.h2o,
                                  ntrees = 1000,
                                  mtries = 3,
                                  max_depth = 4,
                                  seed = 1122)
dlearning.model <- h2o.deeplearning(y = y.dep,
                                    x = x.indep,
                                    training_frame = train1.h2o,
                                    validation_frame = valid1.h2o,
                                    epochs = 60,
                                    hidden = c(100, 100),
                                    activation = "Rectifier",
                                    seed = 1122)
I get the following error time and again. Please suggest what can be done to solve this problem: GLM runs fine, but none of the other models run at all.
Error in .h2o.doSafeREST(h2oRestApiVersion = h2oRestApiVersion, urlSuffix = urlSuffix, : Unexpected CURL error: Failed to connect to localhost port 54321: Connection refused
Traceback:
Both models fail after reaching 10 to 11 percent progress. Is there any hack or workaround that would let me at least run these algorithms so that I can submit my results? Because of this I am unable to build an ensemble model.
Any suggestion has to work within a Kaggle kernel, since that is the only place I can run the code.
library(h2o); localH2O = h2o.init(nthreads = -1); ? – MrSmithGoesToWashington