I am trying to build a random forest on a data set with 120k rows and 518 columns. I have two questions:
1. I want to see the progress and logs while the forest is being built. Is the verbose option deprecated in the randomForest function?
2. How can I increase the speed? Right now it takes more than 6 hours to build a random forest with 1000 trees.
The H2O cluster is initialized with the settings below:
hadoop jar h2odriver.jar -Dmapreduce.job.queuename=devclinical -output temp3p -nodes 20 -nthreads -1 -mapperXmx 32g
h2o.init(ip = h2o_ip, port = h2o_port, startH2O = FALSE, nthreads = -1, max_mem_size = "64G", min_mem_size = "4G")