3
votes

While trying to train a LeNet model for multiclass classification using H2O Deep Water with the mxnet backend, I get the following errors:

Loading H2O mxnet bindings.
Found CUDA_HOME or CUDA_PATH environment variable, trying to connect to GPU devices.
Loading CUDA library.
Loading mxnet library.
Loading H2O mxnet bindings.
Done loading H2O mxnet bindings.
Constructing model.
Done constructing model.
Building network.
mxnet data input shape: (32,100)
[10:40:16] /home/jenkins/slave_dir_from_mr-0xb1/workspace/deepwater-master/thirdparty/mxnet/dmlc-core/include/dmlc/logging.h:235: [10:40:16] src/operator/./convolution-inl.h:349: Check failed: (dshape.ndim()) == (4) Input data should be 4D in batch-num_filter-y-x
[10:40:16] src/symbol.cxx:189: Check failed: (MXSymbolInferShape(GetHandle(), keys.size(), keys.data(), arg_ind_ptr.data(), arg_shape_data.data(), &in_shape_size, &in_shape_ndim, &in_shape_data, &out_shape_size, &out_shape_ndim, &out_shape_data, &aux_shape_size, &aux_shape_ndim, &aux_shape_data, &complete)) == (0)

The details of my setup :
* Ubuntu : 16.04
* Ram : 12gb
* Graphics card : Nvidia 920mx driver version : 384.90
* Cuda : 8.0.61
* cudnn : 6.0
* R version : 3.4.3
* H2o version : 3.15.0.393 & h2o-R package : 3.16.0.2
* mxnet : 0.11.0
* Train data size : 400mb (when converting to the h2o frame object it comes around 822mb)

Things I have done :
1.) Gave the Java heap enough memory when starting the H2O cluster (java -Xmx9g -jar h2o.jar).
2.) Built mxnet from source with GPU support.
3.) Monitored the GPU and the system via nvidia-smi and the system monitor. At no point is all the memory used up, so this does not look like an "out of memory" issue; around 2-3 GB are still free when the error shows up.
4.) Tried tensorflow-gpu (built from source). pip list confirms that it is installed, but model creation in R fails with:
Error: java.lang.RuntimeException: Unable to initialize the native Deep Learning backend: null
5.) The only way I have gotten H2O Deep Water to work with all the backends, with and without GPU, is the Docker setup provided in the installation tutorials.

I want the same functionality directly on my laptop instead of through Docker. Also, is there any way to run Deep Water using just the CPU? The linked question "Is it possible to build Deep Water/TensorFlow model in H2O without CUDA" doesn't provide any helpful answers. Any help or advice will be greatly appreciated!

1
I don't know if this is correct, but I heard that at the recent H2O World 2017, H2O.ai stopped Deep Water development and is recommending that its own clients use Keras instead. - Jibin

1 Answer

3
votes

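As is evident from the error log and from the documentation of mxnet.sym.Convolution, your data needs to be in [batch, channels, height, width] format. However, your data appears to have only two dimensions (based on this log line: mxnet data input shape: (32,100)). Reformatting the data, even by adding two dimensions of size 1 so that your input shape is (1,1,32,100), should resolve this issue. Note that this shape treats the whole 32 x 100 array as a single one-channel image. If the 32 is instead the mini-batch size and each row holds 100 feature values, those 100 values would have to supply the channel and spatial axes, for example (32,1,10,10).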
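
To make the layout concrete, here is a minimal sketch in plain Python (standard library only) of turning flat rows into the batch-channel-y-x nesting that the convolution check asks for. The helper names rows_to_nchw and shape_of are hypothetical, and the 1 x 10 x 10 split of the 100 values is an assumption made for illustration, not something the log confirms. It only illustrates the shape arithmetic and is not Deep Water or mxnet API code.

```python
def rows_to_nchw(rows, channels, height, width):
    """Reshape flat feature rows into nested [batch][channel][y][x] lists.

    Hypothetical helper: each row must hold channels * height * width values.
    """
    expected = channels * height * width
    batch = []
    for i, row in enumerate(rows):
        if len(row) != expected:
            raise ValueError(
                f"row {i} has {len(row)} values, expected {expected}"
            )
        image = []
        for c in range(channels):
            plane = []
            for y in range(height):
                # Offset of scanline y of channel c inside the flat row.
                start = (c * height + y) * width
                plane.append(list(row[start:start + width]))
            image.append(plane)
        batch.append(image)
    return batch


def shape_of(nested):
    """Return the shape of a regular nested list as a tuple."""
    shape = []
    while isinstance(nested, list):
        shape.append(len(nested))
        nested = nested[0]
    return tuple(shape)


# Assumed layout: 32 rows (one mini-batch) of 100 values each,
# interpreted as 1-channel 10 x 10 images.
flat = [[float(i) for i in range(100)] for _ in range(32)]
images = rows_to_nchw(flat, channels=1, height=10, width=10)

print(shape_of(flat))    # (32, 100)
print(shape_of(images))  # (32, 1, 10, 10)
```

With NumPy arrays the same step is a single reshape call on the 2D array. Either way, the product of the channel, height and width sizes has to equal the number of values per row, which is why 100 columns can become 1 x 10 x 10 but not, say, 1 x 28 x 28.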