I'm running cross-validated deep learning training (nfolds=4) iteratively for feature selection on H2O through R. Currently, I have only 2 layers (i.e. not deep), with between 8 and 50 neurons per layer. There are only 323 inputs and 12 output classes.

Training one model takes around 40 seconds on average on my Intel 4770K (32 GB RAM). During training, H2O maxes out all CPU cores.

To try to speed up the training, I set up an EC2 instance in the Amazon cloud. I tried the largest compute-optimized instance (c4.8xlarge), but the speedup was minimal: it took around 24 seconds to train one model with the same settings. I therefore suspect there's something I've overlooked. I started H2O like this:

localH2O <- h2o.init(ip = 'localhost', port = 54321, max_mem_size = '24G', nthreads=-1)

To compare the processors: the 4770K scores 10163 on the CPU benchmark, while the Intel Xeon E5-2666 v3 scores 24804 (with 36 vCPUs).

This speedup is disappointing, to say the least, and is not worth all the extra work of installing and setting everything up in the Amazon cloud while paying over $2/hour.

Is there something else that needs to be done to get all cores working, besides setting nthreads=-1?

Do I need to start making several clusters in order to get the training time down, or should I just start on a new deep learning library that supports GPUs?

Downvoters, please explain why you downvote. - user979899
There are not enough details here to answer, and the question is too broad. - YCR
My blog post on parallel acceleration for DNNs, compared with H2O. - Patric
If the cpumark is reliable, you have 2.4 times more compute, but only got a 1.67x speed-up. I was going to suggest it might be the overhead from 36 cores, but I think it is actually 9 cores (see the cpuinfo halfway down this page: cmips.net/2015/01/14/benchmarking-the-new-amazon-c4-instances ). I'm guessing you have either 4 or 8 cores on your own machine? It might be worth running some CPU benchmarks on both your own machine and the EC2 machine, to see if it is really 2.4 times quicker. - Darren Cook
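
The arithmetic behind that last comment can be sketched in a few lines of base R. The timings and benchmark scores are the ones quoted in the question; the variable names are made up for illustration, and the benchmark ratio is only a rough proxy for what a real workload would see.

```r
# Timings and benchmark scores quoted in the question.
local_secs  <- 40     # Intel 4770K, seconds per model
ec2_secs    <- 24     # c4.8xlarge, seconds per model
local_score <- 10163  # CPU benchmark score, 4770K
ec2_score   <- 24804  # CPU benchmark score, Xeon E5-2666 v3

# Speed-up actually observed vs. the one the benchmark ratio suggests.
observed_speedup <- local_secs / ec2_secs            # about 1.67
expected_speedup <- ec2_score / local_score          # about 2.44
efficiency       <- observed_speedup / expected_speedup

cat(sprintf("observed %.2fx, expected %.2fx, efficiency %.0f%%\n",
            observed_speedup, expected_speedup, 100 * efficiency))
```

So the EC2 run reaches roughly two thirds of what the benchmark scores alone would predict, which is why a direct CPU benchmark on both machines is a sensible next step.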

1 Answer

To directly answer your question: no, H2O is not supposed to be slow. :-) It looks like you have a decent PC, and the Amazon instances, despite having more vCPUs, do not use the fastest processors (like those you would find in a gaming PC). The base / max turbo frequency of your PC's processor is 3.5 GHz / 3.9 GHz, while the c4.8xlarge's is only 2.9 GHz / 3.5 GHz.

I'm not sure this is necessary, but since the c4.8xlarge instances have 60 GB of RAM, you could increase max_mem_size from '24G' to at least '32G' (which is what your PC has), or even more. I'm not sure it will make a difference, since memory is not usually the limiting factor, but it may be worth a try.
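
As a rough sketch of that suggestion, something like the following could be run on the EC2 instance. The helper name `start_h2o_ec2` and the '48G' value are assumptions for illustration (anything that leaves headroom below the instance's 60 GB would do); `h2o.init` and `h2o.clusterInfo` are the regular functions from the h2o R package.

```r
# Sketch only: start_h2o_ec2 is a hypothetical helper, and "48G" is an
# assumed value that leaves headroom below the instance's 60 GB of RAM.
start_h2o_ec2 <- function(max_mem = "48G") {
  # Same call as in the question, with a larger heap; nthreads = -1
  # asks H2O to use all available cores.
  h2o::h2o.init(ip = "localhost", port = 54321,
                max_mem_size = max_mem, nthreads = -1)
  # Prints the cluster status, including total and allowed cores, so
  # you can check how many cores H2O actually sees on the instance.
  h2o::h2o.clusterInfo()
}
```

Calling `start_h2o_ec2()` and comparing the reported core counts with the 36 vCPUs you expect would show whether H2O is actually seeing all of them.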

Also, if you are concerned about EC2 price, maybe look into spot instances instead. If you require additional real speedup, you should consider using multiple nodes in your EC2 H2O cluster, rather than a single node.