
I have a problem when trying to use Keras with three GPUs.

My pseudocode is as follows:

import keras
import keras.models as M
from keras.utils import multi_gpu_model 

i = M.Input(shape=(None, None, 6))
o1,o2,o3 = my_Network(i)

net = M.Model(inputs = i, outputs = [o1,o2,o3])
net = multi_gpu_model(net,gpus = 3) 

net.compile( ~~~~~ ) 
net.fit(~~~~~ ) 

The code trains my network; however, only one GPU is utilised.

My configuration is as follows:

keras : 2.3.1

tensorflow : 2.1.0

Cuda : 10.0

windows : 10

GPU : Tesla 100 x 3 (VRAM : 32GB x 3 )

What is the mistake?

I'm not an expert on multi-GPU usage, but I checked here and it seems you need to build your model on the CPU before calling multi_gpu_model(net, gpus = 3). - Augusto Maillo
Actually, you just need to define your model once again inside with tf.device(......): - Augusto Maillo
Then should I write the code like the following? with tf.device('/gpu:0', '/gpu:1', '/gpu:2'): ~~~~~~~~~~ - MooNChilD Song
No, use tf.device("/cpu:0") and then declare your model. It seems you must do this because the CPU device combines the information from all the GPU computations and then joins them, and to do that the CPU needs a "copy" of your model. - Augusto Maillo
Oh, thanks for your advice. Then should I declare my model under tf.device("/cpu:0"): using multi_gpu_model, and also fit the model under tf.device("/cpu:0")? - MooNChilD Song
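
The comments above suggest building the model on the CPU first so that multi_gpu_model has a "master" copy to replicate. A minimal sketch of that pattern, using tf.keras and placeholder Conv2D layers standing in for the unshown my_Network, might look like:

import tensorflow as tf
from tensorflow.keras import layers, Model

# Build the template model on the CPU so it can act as the master copy
# that multi_gpu_model replicates onto each GPU.
with tf.device("/cpu:0"):
    inp = layers.Input(shape=(None, None, 6))
    x = layers.Conv2D(8, 3, padding="same")(inp)  # placeholder; stands in for my_Network
    o1 = layers.Conv2D(1, 1)(x)
    o2 = layers.Conv2D(1, 1)(x)
    o3 = layers.Conv2D(1, 1)(x)
    net = Model(inputs=inp, outputs=[o1, o2, o3])

# With Keras 2.3 / TF 2.1 (as in the question), the CPU copy would then be
# wrapped for the three GPUs:
#   from keras.utils import multi_gpu_model
#   parallel_net = multi_gpu_model(net, gpus=3)

print(len(net.outputs))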

2 Answers


I solved my problem with the following code:

strategy = tf.distribute.MirroredStrategy(devices=["/gpu:0", "/gpu:1", "/gpu:2"])
with strategy.scope():
    epsnet = M.Model(inputs=[img_in, img_lv], outputs=[out_d, out_s, out_l])
    epsnet = multi_gpu_model(epsnet, gpus=3)

I hope this gives you some inspiration. Thanks to everyone who replied.
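
Note that in TF 2.x, tf.distribute.MirroredStrategy handles the replication by itself, so the multi_gpu_model call inside the scope is arguably redundant. A strategy-only sketch, using tf.keras with placeholder layers since the original img_in/out_d layers aren't shown, would be:

import tensorflow as tf
from tensorflow.keras import layers, Model

# MirroredStrategy replicates the model onto every visible GPU on its own;
# with no GPUs available it falls back to the CPU, so this sketch runs anywhere.
strategy = tf.distribute.MirroredStrategy()

with strategy.scope():
    inp = layers.Input(shape=(None, None, 6))
    x = layers.Conv2D(8, 3, padding="same")(inp)  # placeholder for the real network
    outputs = [layers.Conv2D(1, 1)(x) for _ in range(3)]
    net = Model(inputs=inp, outputs=outputs)
    net.compile(optimizer="adam", loss="mse")

print(strategy.num_replicas_in_sync)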


Something you must consider is the batch size passed to fit. You haven't shown it here, but you need to make sure the batch size is divisible by 3 so the work can be parallelised across your 3 GPUs. If you give it a batch size of 1, for example, the training cannot be distributed across the GPUs.
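
For example, the global batch is split evenly across replicas; this small pure-Python helper (a hypothetical illustration, not a Keras API) shows why an indivisible batch size leaves GPUs idle:

def per_gpu_batches(global_batch_size, num_gpus):
    """Split a global batch across GPUs the way a naive data-parallel
    setup would: each GPU gets an equal share, and any remainder is
    left over (dropped, or leaving some GPUs with nothing to do)."""
    per_gpu = global_batch_size // num_gpus
    remainder = global_batch_size % num_gpus
    return per_gpu, remainder

# A batch of 30 divides cleanly over 3 GPUs: 10 samples each, nothing left over.
print(per_gpu_batches(30, 3))  # (10, 0)

# A batch of 1 cannot be split over 3 GPUs: two of them sit idle.
print(per_gpu_batches(1, 3))   # (0, 1)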

You did not provide much information, but based on your use of multi_gpu_model, I don't see anything clearly wrong.