I have trained three UNet models in Keras for image segmentation to assess the effect of multi-GPU training.
- The first model was trained with a batch size of 1 on 1 GPU (P100). Each training step took ~254 ms (note: per step, not per epoch).
- The second model was trained with a batch size of 2 on 1 GPU (P100). Each training step took ~399 ms.
- The third model was trained with a batch size of 2 on 2 GPUs (P100). Each training step took ~370 ms. Logically this should have taken about the same time as the first case, since each GPU processes a sub-batch of 1 in parallel, but it took longer.
Can anyone tell me whether multi-GPU training actually reduces training time? For reference, all models were trained with Keras; a sketch of my multi-GPU setup follows below.
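For context, my multi-GPU run was set up along these lines. This is a minimal sketch assuming `tf.distribute.MirroredStrategy` (TensorFlow/Keras's synchronous data-parallel strategy); the tiny model here is just a placeholder for the actual UNet, and the random data only exists to make the snippet runnable:

```python
import numpy as np
import tensorflow as tf

# MirroredStrategy replicates the model onto all visible GPUs.
strategy = tf.distribute.MirroredStrategy()
print("Number of replicas:", strategy.num_replicas_in_sync)

with strategy.scope():
    # Model and optimizer must be created inside the strategy scope
    # so their variables are mirrored across GPUs.
    # Placeholder model standing in for the real UNet.
    inputs = tf.keras.Input(shape=(128, 128, 3))
    x = tf.keras.layers.Conv2D(16, 3, padding="same", activation="relu")(inputs)
    outputs = tf.keras.layers.Conv2D(1, 1, activation="sigmoid")(x)
    model = tf.keras.Model(inputs, outputs)
    model.compile(optimizer="adam", loss="binary_crossentropy")

# Random data purely for illustration.
x_train = np.random.rand(16, 128, 128, 3).astype("float32")
y_train = np.random.rand(16, 128, 128, 1).astype("float32")

# batch_size here is the *global* batch size: with 2 GPUs,
# each replica processes a per-GPU batch of 1.
model.fit(x_train, y_train, batch_size=2, epochs=1)
```

Note that with this strategy the global batch of 2 is split across the replicas, and gradients are aggregated across GPUs on every step, so the measured step time includes that cross-GPU synchronization.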