I am trying to run multi-GPU training with the TensorFlow Object Detection API.
According to nvidia-smi, only 1 GPU is actually being utilized. The other 3 GPUs have the training process loaded onto them, but their memory usage stays at 300 MB and their utilization sits at 0% at all times.
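The per-GPU numbers described above can be captured with a small script instead of watching the nvidia-smi table by hand. The following is only a minimal sketch: it assumes nvidia-smi is on the PATH and that every queried field comes back as a plain integer. The helper names (`parse_gpu_stats`, `idle_gpus`, `read_gpu_stats`) and the 5% threshold are made up for illustration and are not part of TensorFlow or the Object Detection API.

```python
import subprocess

# Fields understood by `nvidia-smi --query-gpu`.
QUERY = "index,utilization.gpu,memory.used"


def parse_gpu_stats(csv_text):
    """Parse `--format=csv,noheader,nounits` output into a list of dicts.

    Assumption: each line looks like "0, 97, 10500" (index, util %, MiB).
    """
    stats = []
    for line in csv_text.strip().splitlines():
        if not line.strip():
            continue
        index, util, mem = (field.strip() for field in line.split(","))
        stats.append(
            {"index": int(index), "util_pct": int(util), "mem_mib": int(mem)}
        )
    return stats


def idle_gpus(stats, util_threshold=5):
    """Return indices of GPUs whose utilization is below the threshold.

    The 5% default is an arbitrary illustrative cutoff.
    """
    return [s["index"] for s in stats if s["util_pct"] < util_threshold]


def read_gpu_stats():
    """Run nvidia-smi once and return the parsed per-GPU stats."""
    result = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_gpu_stats(result.stdout)
```

Calling `idle_gpus(read_gpu_stats())` a few times while training is running gives a list of the GPUs that are sitting idle, which makes the symptom easy to paste into a post.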
I am using the SSD MobileNetV1-based network pretrained on COCO and training it on my custom dataset.
I expect that when I give TensorFlow more GPUs, the framework will actually use all of them to speed up training.