I have trained a custom Google Cloud Vision model using AutoML. The purpose of this model is to assign a single label to a given image.
I have implemented a client that sends HTTP prediction requests to their REST API. This works, but it takes 13 seconds to get a response, which seems extremely slow to me. I am fairly sure the delay is on Google's side, since I timed the method calls: uploading the raw image data could take some time, but sending the same image to their pre-trained Cloud Vision model is a lot faster.
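To make the timing claim concrete, here is a minimal sketch of how the measurement can be split into local work and the network round trip. It is not my actual client: `time_prediction` and the `send_request` callable are hypothetical names, and `send_request` is assumed to wrap whatever code performs the HTTP POST to the prediction endpoint.

```python
import base64
import time
from typing import Callable, Dict


def time_prediction(
    image_bytes: bytes,
    send_request: Callable[[str], object],
) -> Dict[str, float]:
    """Return the seconds spent encoding the image and waiting on the request.

    send_request is a hypothetical callable that takes the base64 payload
    and performs the real HTTP POST; it is injected so that this helper
    only measures time and knows nothing about the API itself.
    """
    t0 = time.perf_counter()
    # Local work: base64-encode the raw image bytes for the JSON body.
    payload = base64.b64encode(image_bytes).decode("ascii")
    t1 = time.perf_counter()
    # Remote work: upload, server-side prediction, and response download.
    send_request(payload)
    t2 = time.perf_counter()
    return {"encode_s": t1 - t0, "request_s": t2 - t1}
```

If `request_s` dominates for the custom model but not when the same image goes to the pre-trained endpoint, the extra time is being spent on the server rather than in the upload.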
Has anyone else run into this problem and found a solution? Or is it better to just train my own model using TensorFlow/PyTorch with transfer learning on e.g. ImageNet and build an API around that?