
It seems that Google Cloud Storage can take in a 500 MB upload in about 3 seconds (parallel upload) to 6 seconds (single-threaded upload), per the second graph at this link: https://cloud.google.com/blog/products/gcp/optimizing-your-cloud-storage-performance-google-cloud-performance-atlas

This translates, for the single-threaded upload case, to 500 MB * 8 bits/byte / 6 seconds ≈ 0.67 Gb/s.

However, I'm confused: if my computer (in this case, a Google VM with 16 cores) can push anywhere from ~10 Gb/s (sustained) to ~16 Gb/s (peak), can I upload multiple files (say ~20) in parallel, max out my machine's bandwidth (i.e. ~10 to ~16 Gb/s), and have Google Cloud Storage receive my data at that rate? (For simplicity, let's assume the object names for these 20 objects are all nicely distinct, so we're not running into the "name contention" problem that Google suggests we avoid. And, if it makes a difference, the object blob sizes are reasonably large, e.g. > 1 MB, so we don't need to consider per-request transmission overheads, etc.)
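For illustration only, here is a minimal sketch of what those ~20 parallel uploads might look like with the google-cloud-storage Python client; the bucket name, file names, and worker count are made-up placeholders, not anything from the question:

    # Sketch: upload ~20 distinct objects in parallel.
    # Bucket name and file list are hypothetical placeholders.
    from concurrent.futures import ThreadPoolExecutor
    from google.cloud import storage

    BUCKET_NAME = "my-test-bucket"                      # placeholder
    FILES = ["blob_%02d.bin" % i for i in range(20)]    # 20 distinct names

    def upload_one(path):
        # One client per worker, so uploads do not share a single HTTP connection.
        client = storage.Client()
        blob = client.bucket(BUCKET_NAME).blob(path)    # object name == local file name
        blob.upload_from_filename(path)
        return path

    with ThreadPoolExecutor(max_workers=20) as pool:
        for name in pool.map(upload_one, FILES):
            print("uploaded", name)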

If the answer is no, I'm guessing that Google Cloud Storage (and presumably S3 as well) places some maximum bandwidth limit on a single bucket? Or at least limits the total bandwidth originating from a single MAC address? (If so, what happens if the upload requests originate from different GCE VMs, with different MAC addresses, IPs, etc.? Would the aggregate bandwidth still be limited to under 2 Gb/s? Is there anything we can do to lift this limit, if one truly exists?)


1 Answer


GCS does have some API rate limits [1,2], but ultimately, with the right usage patterns, GCS can scale very high. At some point you will hit infrastructural limits, and at that point you need to talk to Google's sales team to plan the capacity you will need.

However, that point is far, far above what a single VM is capable of. You should be able to saturate a single VM's network.
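As a rough way to check whether you are approaching the VM's egress cap, you could time a batch of parallel uploads and convert to Gb/s. This sketch reuses the hypothetical upload_one() and FILES from the example above and ignores TCP ramp-up and client-side overhead, so it is only a ballpark measurement:

    import os, time
    from concurrent.futures import ThreadPoolExecutor

    # Assumes upload_one() and FILES from the earlier sketch are defined.
    total_bytes = sum(os.path.getsize(f) for f in FILES)
    start = time.time()
    with ThreadPoolExecutor(max_workers=20) as pool:
        list(pool.map(upload_one, FILES))
    elapsed = time.time() - start
    print("~%.2f Gb/s achieved" % (total_bytes * 8 / elapsed / 1e9))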