2
votes

The AWS S3 documentation states (https://docs.aws.amazon.com/AmazonS3/latest/dev/request-rate-perf-considerations.html):

Amazon S3 automatically scales to high request rates. For example, your application can achieve at least 3,500 PUT/POST/DELETE and 5,500 GET requests per second per prefix in a bucket.

To test this, I have the following Node.js code (using the aws-sdk) which asynchronously initiates 1,000 uploads of zero bytes (hence simply adding empty entries to the bucket). A timer measures the throughput:

var AWS = require('aws-sdk')
var uuid = require('uuid').v4
var bucket = process.env.BUCKET   // target bucket name supplied externally

var t0 = new Date().getTime()
for (let i = 0; i < 1000; i++) {
  // a fresh client per request, as in the original test
  const s3 = new AWS.S3()
  const id = uuid()   // const so the callback below logs the correct id
  console.log('Uploading ' + id)
  s3.upload({
      Bucket: bucket,
      Body: '',
      Key: 'test/' + id
    },
    function (err, data) {
      if (data) console.log('Uploaded ' + id + ' ' + (new Date().getTime() - t0))
      else console.log('Error', err)
    })
}

It takes approximately 25 seconds to complete all of the upload requests. That is obviously nowhere near the purported 3,500 requests per second; it works out to roughly 40 requests per second.

I have approximately 1 MB/s of network upload bandwidth, and the network stats show that for most of the run the bandwidth is only about 25% saturated. CPU utilisation is similarly low.

So the question is:

How can I scale my S3 upload throughput to get anywhere near the 3,500 requests per second that can apparently be achieved?

EDIT:

I modified the code like this:

var t0 = new Date().getTime()
for (let i = 0; i < 1000; i++) {
  const s3 = new AWS.S3()
  // spread the keys across 26 different single-letter prefixes
  const id = String.fromCharCode('a'.charCodeAt(0) + (i % 26)) + uuid()
  console.log('Uploading ' + id)
  s3.upload({
      Bucket: bucket,
      Body: '',
      Key: id
    },
    function (err, data) {
      if (data) console.log('Uploaded ' + id + ' ' + (new Date().getTime() - t0))
      else console.log('Error', err)
    })
}

This uses 26 different prefixes, which the AWS documentation claims should scale the throughput by a factor of 26.

"It is simple to increase your read or write performance exponentially. For example, if you create 10 prefixes in an Amazon S3 bucket to parallelize reads, you could scale your read performance to 55,000 read requests per second."

However, no difference in the throughput is apparent. There is some kind of behavioural difference, in that the requests appear to complete in a more parallel rather than sequential fashion, but the overall completion time is just about the same.

Finally, I tried running the application in 4 separate processes launched from bash (4 processes, 4 cores, 4 x 1,000 requests). Despite the added parallelism from using multiple cores, the total execution time was about 80 seconds, so the throughput did not scale.

for i in {0..3}; do node index.js & done

I wonder if S3 rate-limits individual clients/IPs (although this does not appear to be documented)?

Where is your code running? It would be interesting to see what you get if you run it on an EC2 host in the same region as the S3 endpoint you're calling (so as to rule out any bottlenecks between you and AWS). Also, I notice that you're creating a separate AWS.S3 instance for each request. I'm not really familiar with the Node AWS SDK, but my gut says that is needlessly expensive, and that you could get better performance by creating a single instance, "warming it up" with a synchronous request, and then making the 1000 asynchronous requests that are your actual test. – ruakh
It's running on my local machine on my home network. I can initiate all 1000 requests in about 2 seconds, so I don't think creating the S3 instances is the bottleneck (plus the SDK warns about too many listeners if you use a single instance). It appears to be something on the S3 side. It would be interesting to try it on an EC2 instance, though. – chris
Occasionally, all uploads complete in about 25 s apart from a final handful which take around 70 s to complete. It's as though S3 is re-balancing the indexes or something for those final few. – chris
Your application is single-threaded. This is not a realistic simulation of hundreds or thousands of users accessing S3. Your simulation should run parallel tests from multiple sources to obtain a true measure. Do you actually have an application that requires this throughput, or are you just doing it as an academic exercise? – John Rotenstein
I do have an application. It's an online backup app that will add potentially millions of small objects to S3 directly from a user's machine, so throughput is an issue. Yes, it is single-threaded, but it's also asynchronous, with the vast majority of time spent waiting for network responses, meaning that CPU load is not really an issue. – chris
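
A minimal sketch of the single shared client approach suggested by ruakh above, assuming aws-sdk v2 and the uuid package; the headBucket warm-up call and the environment-supplied bucket name are illustrative choices, not part of the original test:

// Sketch: one shared S3 client for all requests, warmed up before the timed loop.
// Assumes aws-sdk v2 and the uuid package; bucket name comes from the environment.
const AWS = require('aws-sdk')
const uuid = require('uuid').v4

const s3 = new AWS.S3()                  // single shared client
const bucket = process.env.BUCKET        // illustrative: bucket name supplied externally

// Warm-up request so connection/credential setup is not counted in the timing.
s3.headBucket({ Bucket: bucket }).promise()
  .then(() => {
    const t0 = Date.now()
    const uploads = []
    for (let i = 0; i < 1000; i++) {
      const id = uuid()
      uploads.push(
        s3.upload({ Bucket: bucket, Body: '', Key: 'test/' + id }).promise()
          .then(() => console.log('Uploaded ' + id + ' ' + (Date.now() - t0)))
      )
    }
    return Promise.all(uploads)
      .then(() => console.log('All uploads finished in ' + (Date.now() - t0) + ' ms'))
  })
  .catch(err => console.error(err))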

2 Answers

4
votes

I have a few things to mention before I give a straight answer to your question.

First, I did an experiment at one point and achieved 200,000 PUT/DELETE requests in about 25 minutes, which is a little over 130 requests per second. The objects I was uploading were about 10 kB each. (I also had ~125,000 GET requests in the same time span, so I'm sure that if I had only been doing PUTs, I could have achieved even higher PUT throughput.) I achieved this on an m4.4xlarge instance, which has 16 vCPUs and 64 GB of RAM, running in the same AWS region as the S3 bucket.

To get more throughput, use more powerful hardware and minimize the number of network hops and potential bottlenecks between you and S3.

S3 is a distributed system. (Their documentation says the data is replicated to multiple AZs.) It is designed to serve requests from many clients simultaneously (which is why it’s great for hosting static web assets).

Realistically, if you want to test the limits of S3, you need to go distributed too by spinning up a fleet of EC2 instances or running your tests as a Lambda Function.
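
As a rough sketch of the Lambda route (not the exact code from my experiment), a load-generating handler could look something like the following, assuming a Node.js runtime with aws-sdk v2 available and a caller that supplies bucket and count in the invocation event. Invoking many copies of it concurrently spreads the load across machines:

// Rough sketch of a Lambda-based load generator.
// Assumptions: Node.js runtime with aws-sdk v2 available; the invoker passes
// event.bucket and event.count. Invoke many copies concurrently to distribute load.
const AWS = require('aws-sdk')
const s3 = new AWS.S3()

exports.handler = async (event) => {
  const bucket = event.bucket              // assumed: target bucket passed by the caller
  const count = event.count || 1000        // assumed: number of zero-byte PUTs per invocation
  const t0 = Date.now()

  const uploads = []
  for (let i = 0; i < count; i++) {
    uploads.push(
      s3.putObject({ Bucket: bucket, Body: '', Key: 'test/' + t0 + '-' + i }).promise()
    )
  }
  await Promise.all(uploads)

  // Report how many requests completed and how long the batch took.
  return { count: count, millis: Date.now() - t0 }
}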

Edit: S3 does not guarantee the latency of serving your requests. One reason might be that each request can have a different payload size. (A GET request for a 10 B object will be much faster than one for a 10 MB object.)

You keep mentioning the time to serve a request, but that doesn't necessarily correlate with the number of requests per second. S3 can handle thousands of requests per second, but no single consumer laptop or commodity server that I know of can issue thousands of separate network requests per second.

Furthermore, the total execution time is not necessarily indicative of performance, because when you send anything over a network there is always the risk of delays and packet loss. One unlucky request might take a slower path through the network, or might simply experience more packet loss than the others.

You need to carefully define what you want to find out and then carefully determine how to test it correctly.

1
votes

Another thing you should look at is the HTTPS agent used.

It used to be the case (and probably still is) that the AWS SDK uses Node's global HTTPS agent. If you're using an agent that reuses connections, it's probably speaking HTTP/1.1 and probably has pipelining disabled for compatibility reasons.

Take a look with a packet sniffer like Wireshark to check whether multiple outbound connections are being made. If only one connection is being used, you can specify a different agent via httpOptions.
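
With the v2 JavaScript SDK, that looks something like the sketch below; the keepAlive and maxSockets settings are illustrative values, not recommendations:

// Sketch: hand the SDK a keep-alive agent with a larger socket pool (aws-sdk v2).
// keepAlive reuses TCP/TLS connections; maxSockets raises the concurrency ceiling.
// The value 50 is purely illustrative.
const https = require('https')
const AWS = require('aws-sdk')

const agent = new https.Agent({
  keepAlive: true,
  maxSockets: 50
})

const s3 = new AWS.S3({
  httpOptions: { agent: agent }
})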