1 vote

I am going to use AWS SQS (a standard queue, not FIFO) to process different client-side metrics.

I expect to have ~400 messages per second (worst case). Each SQS message will contain the S3 location of a file.

I created an application that listens to my SQS queue and processes messages from it.

By process I mean:

  • read the SQS message
  • take the S3 location from that message
  • call the S3 client
  • read the file
  • add a few additional fields
  • publish the data from the file to AWS Kinesis Firehose

The same process applies to each SQS message in the queue. The S3 files are small, less than 0.5 KB each.
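
For reference, here is a minimal sketch of that loop in Python with boto3. The queue URL, stream name, added field, and message-body format are placeholders for illustration, not my real values:

```python
# Minimal sketch of the loop above (boto3). Queue URL, stream name, and
# the message-body format are placeholders.
import json
import boto3

sqs = boto3.client("sqs")
s3 = boto3.client("s3")
firehose = boto3.client("firehose")

QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/metrics-queue"
STREAM_NAME = "client-metrics"

while True:
    resp = sqs.receive_message(
        QueueUrl=QUEUE_URL,
        MaxNumberOfMessages=10,  # receive in batches of up to 10
        WaitTimeSeconds=20,      # long polling
    )
    for msg in resp.get("Messages", []):
        # Assumes the message body is JSON like {"bucket": "...", "key": "..."}
        loc = json.loads(msg["Body"])
        obj = s3.get_object(Bucket=loc["bucket"], Key=loc["key"])
        record = json.loads(obj["Body"].read())
        record["source"] = "client-metrics-worker"  # the "few additional fields"
        firehose.put_record(
            DeliveryStreamName=STREAM_NAME,
            Record={"Data": (json.dumps(record) + "\n").encode()},
        )
        sqs.delete_message(QueueUrl=QUEUE_URL, ReceiptHandle=msg["ReceiptHandle"])
```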


How can I calculate whether I will be able to process those 400 messages per second? How can I estimate whether my solution would handle a 5x increase in data?


2 Answers

3 votes

How can I calculate whether I will be able to process those 400 messages per second? How can I estimate whether my solution would handle a 5x increase in data?

Test it! Start at a small scale and do the math to extrapolate from there. Make your test environment as close to production as is feasible.

  • On a single host and single thread, the math is simple:
    • 1000 / AvgTotalTimeMillis = AvgMessagesPerSecond, or
    • 1000 / AvgMessagesPerSecond = AvgTotalTimeMillis
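
To make the extrapolation concrete, here is the same math as a tiny script, with an assumed 20 ms average per message (measure your own number):

```python
# Worked example of the formulas above. The 20 ms figure is an assumption.
avg_total_time_ms = 20.0                        # measured average per message
per_thread_rate = 1000 / avg_total_time_ms      # = 50 messages/second/thread
target_rate = 400                               # from the question
threads_needed = target_rate / per_thread_rate  # = 8 threads, plus headroom
print(per_thread_rate, threads_needed)          # 50.0 8.0
```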

How to approach testing this:

  • Start with a single thread and host, and generate some timing metrics for each step that you outlined, along with a total time (a minimal instrumentation sketch follows this list).

    • Figure out your average/max/min times, and how many messages per second they translate to.
    • 400 messages per second on a single thread and host would mean under 3 ms per message (1000 / 400 = 2.5 ms). Hopefully this makes it obvious that you need multiple threads/hosts.
  • Scale up!

    • Now that you know how much a single thread can handle, figure out how many threads a single host can effectively handle (you'll need to experiment). Consider batching messages where possible; SQS provides batch operations, e.g. receiving up to 10 messages per call.
    • Use math to calculate how many hosts you need
    • If you need 5x that number, go up from there
  • While you're doing this math, consider any limits of the systems you're using:

    • Review the throttling limits of SQS / S3 / Firehose / etc. If you plan to use Lambda to do the work instead of EC2, it has limits too. Make sure you're within those limits, and consider contacting AWS support if you are close to exceeding them.
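
Here is one way the per-step timing could be collected. The wrapped functions are hypothetical stand-ins for your own code:

```python
# Sketch of per-step timing for the single-thread baseline. The wrapped
# functions (receive_one, read_s3_file, publish_record) are hypothetical.
import time
import statistics
from collections import defaultdict

timings = defaultdict(list)

def timed(step, fn, *args, **kwargs):
    """Run fn and record its wall-clock time in milliseconds under `step`."""
    start = time.monotonic()
    result = fn(*args, **kwargs)
    timings[step].append((time.monotonic() - start) * 1000)
    return result

# In the worker loop, wrap each stage, e.g.:
#   msg    = timed("sqs_receive",  receive_one)
#   record = timed("s3_read",      read_s3_file, msg)
#   timed("firehose_put", publish_record, record)

def report():
    for step, samples in timings.items():
        avg = statistics.mean(samples)
        print(f"{step}: avg={avg:.1f} ms, min={min(samples):.1f} ms, "
              f"max={max(samples):.1f} ms (~{1000 / avg:.0f} msg/s serial)")
```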

A few other suggestions based on my experience:

  • Based on your workflow outline and details, on EC2 you can probably handle a decent number of threads per host
  • An m5.large should be more than enough; you can probably go smaller, since the performance bottleneck will likely be network I/O for fetching and sending messages.
  • Consider using Auto Scaling to handle message spikes when you need to increase throughput, though keep in mind that it can take several minutes to kick in. A possible queue-depth scaling policy is sketched below.
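
As a sketch of that last point, assuming the workers run in an EC2 Auto Scaling group: a target-tracking policy can scale on SQS queue depth. The group and queue names and the target value are placeholders (AWS's recommended variant tracks backlog per instance, i.e. visible messages divided by instance count):

```python
# Hedged sketch: scale an EC2 Auto Scaling group on SQS queue depth.
import boto3

autoscaling = boto3.client("autoscaling")

autoscaling.put_scaling_policy(
    AutoScalingGroupName="metrics-workers",   # placeholder group name
    PolicyName="scale-on-queue-depth",
    PolicyType="TargetTrackingScaling",
    TargetTrackingConfiguration={
        "CustomizedMetricSpecification": {
            "MetricName": "ApproximateNumberOfMessagesVisible",
            "Namespace": "AWS/SQS",
            "Dimensions": [{"Name": "QueueName", "Value": "metrics-queue"}],  # placeholder
            "Statistic": "Average",
        },
        # Assumed target backlog; derive it from your measured per-host throughput.
        "TargetValue": 100.0,
    },
)
```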

1 vote

The only way to determine this is to create a test environment that mirrors your scenario.

If your solution is designed to handle messages in parallel, it should be possible to scale up your system to handle virtually any workload.

A good architecture would be to use AWS Lambda functions to process the messages. Lambda defaults to 1,000 concurrent executions. So if each function takes 3 seconds to run, that supports roughly 333 messages per second sustained (1000 / 3). You can request an increase to the Lambda concurrency limit to handle higher workloads.
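
That estimate is just arrival rate times average duration (Little's law). Checking it against the question's numbers:

```python
# Required concurrency = arrival rate x average duration (Little's law).
# The 3-second duration is this answer's illustrative figure, not a measurement.
rate_per_s = 400 * 5   # messages/second at the question's 5x worst case
duration_s = 3.0       # assumed seconds per invocation
print(rate_per_s * duration_s)  # 6000.0 -> above the default 1000, so you
                                # would need a limit increase (or faster code)
```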

If you are using Amazon EC2 instead of Lambda functions, then it is just a matter of scaling out: add more EC2 instances with more workers to handle whatever workload you need.