I am running a Java application with Elastic Beanstalk.
For some reason I cannot handle more than about 25000 requests per minute (around 450 reqs/second),
regardless of whether I add more or bigger instances.
I am running my tests from several JMeter clients simultaneously from around the globe (Ireland, Oregon, Tokyo, and my local machine in Poland), each with 100 threads.
In the CloudWatch metrics I can see strange behaviour:
Metric on ELB with strange "throttle" behaviour:

When I connected via SSH to the app server instances and checked network and CPU usage, it looked as if no requests are passed to any app server once the "limit" is reached. No CPU usage, no network transfer (checked with nload). The app server instances are simply idle during that time (up to a minute).
CPU usage on all instances is far from the maximum load they can handle. I also tried a single instance (without Beanstalk) and reached around 45000 requests/minute (750 r/s).
At first I suspected that my test was wrong and that the ELB was routing all requests to only one instance (due to DNS resolution), so I manually configured an ELB (without Beanstalk) and manually attached several application servers to it. It distributed the load perfectly: I reached around 60000 requests/minute, at which point my test DB on an m3.medium started to become the bottleneck. That was OK and I expected it. Result from CloudWatch:
Same application deployed manually:

This assures me that my test is OK and that the ELB itself is not the problem.
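The DNS suspicion mentioned above can also be checked directly from a test client. The following is only a minimal sketch, not something from my actual test setup: the class name `ElbDnsCheck` and the host `my-elb.example.com` are hypothetical placeholders, and the real ELB DNS name would be passed as the first argument. It lists every address the local resolver returns for the load balancer.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.List;

public class ElbDnsCheck {

    // Returns every IP address the local resolver reports for the host,
    // or an empty list if the name cannot be resolved.
    static List<String> resolveAll(String host) {
        List<String> addresses = new ArrayList<>();
        try {
            for (InetAddress address : InetAddress.getAllByName(host)) {
                addresses.add(address.getHostAddress());
            }
        } catch (UnknownHostException e) {
            // Unresolvable name: report nothing rather than failing.
        }
        return addresses;
    }

    public static void main(String[] args) {
        // Hypothetical placeholder; pass the real ELB DNS name as the first argument.
        String host = args.length > 0 ? args[0] : "my-elb.example.com";
        List<String> addresses = resolveAll(host);
        System.out.println(host + " resolves to " + addresses.size() + " address(es):");
        for (String address : addresses) {
            System.out.println("  " + address);
        }
    }
}
```

If this prints only a single address on a load-generating machine, that client is sending all of its traffic to one ELB node. Note also that the JVM caches DNS lookups, so a long-running Java client such as JMeter may keep using addresses it resolved at startup.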
So, my questions are:
1. Does Elastic Beanstalk apply any additional limit on the ELB, or on other services it manages, that caps the requests/minute load?
2. Do I somehow need to "unlock" the ability to process more requests via Beanstalk?
My colleague suggested that it looks like some kind of "anti-DDoS system", but I couldn't find anything related to that.