Hello Bobby_H and welcome to Stack Overflow!
When using the Nginx Ingress Controller on Kubernetes, you can set up rate limits with these annotations:
- `nginx.ingress.kubernetes.io/limit-connections`: number of concurrent connections allowed from a single IP address. A 503 error is returned when this limit is exceeded.
- `nginx.ingress.kubernetes.io/limit-rps`: number of requests accepted from a given IP each second. The burst limit is set to this limit multiplied by the burst multiplier (default: 5). When clients exceed this limit, the status code configured in `limit-req-status-code` (default: 503) is returned.
- `nginx.ingress.kubernetes.io/limit-rpm`: number of requests accepted from a given IP each minute. The burst limit is set to this limit multiplied by the burst multiplier (default: 5). When clients exceed this limit, the status code configured in `limit-req-status-code` (default: 503) is returned.
- `nginx.ingress.kubernetes.io/limit-burst-multiplier`: multiplier of the limit rate for the burst size. The default burst multiplier is 5; this annotation overrides the default. When clients exceed this limit, the status code configured in `limit-req-status-code` (default: 503) is returned.
- `nginx.ingress.kubernetes.io/limit-rate-after`: initial number of kilobytes after which further transmission of a response to a given connection is rate limited. This feature must be used with proxy buffering enabled.
- `nginx.ingress.kubernetes.io/limit-rate`: number of kilobytes per second allowed to be sent to a given connection. A value of zero disables rate limiting. This feature must be used with proxy buffering enabled.
- `nginx.ingress.kubernetes.io/limit-whitelist`: client IP source ranges to be excluded from rate limiting, given as a comma-separated list of CIDRs.
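For instance, the annotations can be applied to an Ingress resource like this (the Ingress name, host, and backend service here are hypothetical; the limit values match the `nginx.conf` sample further below):

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: hello-world                # hypothetical name
  annotations:
    nginx.ingress.kubernetes.io/limit-rps: "5"
    nginx.ingress.kubernetes.io/limit-rpm: "300"
    nginx.ingress.kubernetes.io/limit-burst-multiplier: "5"
    nginx.ingress.kubernetes.io/limit-whitelist: "10.0.0.0/8"
spec:
  ingressClassName: nginx
  rules:
  - host: hello.example.com        # hypothetical host
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: hello-world      # hypothetical service
            port:
              number: 80
```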
Nginx implements the leaky bucket algorithm: incoming requests are buffered in a FIFO queue and then consumed at a limited rate. The burst value defines the size of the queue, which allows requests exceeding the base limit to still be served. When the queue becomes full, subsequent requests are rejected with the error code above.
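To make the mechanism concrete, here is a minimal Python sketch of leaky-bucket behaviour (the `LeakyBucket` class is purely illustrative, not part of Nginx; the real implementation works at sub-second granularity):

```python
from collections import deque


class LeakyBucket:
    """Illustrative leaky bucket: requests queue up to `burst` slots
    and drain (are served) at `rate` requests per second."""

    def __init__(self, rate, burst):
        self.rate = rate      # requests drained per second
        self.burst = burst    # queue capacity (burst size)
        self.queue = deque()
        self.last = 0.0       # time of last drain

    def allow(self, now):
        # Drain requests that have "leaked" out since the last check.
        leaked = int((now - self.last) * self.rate)
        if leaked:
            for _ in range(min(leaked, len(self.queue))):
                self.queue.popleft()
            self.last = now
        # Accept if there is still room in the queue, otherwise reject.
        if len(self.queue) < self.burst:
            self.queue.append(now)
            return True       # served (possibly with a delay)
        return False          # queue full -> rejected (503)


bucket = LeakyBucket(rate=5, burst=25)
# At t=0 the first 25 requests fill the queue; the 26th is rejected.
results = [bucket.allow(0.0) for _ in range(26)]
print(results[24], results[25])  # -> True False
```

With `nodelay` (as in the generated config below), queued requests are served immediately rather than being spaced out, but the accounting is the same.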
Here you will find all important parameters to configure your rate limiting.
The number of expected successful requests can be calculated like this:
successful requests = (period * rate + burst) * nginx replica
It is important to note that the number of Nginx replicas also multiplies the number of successful requests. Note as well that the Nginx Ingress Controller sets the burst value to 5 times the limit by default. You can check these parameters in `nginx.conf` after setting up your desired annotations. For example:
limit_req_zone $limit_cmRfaW5ncmVzcy1yZC1oZWxsby1sZWdhY3k zone=ingress-hello-world_rps:5m rate=5r/s;
limit_req zone=ingress-hello-world_rps burst=25 nodelay;
limit_req_zone $limit_cmRfaW5ncmVzcy1yZC1oZWxsby1sZWdhY3k zone=ingress-hello-world_rpm:5m rate=300r/m;
limit_req zone=ingress-hello-world_rpm burst=1500 nodelay;
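The formula above can be checked against the values from this config (rate = 5 r/s, burst = 25); the 10-second window and the 2 replicas are assumed here purely for illustration:

```python
def successful_requests(period_s, rate_rps, burst, replicas):
    """Expected requests served before rejections begin:
    (period * rate + burst) per replica, multiplied across replicas."""
    return (period_s * rate_rps + burst) * replicas


# rate=5 r/s and burst=25 match the nginx.conf sample above;
# period of 10 s and 2 replicas are illustrative assumptions.
print(successful_requests(10, 5, 25, 2))  # -> 150
```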
There are two limitations that I would also like to underline:
Requests are counted by client IP, which might not be accurate, or might not fit your business needs, such as rate limiting by user identity.
Options like burst and delay are not configurable.
I strongly recommend going through the sources below as well for a more in-depth explanation of this topic: