I have several Java projects running in Docker containers managed by Kubernetes. I want to enable the CPU-based Horizontal Pod Autoscaling (HPA) provided by Kubernetes, but I find it hard to deal with the initial CPU spikes caused by the JVM when the container is initialising.
I currently have not set a CPU limit in the Kubernetes YAML files for any of the projects, which basically means that I let the pods take as much CPU from the environment as they can (I know it's bad practice, but it lets me boot JVM pods in less than 30 seconds).
The problem this creates is that during the first 3-4 minutes after pod creation, the CPU usage spikes so much that if I have an autoscale rule set, the spike triggers it. The autoscaled pod then spins up, causes the same spike, and re-triggers the autoscale until the maximum number of pods is reached and things settle down.
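To make the cascade concrete, here is a rough sketch of it in Python. It uses the scaling rule from the Kubernetes HPA documentation, desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue), and deliberately ignores the tolerance band and scale-down behaviour. The function names and all the utilization numbers (40% steady, 400% during startup, 80% target, 10 max replicas) are made up for illustration, not measured from my cluster.

```python
import math


def desired_replicas(current_replicas, current_utilization, target_utilization, max_replicas):
    """HPA scaling rule: ceil(current * currentMetric / targetMetric), capped at max_replicas."""
    raw = math.ceil(current_replicas * current_utilization / target_utilization)
    return min(raw, max_replicas)


def simulate_cascade(start_replicas, steady_util, startup_util, target_util, max_replicas):
    """Simplified model: in each round, the pods added in the previous round
    are still warming up and report startup_util; older pods report steady_util."""
    history = [start_replicas]
    replicas = start_replicas
    warming = start_replicas  # every initial pod is still starting up
    while True:
        warm = replicas - warming
        # Average utilization across all pods, as a percentage of the CPU request.
        average = (warm * steady_util + warming * startup_util) / replicas
        new_replicas = desired_replicas(replicas, average, target_util, max_replicas)
        if new_replicas <= replicas:
            break  # no further scale-up is triggered
        warming = new_replicas - replicas
        replicas = new_replicas
        history.append(replicas)
    return history


# Hypothetical numbers: 40% steady, 400% during JVM startup, 80% target, 10 pods max.
print(simulate_cascade(1, 40, 400, 80, 10))  # [1, 5, 10]
```

With these assumed numbers a single starting pod is enough to drive the deployment to its maximum in two scaling rounds, which matches what I am seeing.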
I tried setting a CPU limit in the Kubernetes YAML file, but my projects do not need much CPU once they are running, so setting the limit to a reasonable (non-overkill) value makes my pods take more than 5 minutes to start, which is unacceptable.
I could also increase the autoscale delay to more than 10 minutes, but it's a global setting that would also affect deployments which I need to scale very quickly, so that is not a viable option for me either.
This is an example CPU and memory configuration for one of my pods:
env:
resources:
  requests:
    memory: "1300Mi"
    cpu: "250m"
  limits:
    memory: "1536Mi"
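For context on why the spike looks so large to the autoscaler: the HPA CPU utilization metric is expressed as a percentage of the pod's CPU request, not of the node's capacity. A small sketch of that arithmetic against the 250m request above follows. The helper names are mine, and the usage figures (one full core during startup, 100m when idle) are assumptions for illustration only.

```python
def parse_cpu_millicores(quantity: str) -> int:
    """Convert a Kubernetes CPU quantity such as "250m" or "1.5" to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return round(float(quantity) * 1000)


def cpu_utilization_percent(usage: str, request: str) -> float:
    """Utilization as the HPA sees it: usage divided by the CPU request, in percent."""
    return 100 * parse_cpu_millicores(usage) / parse_cpu_millicores(request)


# Assumed usage values; the 250m request comes from the configuration above.
print(cpu_utilization_percent("1", "250m"))     # 400.0 (one full core during JVM startup)
print(cpu_utilization_percent("100m", "250m"))  # 40.0 (steady state)
```

Because there is no CPU limit, nothing stops a starting JVM from using several times its request, so the reported utilization can be far above any sensible target.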
I also migrated to Java 10 recently which is supposed to be optimised for containerisation. Any advice or comment will be much appreciated. Thanks in advance.
Edit:
I could also set up HPA based on custom Prometheus metrics like http_requests, but that option would be harder to maintain, since there are lots of variables that can affect the number of requests a pod can handle.