
I'm very new to Kubernetes. We are using a Kubernetes cluster on Google Cloud Platform.

I have created the cluster, services, pods, and replication controllers.

I have also created a Horizontal Pod Autoscaler based on CPU utilization.

Cluster details:

  • Default running node count is set to 3
  • 3 GB allocatable memory per node

After running for about an hour, the services and nodes start showing NodeUnderMemoryPressure issues.

How can I resolve this? If you need any more details, please ask.

Thanks

Can you check the memory consumption of your nodes? How much memory is in use, and how many pods are running on your nodes? Does the error disappear if you scale the number of pods down? – lvthillo

2 Answers

1 vote

I don't know how much traffic is hitting your cluster, but I would highly recommend running Prometheus in it.

Prometheus is an open-source monitoring and alerting tool, and integrates very well with Kubernetes.

This tool should give you a much better view of memory consumption and CPU usage, among many other metrics, and will allow you to effectively troubleshoot these types of issues.
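
If you go this route, a common setup is to let Prometheus discover the kubelets through the Kubernetes API and scrape their cAdvisor endpoint, which exposes per-container memory and CPU metrics. Below is a minimal sketch of such a scrape job, adapted from the example Kubernetes scrape configuration in the Prometheus documentation; it assumes Prometheus runs inside the cluster with a service account that is allowed to proxy to the nodes, and the job name is just a placeholder.

  # Sketch: scrape the kubelets' cAdvisor metrics through the API server proxy.
  scrape_configs:
    - job_name: kubernetes-cadvisor       # placeholder name
      scheme: https
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      kubernetes_sd_configs:
        - role: node                      # one target per cluster node
      relabel_configs:
        # Send the scrape through the API server instead of hitting each kubelet directly.
        - target_label: __address__
          replacement: kubernetes.default.svc:443
        - source_labels: [__meta_kubernetes_node_name]
          regex: (.+)
          target_label: __metrics_path__
          replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor

Once those metrics are flowing, a query on container_memory_working_set_bytes quickly shows which pods are actually consuming the nodes' memory.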

0 votes

There are several ways to address this issue, depending on the type of your workloads.

The easiest is to simply scale your nodes, but that can be useless if there is a memory leak. Even if you are not affected by one now, you should always consider the possibility of a memory leak happening, so the best practice is to always set memory limits for Pods and namespaces.

Scale the cluster

  • If you have many pods running and none of them is much bigger than the others, it would be useful to scale your cluster horizontally; this way the number of running pods per node will decrease and the NodeUnderMemoryPressure warning should disappear.

  • If you are running only a few Pods, or if some of them are able to put the cluster under pressure on their own, then the only option is to scale the nodes vertically: add a new node pool with Compute Engine instances that have more memory, and possibly delete the old one.

  • If your workload is behaving correctly and memory only becomes an issue because at certain moments of the day you receive 100 times the usual traffic and create more pods to support it, you should consider making use of the Cluster Autoscaler.

Check for memory leaks

On the other hand, if it is not a "healthy" situation and you have pods consuming far more RAM than expected, then you should follow the advice of grizzthedj and understand why your Pods are consuming so much. Check whether some of your containers are affected by a memory leak; in that case scaling the amount of RAM is useless, since at some point you will run out of it anyway.

Therefore, start by identifying which Pods are consuming too much and then troubleshoot why they behave that way; if you do not want to make use of Prometheus, simply open a shell in the containers (for example with kubectl exec) and check with the classic Linux commands.

Limit the RAM consumed by Pods

To prevent this from happening in the future, I advise you, when writing your YAML files, to always limit the amount of RAM the Pods can make use of. This way you keep them under control and you can be sure there is no risk that they cause the kubelet (the Kubernetes node agent) to fail because the node runs out of memory.
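
As a rough sketch (all names and values here are placeholders, tune them to what your application actually needs), a memory limit in a Deployment looks like this:

  # Hypothetical Deployment: each replica is killed and restarted (OOMKilled)
  # if it goes above 512Mi, instead of pushing the whole node into memory pressure.
  apiVersion: apps/v1
  kind: Deployment
  metadata:
    name: my-app                              # placeholder name
  spec:
    replicas: 3
    selector:
      matchLabels:
        app: my-app
    template:
      metadata:
        labels:
          app: my-app
      spec:
        containers:
          - name: my-app
            image: gcr.io/my-project/my-app:1.0   # placeholder image
            resources:
              limits:
                memory: "512Mi"                   # hard cap for this container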

Consider also limiting the CPU and introducing minimum requirements (requests) for both RAM and CPU, so that the scheduler can place the Pods properly and you avoid hitting NodeUnderMemoryPressure under heavy load.
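
Since it is easy to forget these fields in every manifest, you can also enforce defaults at the namespace level with a LimitRange: containers created without explicit values then get sane requests and limits automatically. A sketch with placeholder values:

  # Hypothetical LimitRange: applies default requests/limits to any container
  # created in this namespace without its own resources section.
  apiVersion: v1
  kind: LimitRange
  metadata:
    name: default-resources                # placeholder name
    namespace: default
  spec:
    limits:
      - type: Container
        defaultRequest:                    # becomes resources.requests
          cpu: "100m"
          memory: "128Mi"
        default:                           # becomes resources.limits
          cpu: "500m"
          memory: "512Mi"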