If you are getting “pod didn’t trigger scale-up: 2 max node group size reached” or a similar message, the cluster autoscaler has hit its node group limit and cannot add more nodes, so the question becomes: what is requesting so much CPU?
INFO
Events:
NotTriggerScaleUp -> cluster-autoscaler -> pod didn’t trigger scale-up: 2 max node group size reached
Warning FailedScheduling 40s (x1692 over 29h) default-scheduler 0/6 nodes are available: 6 Insufficient cpu.
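To see these events for the specific pod that is stuck, you can pull them straight from the pod description; this is just a quick sketch, and the pod and namespace names below are placeholders for your own.

# Show the Events section of a pending pod ("my-app-xyz" and "default" are placeholder names)
kubectl describe pod my-app-xyz -n default | grep -A 10 'Events'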
You can use the command below to find the resource request percentages of all nodes.
kubectl describe nodes | grep -A 3 'Resource'

# OUTPUT
  Resource           Requests       Limits
  --------           --------       ------
  cpu                3915m (99%)    4800m (122%)
  memory             10874Mi (35%)  12810Mi (41%)
--
  Resource           Requests       Limits
  --------           --------       ------
  cpu                3855m (98%)    2 (51%)
  memory             9064Mi (29%)   4864Mi (15%)
--
  Resource           Requests       Limits
  --------           --------       ------
  cpu                3755m (95%)    4300m (109%)
  memory             6764Mi (22%)   8144Mi (26%)
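The percentages above only tell you that the nodes are full; to see which pods are behind those requests, you can drill into a single node and, if metrics-server is installed, compare requested CPU with actual usage. The node name below is a placeholder, and the -A 20 window is just a rough guess at how many pod lines to show.

# List the pods scheduled on one node together with their CPU/memory requests
kubectl describe node <node-name> | grep -A 20 'Non-terminated Pods'

# Compare requested CPU with what pods actually use (requires metrics-server)
kubectl top pods --all-namespaces --sort-by=cpu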
WARNING
If the CPU requests are as high as in the output above, check your Deployments’ requests and limits, and don’t forget to remove the CPU limit if there is one (see the example below).
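As a minimal sketch, assuming a Deployment named my-app, you could inspect and adjust its resources with kubectl; with kubectl set resources, setting a value to 0 removes it. Keep in mind that changing the pod template triggers a rolling restart.

# Inspect the current requests/limits ("my-app" is a placeholder deployment name)
kubectl get deployment my-app -o jsonpath='{.spec.template.spec.containers[*].resources}'

# Lower an oversized CPU request (250m is just an example value)
kubectl set resources deployment my-app --requests=cpu=250m

# Remove the CPU limit entirely (0 removes it); memory settings stay as they are
kubectl set resources deployment my-app --limits=cpu=0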
If you wish to learn more about CPU requests and limits, click here.