/area cluster-autoscaler
#9494 discovered and fixed a bug in granular resource quotas, in which balancing across similar node groups didn't respect the resource quotas of those similar node groups. The fix was to cap the scale ups in the similar node groups to their corresponding quotas after the balancing.
As discussed in #9494 (comment), that leads to suboptimal results. Example scenario: CapacityQuotas are set to 3 nodes per zone, and CA picks up unschedulable pods that need 9 new nodes. In theory, the demand can be satisfied within one scale up loop, but applyLimits will limit the node count to 3. If I'm not mistaken, if the node groups' max sizes were used instead of capacity quotas, each node group would get 3 new nodes.

Similarly, if zone a has 5 nodes remaining in its quota, and zones b and c each have 1 node remaining, the current scale up logic will:
- pick some node group as the best option (honestly I'm not sure which one, probably none of them will score better than the others)
- if zone a is picked, the scale up will be capped to 5 due to quotas
- balancing will spread the scale up across the zones, so we will get something like (2, 2, 1)
- the scale up in zone b will be capped to 1 due to quotas, so the final scale up will be (2, 1, 1)
- if zone b or c is picked in the 1st step instead, we get only 1 node in the scale up
We can see that the optimal outcome would be to claim all the remaining quota and initiate a (5, 1, 1) scale up. This is how the `NodeGroup.MaxSize()` logic works. We should probably drop `applyLimits` and handle quotas the same way we handle node groups' max sizes:
`func (t *sngCapacityThreshold) computeNodeGroupCapacity(nodeGroup cloudprovider.NodeGroup) int`
- autoscaler/cluster-autoscaler/estimator/sng_capacity_threshold.go, line 48 in 91080c8
- autoscaler/cluster-autoscaler/processors/nodegroupset/balancing_processor.go, line 95 in 91080c8