Redistribute unclaimed quota to similar node groups on scale-up #9830

Open
rrangith wants to merge 1 commit into kubernetes:master from DataDog:fix/redistribute-quota-similar-nodegroups-9567

@rrangith rrangith commented Jun 17, 2026

Member

What type of PR is this?

/kind feature

What this PR does / why we need it:

Follow-up to #9494, which made similar-node-group balancing respect resource quotas. That change capped the total scale-up to the best-option group's quota up front and enforced per-group quotas only after balancing, so any unclaimed capacity was dropped instead of redistributed. This left scale-ups suboptimal and order-dependent: with CapacityQuotas of 5/1/1 across three zones and a need for 9 nodes, the result was roughly (2,1,1) or (1,0,0), depending on which group the expander picked, instead of the optimal (5,1,1) achievable in one loop.

This PR makes BalanceSimilarNodeGroups quota-aware so a scale-up claims the full collective quota across similar node groups in a single loop, instead of being capped to the best option's quota up front.

Previously prepareScaleUp called applyLimits, which capped the total scale-up to the single best-option group's quota before balancing. With per-zone CapacityQuotas this produced suboptimal results (e.g. zones with quota 5/1/1 yielded ~(2,1,1) instead of the optimal (5,1,1)), and the outcome depended on which group the expander happened to pick.
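To make the order dependence concrete, here is a minimal sketch of the flow described above, assuming all three zones start empty. The `group` type and `oldPlan` function are hypothetical and heavily simplified; they are not the actual `prepareScaleUp` or `applyLimits` code.

```go
package main

import "fmt"

// group is a hypothetical, simplified stand-in for a node group.
type group struct {
	name     string
	size     int // current number of nodes
	headroom int // nodes still allowed by this group's quota
}

// oldPlan models the pre-PR flow as described in the PR text: cap the total
// to the best-option group's headroom, spread it evenly, then enforce each
// group's own quota and drop whatever no longer fits.
func oldPlan(groups []group, best, needed int) []int {
	total := needed
	if h := groups[best].headroom; total > h {
		total = h
	}
	add := make([]int, len(groups))
	// Even spread: each node goes to the currently smallest group,
	// preferring the best option on ties.
	for n := 0; n < total; n++ {
		pick := best
		for i := range groups {
			if groups[i].size+add[i] < groups[pick].size+add[pick] {
				pick = i
			}
		}
		add[pick]++
	}
	// Quota is enforced after balancing; the excess is dropped.
	for i := range add {
		if add[i] > groups[i].headroom {
			add[i] = groups[i].headroom
		}
	}
	return add
}

func main() {
	zones := []group{{"zone-a", 0, 5}, {"zone-b", 0, 1}, {"zone-c", 0, 1}}
	fmt.Println(oldPlan(zones, 0, 9)) // best option is the quota-5 zone: [2 1 1]
	fmt.Println(oldPlan(zones, 1, 9)) // best option is a quota-1 zone: [0 1 0]
}
```

In this model the second call adds a single node in total, which corresponds to the (1,0,0) case up to group ordering.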

Now balanceScaleUps computes each target group's quota headroom (in nodes) via a read-only Tracker.CheckQuota and passes it to BalanceScaleUpBetweenGroups as a maxAddByGroup map. Each group's effective max is lowered to min(MaxSize, currentSize+headroom), so balancing fills groups up to their quota and redistributes the remainder, all in one loop. The balancing algorithm itself is unchanged. capScaleUpsByQuota is kept as the safety net that commits quota and caps the residual for shared quotas (where the read-only per-group headroom over-counts).
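The clamping step can be sketched as follows. This is an illustration under stated assumptions, not the PR's code: the `group` type and `balance` function are hypothetical, the `headroom` field stands in for a group's `maxAddByGroup` entry, and the balancing loop is a simple "smallest group first" model rather than the real `BalanceScaleUpBetweenGroups`.

```go
package main

import "fmt"

// group is a hypothetical, simplified stand-in for a node group.
type group struct {
	name     string
	size     int // current number of nodes
	maxSize  int // configured MaxSize
	headroom int // stands in for this group's maxAddByGroup entry
}

// effectiveMax lowers the ceiling to min(MaxSize, currentSize+headroom).
func effectiveMax(g group) int {
	if limit := g.size + g.headroom; limit < g.maxSize {
		return limit
	}
	return g.maxSize
}

// balance hands out nodes one at a time to the smallest group that is still
// below its effective max. It returns the per-group additions and the number
// of nodes that could not be placed.
func balance(groups []group, needed int) ([]int, int) {
	add := make([]int, len(groups))
	for needed > 0 {
		pick := -1
		for i, g := range groups {
			cur := g.size + add[i]
			if cur >= effectiveMax(g) {
				continue // group is full under its quota or MaxSize
			}
			if pick == -1 || cur < groups[pick].size+add[pick] {
				pick = i
			}
		}
		if pick == -1 {
			break // every group is at its effective max
		}
		add[pick]++
		needed--
	}
	return add, needed
}

func main() {
	zones := []group{
		{"zone-a", 0, 10, 5},
		{"zone-b", 0, 10, 1},
		{"zone-c", 0, 10, 1},
	}
	add, unplaced := balance(zones, 9)
	fmt.Println(add, unplaced) // [5 1 1] 2
}
```

Because full groups are skipped rather than capped afterwards, the remainder flows to groups that still have headroom, and the result no longer depends on which group was picked first.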

A nil maxAddByGroup map preserves the original MaxSize-only behavior, so configurations without quotas are unaffected.
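A nil map works as the "no quotas" signal because reading from a nil map in Go is safe and reports the key as absent. The sketch below shows that lookup pattern; the `effectiveMax` helper and its signature are hypothetical, and treating a missing entry the same as a nil map is an assumption made here for illustration, not something the PR text states.

```go
package main

import "fmt"

// effectiveMax returns the size ceiling for one group.
// Assumption: a nil map, or a map without an entry for the group, means no
// quota headroom is known, so only maxSize applies.
func effectiveMax(id string, currentSize, maxSize int, maxAddByGroup map[string]int) int {
	headroom, ok := maxAddByGroup[id] // reading a nil map is safe in Go
	if !ok {
		return maxSize
	}
	if limit := currentSize + headroom; limit < maxSize {
		return limit
	}
	return maxSize
}

func main() {
	fmt.Println(effectiveMax("asg-A", 1, 10, nil))                           // 10
	fmt.Println(effectiveMax("asg-A", 1, 10, map[string]int{"asg-A": 1}))    // 2
	fmt.Println(effectiveMax("asg-A", 1, 3, map[string]int{"asg-A": 4}))     // 3
}
```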

Which issue(s) this PR fixes:

Fixes #9567

Special notes for your reviewer:

For testing, I deployed this change to an AWS cluster with 3 similar node groups of 1 node each. One node group had a quota of 5, and the other two node groups had a quota of 2.

I created 9 new pods that each needed their own node and got the following scale-up plan (real AutoScalingGroup names have been redacted):

Final scale-up plan: [{asg-A 1->2 (max: 2)} {asg-B 1->2 (max: 2)} {asg-C 1->5 (max: 5)}]

We can see that all quotas were fully consumed and a total of 6 nodes were added (1 + 1 + 4). Before this PR, we would have gotten at most 4 new nodes (asg-C's remaining quota).
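The arithmetic behind those two numbers can be checked with a short sketch. The `quotaGroup` type and both helper functions are hypothetical names used only to restate the test setup above; they do not appear in the PR.

```go
package main

import "fmt"

// quotaGroup is a hypothetical record of one node group in the test setup.
type quotaGroup struct {
	name  string
	size  int // nodes before the scale-up
	quota int // maximum nodes allowed by the group's quota
}

// remaining returns how many nodes a group can still add under its quota.
func remaining(g quotaGroup) int {
	if g.quota > g.size {
		return g.quota - g.size
	}
	return 0
}

// collective is the number of new nodes when every group's remaining quota
// can be claimed, limited by the number of nodes needed.
func collective(groups []quotaGroup, needed int) int {
	total := 0
	for _, g := range groups {
		total += remaining(g)
	}
	if total > needed {
		return needed
	}
	return total
}

// cappedToBest is the upper bound when the total is capped to a single
// group's remaining quota.
func cappedToBest(best quotaGroup, needed int) int {
	if r := remaining(best); r < needed {
		return r
	}
	return needed
}

func main() {
	groups := []quotaGroup{{"asg-A", 1, 2}, {"asg-B", 1, 2}, {"asg-C", 1, 5}}
	fmt.Println(collective(groups, 9))      // 6
	fmt.Println(cappedToBest(groups[2], 9)) // 4
}
```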

Does this PR introduce a user-facing change?

Cluster Autoscaler now distributes a scale-up optimally across similar node groups when resource quotas are configured

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@k8s-ci-robot added the release-note (Denotes a PR that will be considered when it comes time to generate release notes.) and kind/bug (Categorizes issue or PR as related to a bug.) labels Jun 17, 2026
@k8s-ci-robot requested review from elmiko and x13n June 17, 2026 18:27
@k8s-ci-robot added the area/cluster-autoscaler (Issues or PRs related to the Cluster Autoscaler component), needs-triage (Indicates an issue or PR lacks a `triage/foo` label and requires one.), cncf-cla: yes (Indicates the PR's author has signed the CNCF CLA.), and size/L (Denotes a PR that changes 100-499 lines, ignoring generated files.) labels Jun 17, 2026
@rrangith

Member Author

/assign @norbertcyran

@rrangith force-pushed the fix/redistribute-quota-similar-nodegroups-9567 branch from dd1d427 to cd8ac79 June 17, 2026 20:29
@k8s-ci-robot

Contributor

@rrangith: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| pull-autoscaling-e2e-gci-gce-ca-test | cd8ac79 | link | false | /test pull-autoscaling-e2e-gci-gce-ca-test |

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.


@jackfrancis

Contributor

/triage accepted

@k8s-ci-robot added the triage/accepted (Indicates an issue or PR is ready to be actively worked on.) label and removed the needs-triage (Indicates an issue or PR lacks a `triage/foo` label and requires one.) label Jun 18, 2026

@jackfrancis jackfrancis left a comment

Contributor


/lgtm
/approve

This lgtm

/hold

for @norbertcyran to weigh in

Not sure this is really a bug fix?

@k8s-ci-robot added the do-not-merge/hold (Indicates that a PR should not merge because someone has issued a /hold command.) label Jun 18, 2026
@k8s-ci-robot added the lgtm ("Looks good to me", indicates that a PR is ready to be merged.) label Jun 18, 2026
@k8s-ci-robot

Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jackfrancis, rrangith

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot added the approved (Indicates a PR has been approved by an approver from all required OWNERS files.) label Jun 18, 2026
@rrangith

Member Author

/kind feature

Thanks for the review! I'll change it to a feature, based on the issue's label.

@k8s-ci-robot added the kind/feature (Categorizes issue or PR as related to a new feature.) label Jun 19, 2026
@rrangith

Member Author

/remove-kind bug

@k8s-ci-robot removed the kind/bug (Categorizes issue or PR as related to a bug.) label Jun 19, 2026

Labels

approved: Indicates a PR has been approved by an approver from all required OWNERS files.
area/cluster-autoscaler: Issues or PRs related to the Cluster Autoscaler component
cncf-cla: yes: Indicates the PR's author has signed the CNCF CLA.
do-not-merge/hold: Indicates that a PR should not merge because someone has issued a /hold command.
kind/feature: Categorizes issue or PR as related to a new feature.
lgtm: "Looks good to me", indicates that a PR is ready to be merged.
release-note: Denotes a PR that will be considered when it comes time to generate release notes.
size/L: Denotes a PR that changes 100-499 lines, ignoring generated files.
triage/accepted: Indicates an issue or PR is ready to be actively worked on.

Development

Successfully merging this pull request may close these issues.

[Resource quotas] Redistribute unclaimed capacity to similar node groups

5 participants