Native support for HorizontalPodAutoscaler #469

Open
nustiueudinastea wants to merge 13 commits into zilliztech:main from nustiueudinastea:feature/hpa-support
Conversation

@nustiueudinastea
Contributor

This PR adds native support for HPA resources, managed directly by the operator. I implemented this feature because setting replicas to -1 in combination with a user-managed HPA causes the operator to fail to complete upgrades. The cause is the dual deployments used for querynodes: the operator waits for one of the deployments/replicasets to be scaled down, which never happens while the HPA is managing it.

This PR therefore introduces the ability for the operator to create and manage HPAs itself, and to handle the upgrade process correctly.

  spec:
    components:
      queryNode:
        hpa:
          maxReplicas: 20

The upgrade process is the following:

  1. Delete HPA
  2. Scale up new deployment to old replica count set by HPA
  3. Scale down old deployment
  4. Re-create HPA
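
The four steps above can be sketched as a small in-memory simulation. This is only an illustration of the ordering, not the operator's code: the `rollout` type, its fields, and the step labels are hypothetical names invented for this example, and scaling the old deployment all the way to zero in step 3 is an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

// rollout is a hypothetical in-memory model of a two-deployment upgrade.
// None of these names come from the operator's code base.
type rollout struct {
	hpaExists   bool     // whether the operator-managed HPA currently exists
	oldReplicas int32    // replica count of the old deployment (last set by the HPA)
	newReplicas int32    // replica count of the new deployment
	steps       []string // record of the steps taken, in order
}

// upgrade walks through the four steps listed in the PR description.
func (r *rollout) upgrade() error {
	if !r.hpaExists {
		return errors.New("no HPA to manage")
	}

	// 1. Delete the HPA so nothing else rewrites replica counts mid-rollout.
	r.hpaExists = false
	r.steps = append(r.steps, "delete-hpa")

	// 2. Scale the new deployment up to the count the HPA had chosen.
	r.newReplicas = r.oldReplicas
	r.steps = append(r.steps, "scale-up-new")

	// 3. Scale the old deployment down (assumed here to go to zero).
	r.oldReplicas = 0
	r.steps = append(r.steps, "scale-down-old")

	// 4. Re-create the HPA once the rollout has finished.
	r.hpaExists = true
	r.steps = append(r.steps, "recreate-hpa")
	return nil
}

func main() {
	r := &rollout{hpaExists: true, oldReplicas: 7}
	if err := r.upgrade(); err != nil {
		fmt.Println("upgrade failed:", err)
		return
	}
	fmt.Println(r.steps, r.oldReplicas, r.newReplicas, r.hpaExists)
}
```

Deleting the HPA first is what removes the conflict described above: while it is absent, only the operator changes replica counts, so the scale-down in step 3 can actually complete.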

@sre-ci-robot
Collaborator

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: nustiueudinastea
To complete the pull request process, please assign yellow-shine after the PR has been reviewed.
You can assign the PR to them by writing /assign @yellow-shine in a comment when ready.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@codecov

codecov bot commented Feb 10, 2026

Codecov Report

❌ Patch coverage is 94.31818% with 10 lines in your changes missing coverage. Please review.
✅ Project coverage is 77.20%. Comparing base (52b5430) to head (dd2e439).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| pkg/controllers/hpa.go | 94.91% | 3 Missing and 3 partials ⚠️ |
| pkg/controllers/components.go | 83.33% | 1 Missing and 1 partial ⚠️ |
| pkg/controllers/deploy_ctrl_util.go | 94.28% | 1 Missing and 1 partial ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #469      +/-   ##
==========================================
+ Coverage   76.70%   77.20%   +0.50%     
==========================================
  Files          66       67       +1     
  Lines        6176     6347     +171     
==========================================
+ Hits         4737     4900     +163     
- Misses       1176     1179       +3     
- Partials      263      268       +5     

Copilot AI left a comment

Pull request overview

Adds native HorizontalPodAutoscaler (HPA) support to the Milvus operator so it can manage HPAs directly (create/update/delete) and adjust rollout scaling behavior to avoid upgrades getting stuck when replicas are managed by an HPA.

Changes:

  • Introduces spec.components.<component>.hpa API (CRD + Go types + deepcopy) and a new ReconcileHPAs reconciler.
  • Updates deployment scaling logic to better handle HPA-enabled components during two-deployment rollouts.
  • Adds unit tests and a sample manifest demonstrating native and legacy HPA approaches.

Reviewed changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 4 comments.

| File | Description |
|---|---|
| pkg/controllers/milvus.go | Adds ReconcileHPAs to the Milvus reconciliation pipeline. |
| pkg/controllers/hpa.go | Implements HPA create/update/delete reconciliation and spec conversion helpers. |
| pkg/controllers/deploy_ctrl_util.go | Adjusts HPA rollout scaling plan to preserve capacity during rollouts. |
| pkg/controllers/deployment_updater.go | Treats HPA spec as enabling HPA and bootstraps replicas from minReplicas when needed. |
| pkg/controllers/components.go | Adds component helpers for reading HPA spec + generating HPA names. |
| apis/milvus.io/v1beta1/components_types.go | Adds HPASpec and hpa field to components API. |
| apis/milvus.io/v1beta1/zz_generated.deepcopy.go | Adds deepcopy support for the new HPA fields/types. |
| config/crd/bases/milvus.io_milvuses.yaml | Extends Milvus CRD schema with component hpa fields. |
| config/crd/bases/milvus.io_milvusclusters.yaml | Extends MilvusCluster CRD schema with component hpa fields. |
| config/samples/hpa.yaml | Documents native HPA config and a legacy external-HPA example. |
| pkg/controllers/hpa_test.go | Adds unit tests for HPA reconciliation + rollout scaling planning. |
| pkg/controllers/deploy_ctrl_util_test.go | Updates rollout scaling expectations for HPA behavior. |
| pkg/controllers/milvus_test.go | Updates reconcile group size expectation due to new reconciler. |
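
The "bootstraps replicas from minReplicas when needed" behavior noted for deployment_updater.go could look roughly like the sketch below. It is a guess at the intent rather than the code in this PR: the `HPASpec` shape shown here, the `bootstrapReplicas` helper, and its three-way decision are all assumptions made for illustration. The fallback of 1 mirrors the Kubernetes default for an unset HPA `minReplicas`.

```go
package main

import "fmt"

// HPASpec is a hypothetical stand-in for the component-level hpa field.
// Only maxReplicas appears in the PR description; minReplicas is taken
// from the review summary. The real type may differ.
type HPASpec struct {
	MinReplicas *int32
	MaxReplicas int32
}

// bootstrapReplicas picks the replica count to write to a deployment:
//   - without an HPA spec, the configured count is used as-is;
//   - with an HPA spec and a live count, the live count is kept so the
//     autoscaler's last decision is not overwritten;
//   - with an HPA spec and no live count, start at minReplicas,
//     or at 1 when minReplicas is unset.
func bootstrapReplicas(hpa *HPASpec, live *int32, configured int32) int32 {
	if hpa == nil {
		return configured
	}
	if live != nil {
		return *live
	}
	if hpa.MinReplicas != nil {
		return *hpa.MinReplicas
	}
	return 1
}

func main() {
	minReplicas := int32(3)
	spec := &HPASpec{MinReplicas: &minReplicas, MaxReplicas: 20}
	live := int32(8)

	fmt.Println(bootstrapReplicas(nil, nil, 2))    // no HPA: 2
	fmt.Println(bootstrapReplicas(spec, nil, 2))   // HPA, fresh deployment: 3
	fmt.Println(bootstrapReplicas(spec, &live, 2)) // HPA, existing deployment: 8
}
```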

Comment thread pkg/controllers/hpa.go
Comment thread pkg/controllers/hpa.go
Comment thread pkg/controllers/deploy_ctrl_util.go Outdated
Comment thread pkg/controllers/hpa.go Outdated
@nustiueudinastea
Contributor Author

nustiueudinastea commented Feb 12, 2026

@AlintaLu @LoveEachDay @haorenfsa hey folks, I implemented Copilot's feedback and extended the integration tests to cover the HPA scenario. The PR is ready for review. Thanks!

@nustiueudinastea
Contributor Author

I noticed PR #458, which tackles the same problem in a different way. I think adding native HPA support is more robust because there will be no conflict between the operator and an external HPA, although both solutions can coexist.

Alex Giurgiu added 7 commits February 12, 2026 21:46
Signed-off-by: Alex Giurgiu <agiurgiu@slb.com>
Signed-off-by: Alex Giurgiu <agiurgiu@slb.com>
Signed-off-by: Alex Giurgiu <agiurgiu@slb.com>
Signed-off-by: Alex Giurgiu <agiurgiu@slb.com>
Signed-off-by: Alex Giurgiu <agiurgiu@slb.com>
Signed-off-by: Alex Giurgiu <agiurgiu@slb.com>
Signed-off-by: Alex Giurgiu <agiurgiu@slb.com>
@haorenfsa
Collaborator

@nustiueudinastea Yes, I think so, too. Thank you for providing this patch.

Comment thread pkg/controllers/deploy_ctrl_util.go
Signed-off-by: Alex Giurgiu <agiurgiu@slb.com>
@nustiueudinastea
Contributor Author

Tests are now failing, but the failure seems to be related to kind rather than to this change.

@nustiueudinastea
Contributor Author

@haorenfsa any chance you can take another look at this?

@haorenfsa
Collaborator

@nustiueudinastea yes. I've been on vacation recently. Sorry for the delay.

@haorenfsa
Collaborator

@AlintaLu please help review

@AlintaLu
Collaborator

AlintaLu commented Mar 6, 2026

Great work @nustiueudinastea, thanks for providing native HPA support!

Could you please rebase on main to incorporate the latest changes and re-run CI? The latest GitHub Actions runner image appears to have resolved the Kind compatibility issue.

Alex Giurgiu added 3 commits March 11, 2026 15:07
Signed-off-by: Alex Giurgiu <agiurgiu@slb.com>
Signed-off-by: Alex Giurgiu <agiurgiu@slb.com>
@nustiueudinastea
Contributor Author

@AlintaLu @haorenfsa I think it's ready. Take a look, and thanks for your time!

@niharchinta

+1 from me. I also need this feature. It would be great if this could be merged soon.

@nustiueudinastea
Contributor Author

@AlintaLu @haorenfsa any chance we can move forward with this, or do you no longer plan to merge it?

6 participants