feat: implement batch-based rollout and scaling using RollingUpdateDeployment#472

Open
sangheee wants to merge 4 commits into zilliztech:main from sangheee:feat/batch-rollout-strategy

Conversation

@sangheee
Contributor

Hi, first of all, I would like to thank you for your positive feedback on my initial suggestions and for taking the time to review this PR.

This PR improves the rollout and scaling performance of Milvus components by adjusting replicas in batches. It replaces the current hardcoded one-pod-at-a-time approach (scale up by 1, scale down by 1) with a configurable strategy based on the standard Kubernetes RollingUpdateDeployment settings (maxSurge and maxUnavailable).

Changes

  • Added *appsv1.RollingUpdateDeployment to ComponentSpec.
  • Applied surgeStep and unavailableStep in planScaleForRollout for parallel pod replacement.

Even if podReplacementPolicy is introduced to Deployment and twoDeploymentMode is removed in the future, having this RollingUpdate configuration will be helpful as it can work alongside the policy to support both faster and more reliable updates.

Related Issue

Fixes #470

cc. @haorenfsa

@sre-ci-robot
Collaborator

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: sangheee
To complete the pull request process, please assign yellow-shine after the PR has been reviewed.
You can assign the PR to them by writing /assign @yellow-shine in a comment when ready.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@sangheee sangheee force-pushed the feat/batch-rollout-strategy branch 3 times, most recently from 7eee51c to 0882b89 Compare February 20, 2026 06:32
Signed-off-by: park.sanghee <park.sanghee@navercorp.com>
@sangheee sangheee force-pushed the feat/batch-rollout-strategy branch from 0882b89 to da386b8 Compare February 20, 2026 06:44
@codecov

codecov bot commented Feb 20, 2026

Codecov Report

❌ Patch coverage is 78.57143% with 9 lines in your changes missing coverage. Please review.
✅ Project coverage is 76.64%. Comparing base (52b5430) to head (9fde88b).

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| pkg/controllers/components.go | 68.42% | 3 Missing and 3 partials ⚠️ |
| pkg/controllers/deploy_ctrl_util.go | 90.00% | 1 Missing and 1 partial ⚠️ |
| pkg/controllers/deployment_updater.go | 50.00% | 1 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #472      +/-   ##
==========================================
- Coverage   76.70%   76.64%   -0.06%     
==========================================
  Files          66       66              
  Lines        6176     6196      +20     
==========================================
+ Hits         4737     4749      +12     
- Misses       1176     1180       +4     
- Partials      263      267       +4     

Signed-off-by: park.sanghee <park.sanghee@navercorp.com>
@haorenfsa
Collaborator

Thank you. @AlintaLu, please help review.

Copilot AI left a comment

Pull request overview

This PR adds configurable Kubernetes-style rolling update parameters (maxSurge / maxUnavailable) to Milvus component specs and uses them to adjust replica changes in larger batches during two-deployment rollouts, improving rollout/scaling performance for large clusters.

Changes:

  • Added rollingUpdate (*appsv1.RollingUpdateDeployment) to ComponentSpec and merged it via MergeComponentSpec.
  • Updated deployment strategy resolution to accept *v1beta1.MilvusSpec and apply per-component rollingUpdate overrides.
  • Updated planScaleForRollout to compute batch steps from maxSurge / maxUnavailable, plus corresponding CRD and test updates.

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| pkg/controllers/deployments.go | Minor logging change for deployment diffs. |
| pkg/controllers/deployment_updater.go | Updates strategy plumbing to pass *MilvusSpec into GetDeploymentStrategy. |
| pkg/controllers/deploy_ctrl_util.go | Implements batch-based rollout scaling using rollingUpdate-derived step sizes. |
| pkg/controllers/deploy_ctrl_util_test.go | Sets default rolling update strategy in test fixtures to match new rollout logic dependencies. |
| pkg/controllers/components.go | Allows per-component rollingUpdate overrides when using RollingUpdate strategy. |
| pkg/controllers/components_test.go | Adds tests for rollingUpdate merge + custom strategy behavior. |
| apis/milvus.io/v1beta1/components_types.go | Adds the rollingUpdate API field to ComponentSpec. |
| apis/milvus.io/v1beta1/zz_generated.deepcopy.go | Updates deepcopy generation for the new rollingUpdate field. |
| config/crd/bases/milvus.io_milvuses.yaml | Adds rollingUpdate schema to Milvus CRD. |
| config/crd/bases/milvus.io_milvusclusters.yaml | Adds rollingUpdate schema to MilvusCluster CRD. |
| charts/milvus-operator/templates/crds.yaml | Propagates CRD schema changes into the Helm chart CRDs. |

Comment thread pkg/controllers/deploy_ctrl_util.go
Comment thread pkg/controllers/deploy_ctrl_util.go Outdated
Comment thread apis/milvus.io/v1beta1/components_types.go Outdated
Comment thread pkg/controllers/deploy_ctrl_util.go Outdated
…omment

Signed-off-by: park.sanghee <park.sanghee@navercorp.com>
@sangheee
Contributor Author

Hi @AlintaLu, I've addressed all the suggestions, and the PR is ready for another review. Could you please take a look when you have a moment? Thanks!

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Feature Request] Support configurable batch/percentage rollout strategy (maxUnavailable) for large-scale clusters

4 participants