We are currently operating a large-scale Milvus cluster with over 100 QueryNodes and more than 100 million documents.
While the operator works reliably, we are facing a significant operational bottleneck during rolling updates (configuration changes or image updates).
As identified in the source code (milvus-operator/pkg/controllers/components.go, lines 561 to 569 at 9181cb9):

```go
if useRollingUpdate {
	return appsv1.DeploymentStrategy{
		Type: appsv1.RollingUpdateDeploymentStrategyType,
		RollingUpdate: &appsv1.RollingUpdateDeployment{
			MaxUnavailable: &intstr.IntOrString{Type: intstr.Int, IntVal: 0},
			MaxSurge:       &intstr.IntOrString{Type: intstr.Int, IntVal: 1},
		},
	}
}
```
And milvus-operator/pkg/controllers/deploy_ctrl_util.go, lines 438 to 493 at 9181cb9:

```go
// planScaleForRollout: if not hpa, return nil
func (c *DeployControllerBizUtilImpl) planScaleForRollout(mc v1beta1.Milvus, currentDeployment, lastDeployment *appsv1.Deployment) scaleAction {
	currentDeployReplicas := getDeployReplicas(currentDeployment)
	lastDeployReplicas := getDeployReplicas(lastDeployment)

	currentReplicas := currentDeployReplicas + lastDeployReplicas
	expectedReplicas := int(ReplicasValue(c.component.GetReplicas(mc.Spec)))
	if compareDeployResourceLimitEqual(currentDeployment, lastDeployment) {
		switch {
		case currentReplicas > expectedReplicas:
			if lastDeployReplicas > 0 {
				// continue rollout by scaling in the last deployment
				return scaleAction{deploy: lastDeployment, replicaChange: -1}
			}
			// scale in is not allowed during a rollout
			return noScaleAction
		case currentReplicas == expectedReplicas:
			if lastDeployReplicas == 0 {
				// stable state
				return noScaleAction
			}
			// continue rollout by scaling out the current deployment
			return scaleAction{deploy: currentDeployment, replicaChange: 1}
		default:
			// case currentReplicas < expectedReplicas
			// scale out
			return scaleAction{deploy: currentDeployment, replicaChange: expectedReplicas - currentReplicas}
		}
	} else {
		// Resource is changed.
		// If lastDeployReplicas has not been scaled down to 0, we first scale up
		// currentDeployReplicas to the maximum of expectedReplicas and lastDeployReplicas.
		// This ensures that during the subsequent scale-down, pods will not hit
		// out-of-memory (OOM) issues due to load balancing.
		// We only begin scaling down lastDeployReplicas once currentDeployReplicas
		// is no less than lastDeployReplicas.
		// When lastDeployReplicas reaches 0, we ensure currentDeployReplicas is at
		// its expected value.
		if lastDeployReplicas > 0 {
			if currentDeployReplicas < lastDeployReplicas || currentDeployReplicas < expectedReplicas {
				// scale the current deployment's replicas to max(lastDeployReplicas, expectedReplicas)
				if lastDeployReplicas < expectedReplicas {
					return scaleAction{deploy: currentDeployment, replicaChange: expectedReplicas - currentDeployReplicas}
				}
				return scaleAction{deploy: currentDeployment, replicaChange: lastDeployReplicas - currentDeployReplicas}
			}
			// continue rollout by scaling in the last deployment
			return scaleAction{deploy: lastDeployment, replicaChange: -1}
		}
		if currentDeployReplicas > expectedReplicas {
			// scale the current deployment's replicas down to expected
			return scaleAction{deploy: currentDeployment, replicaChange: -1}
		} else if currentDeployReplicas < expectedReplicas {
			// scale the current deployment's replicas up to expected
			// This branch seems unlikely to occur.
			return scaleAction{deploy: currentDeployment, replicaChange: expectedReplicas - currentDeployReplicas}
		}
		return noScaleAction
	}
}
```
The planScaleForRollout logic adjusts replicas one by one (old deployment -1, current deployment +1).
Furthermore, the reconcile loop is hardcoded to requeue at unhealthySyncInterval/2 (unhealthySyncInterval is 30s, so every 15s) while a rolling update is in progress.
For a cluster with just 35 pods, the theoretical minimum rollout time is therefore 35 * 2 * 15s = 1050s.
In our production environment with 100+ nodes and large data volumes, the recovery time of each QueryNode significantly extends this window, making deployments take several hours.
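As a rough illustration of the arithmetic above (assuming the hardcoded 15s requeue and two reconcile cycles per moved batch; the `batch` parameter is hypothetical, since the operator currently always moves one pod at a time):

```go
package main

import "fmt"

// rolloutSeconds estimates the minimum twoDeployMode rollout duration:
// each batch of pods needs two reconcile cycles (scale in old, scale out new),
// and the loop requeues every unhealthySyncInterval/2 = 15 seconds.
// batch is the number of pods moved per cycle (currently hardcoded to 1).
func rolloutSeconds(pods, batch int) int {
	const requeueSeconds = 15 // unhealthySyncInterval (30s) / 2
	batches := (pods + batch - 1) / batch // ceil(pods / batch)
	return batches * 2 * requeueSeconds
}

func main() {
	fmt.Println(rolloutSeconds(35, 1))   // current behavior: 1050s
	fmt.Println(rolloutSeconds(100, 1))  // 100 nodes: 3000s, before QueryNode recovery time
	fmt.Println(rolloutSeconds(100, 20)) // with a hypothetical 20% batch: 150s
}
```

Even this lower bound ignores QueryNode segment-loading time, which dominates in practice.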
Describe the solution you'd like
I am aware of the discussion in #459 regarding the removal of twoDeployMode.
However, since that transition may take time and might require Kubernetes 1.34+ for certain native features, I would like to propose an interim enhancement for the current twoDeployMode.
I suggest allowing users to configure a rolloutStrategy within the Milvus CRD:

```yaml
components:
  rollingMode: 2
  queryNode:
    rolloutStrategy:
      maxUnavailable: "20%" # should support both percentages (e.g., "20%") and integers (e.g., 5)
```
- The DeploymentStrategy.RollingUpdate.MaxUnavailable would be set from this maxUnavailable value.
- The planScaleForRollout logic could use this maxUnavailable value to calculate a larger replicaChange step, allowing multiple pods to be updated in a single reconcile cycle.
- Making the requeue/sync intervals (such as unhealthySyncInterval) configurable by the user would provide much-needed flexibility for different cluster scales.
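To sketch the second point: the step could be resolved once per reconcile and substituted for the hardcoded -1/+1. The helper below is hypothetical (not existing operator code); in the real implementation, k8s.io/apimachinery's intstr.GetScaledValueFromIntOrPercent provides the same int-or-percent semantics, so this stdlib-only version is just for illustration:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// stepFromMaxUnavailable resolves a user-supplied maxUnavailable ("20%" or "5")
// against the expected replica count and returns how many pods
// planScaleForRollout could move per reconcile cycle.
func stepFromMaxUnavailable(maxUnavailable string, expectedReplicas int) (int, error) {
	var step int
	if strings.HasSuffix(maxUnavailable, "%") {
		pct, err := strconv.Atoi(strings.TrimSuffix(maxUnavailable, "%"))
		if err != nil {
			return 0, fmt.Errorf("invalid percentage %q: %w", maxUnavailable, err)
		}
		step = expectedReplicas * pct / 100 // round down, as Deployment maxUnavailable does
	} else {
		n, err := strconv.Atoi(maxUnavailable)
		if err != nil {
			return 0, fmt.Errorf("invalid integer %q: %w", maxUnavailable, err)
		}
		step = n
	}
	if step < 1 {
		step = 1 // always make progress, matching the current one-by-one behavior
	}
	return step, nil
}

func main() {
	step, _ := stepFromMaxUnavailable("20%", 100)
	fmt.Println(step) // 20
}
```

With a resolved step, branches like `scaleAction{deploy: lastDeployment, replicaChange: -1}` could become `replicaChange: -step`, capped so replicas never cross the expected count or drop below zero.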
I would love to hear your thoughts on this proposal. If the maintainers agree with this direction, I am more than happy to implement this feature and submit a Pull Request.
Thanks! 😊