Add spec.shards (named shards) API for MongoDB sharded clusters#1014
Draft
Introduces `spec.shards: [{shardName, shardId?}]` as an alternative to
`spec.shardCount` for declaring shards with stable, explicit identities.
The two forms are mutually exclusive and collapsed internally via a new
`ResolvedShards()` helper, so existing shardCount-based clusters are
unaffected and no yaml change is required after upgrade.
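A minimal sketch of how the two forms could collapse into one list. The type and field names below (`ShardSpec`, `ShardedClusterSpec`, the `mdbName` parameter) are assumptions for illustration, not the operator's actual definitions, and in the PR the mutual-exclusion check lives in webhook validation rather than in the helper itself.

```go
package main

import (
	"errors"
	"fmt"
)

// ShardSpec mirrors the {shardName, shardId?} entries of spec.shards.
// Field names are illustrative assumptions.
type ShardSpec struct {
	ShardName string
	ShardID   string // optional; empty means "default to ShardName"
}

// ShardedClusterSpec is a hypothetical, trimmed-down stand-in for the CRD spec.
type ShardedClusterSpec struct {
	ShardCount int
	Shards     []ShardSpec
}

// SynthesizedShardName returns the implicit legacy name "<mdb-name>-<i>".
func SynthesizedShardName(mdbName string, i int) string {
	return fmt.Sprintf("%s-%d", mdbName, i)
}

// ResolvedShards collapses spec.shardCount and spec.shards into one list
// of named shards, so downstream code only deals with a single form.
func ResolvedShards(mdbName string, spec ShardedClusterSpec) ([]ShardSpec, error) {
	if spec.ShardCount > 0 && len(spec.Shards) > 0 {
		return nil, errors.New("spec.shardCount and spec.shards are mutually exclusive")
	}
	if len(spec.Shards) > 0 {
		out := make([]ShardSpec, len(spec.Shards))
		for i, s := range spec.Shards {
			if s.ShardID == "" {
				s.ShardID = s.ShardName // shardId defaults from shardName
			}
			out[i] = s
		}
		return out, nil
	}
	// Legacy path: synthesise the same names the operator used implicitly.
	out := make([]ShardSpec, spec.ShardCount)
	for i := range out {
		name := SynthesizedShardName(mdbName, i)
		out[i] = ShardSpec{ShardName: name, ShardID: name}
	}
	return out, nil
}

func main() {
	legacy, _ := ResolvedShards("mdbs", ShardedClusterSpec{ShardCount: 2})
	fmt.Println(legacy)
}
```

Because the legacy path yields the same names as before, a shardCount-based cluster resolves to the same shard list it always had.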
Webhook validation guards migrations: going from spec.shardCount to
spec.shards must preserve identity (each shardName at position i equals
the previously implicit "<mdb-name>-<i>") or the update is rejected.
Subsequent shards -> shards updates enforce shardId immutability per
shardName. The deprecated spec.shardSpecificPodSpec is forbidden in
named mode.
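The identity-preservation rule for a shardCount -> shards migration can be sketched as below. `validateMigration` is a hypothetical helper, not the webhook's real function. Rejecting a migration that drops existing shards, and allowing extra shards after the existing ones, are both assumptions of this sketch.

```go
package main

import "fmt"

// validateMigration checks a spec.shardCount -> spec.shards update:
// the name at each position i must equal the implicit "<mdb-name>-<i>".
func validateMigration(mdbName string, oldShardCount int, newShardNames []string) error {
	for i := 0; i < oldShardCount; i++ {
		want := fmt.Sprintf("%s-%d", mdbName, i)
		if i >= len(newShardNames) {
			// Assumption: dropping shards during the migration is rejected.
			return fmt.Errorf("spec.shards is missing existing shard %q at position %d", want, i)
		}
		if newShardNames[i] != want {
			return fmt.Errorf("spec.shards[%d].shardName is %q, expected %q", i, newShardNames[i], want)
		}
	}
	return nil
}

func main() {
	// Typo: "mdbs-l" instead of "mdbs-1".
	fmt.Println(validateMigration("mdbs", 2, []string{"mdbs-0", "mdbs-l"}))
	// Reorder: same names, swapped positions.
	fmt.Println(validateMigration("mdbs", 2, []string{"mdbs-1", "mdbs-0"}))
	// Identity-preserving migration.
	fmt.Println(validateMigration("mdbs", 2, []string{"mdbs-0", "mdbs-1"}))
}
```

The typo and reorder cases both fail on the first mismatching position, which matches the two rejected migrations the e2e test exercises.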
Testing:
- validation unit tests for mutex, DNS-1123, uniqueness, immutability,
migration-typo rejection, and reorder rejection
- full-reconcile tests against the fake client + mocked OM proving
that flipping from shardCount to spec.shards with identity-
preserving names does not change any StatefulSet spec or the OM
sharded-cluster configuration (the core safety invariant)
- e2e test `e2e_sharded_cluster_named_shards` covering: create with
shardCount, scale up, rejected migrations (typo + reorder), fixed
identity-preserving migration (asserts STS generation and AC version
unchanged), appending a custom-named shard, and removing an
index-based shard
MCK 1.8.1 Release Notes |
…rsistent shard state, OM shardId plumbing
- Rewrite removeUnusedStatefulsets to diff deployed shards vs desired by name. Tail-based removal deleted the wrong STS when removing a middle named shard; it now iterates names that disappeared and deletes each by shard-name + member-cluster. Trigger widened from count comparison to hasShardsToRemove so same-count swaps also run the cleanup.
- Persist ShardStateEntry list in ShardedClusterDeploymentState as the source of truth for deployed shard names across reconciles. Falls back to LastAchievedSpec.Shards and then to synthesised names from Status.ShardCount so legacy state written by older operator versions continues to produce byte-identical results.
- Split OM newShard into (id, rsName) and thread ShardIds through DeploymentShardedClusterMergeOptions so resolved ShardId lands in the automation config _id while the replica-set name remains the STS name. Legacy spec.shardCount path passes nil IDs, so behaviour is unchanged.
- Export SynthesizedShardName from api/v1/mdb for reconciler use.
- New unit test TestNamedShards_RemoveMiddleShardDeletesCorrectSts proves the fix; TestMigrateToNewDeploymentState updated for the new `shards` key in persisted state JSON.
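The fallback order for deployed shard names (persisted state, then LastAchievedSpec.Shards, then names synthesised from Status.ShardCount) might look roughly like this. The struct layout and the `deployedShardNames` helper are assumptions made for the sketch; only the three-step order comes from the commit message.

```go
package main

import "fmt"

// ShardStateEntry is a stand-in for the persisted per-shard record.
type ShardStateEntry struct {
	ShardName string
}

// deploymentState is a hypothetical flattening of the three sources.
type deploymentState struct {
	Shards            []ShardStateEntry // persisted ShardStateEntry list
	LastAchievedNames []string          // stand-in for LastAchievedSpec.Shards
	StatusShardCount  int               // stand-in for Status.ShardCount
}

// deployedShardNames returns the names of the currently deployed shards,
// preferring the most specific source that is populated.
func deployedShardNames(mdbName string, st deploymentState) []string {
	if len(st.Shards) > 0 {
		names := make([]string, 0, len(st.Shards))
		for _, e := range st.Shards {
			names = append(names, e.ShardName)
		}
		return names
	}
	if len(st.LastAchievedNames) > 0 {
		return append([]string(nil), st.LastAchievedNames...)
	}
	// Legacy state written by older operator versions: synthesise names.
	names := make([]string, 0, st.StatusShardCount)
	for i := 0; i < st.StatusShardCount; i++ {
		names = append(names, fmt.Sprintf("%s-%d", mdbName, i))
	}
	return names
}

func main() {
	fmt.Println(deployedShardNames("mdbs", deploymentState{StatusShardCount: 3}))
}
```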
…producer
The e2e test TestRemoveIndexBasedShard.test_remove_middle_shard failed on
ex=0 because TestCreateWithShardCount.shard_collection(shards_count=2) had
previously assigned zone-0/zone-1 with pinned chunk ranges to the first
two shards. MongoDB then blocks removeShard sh-named-shards-1 with
ZoneStillInUse ("only shard for zone zone-1 which has a chunk range"),
so the mongos agent can never acknowledge the new automation-config
version and the operator sits in Pending until the 1400s timeout.
- e2e: call `mongod_tester.prepare_for_shard_removal(...)` before
updating spec.shards to drop sh-named-shards-1 — this flattens zone
membership across the remaining shards so chunks in zone-1 can
migrate off.
- unit: add TestNamedShards_RemoveMiddleShardDoesNotCreateSpuriousSts,
the focused reproducer for the exact e2e scenario. It starts from
spec.shards=[A,B,C,extra-shard-alpha] (with a custom-named shard so
the synthesised tail name would no longer coincide with any real
deployed shard), drops the middle, and asserts:
* no spurious "mdbs-3" STS is ever created
* the correct STS (mdbs-1) is deleted
* OM shards[] array is exactly the desired set
* no mdbs-1 or mdbs-3 processes remain after finalize
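The reproducer scenario can be illustrated with a small name-based diff. This sketch assumes A, B, C correspond to mdbs-0, mdbs-1, mdbs-2; `shardsToRemove` and `tailBasedRemoval` are hypothetical helpers, and the tail-based variant is only a guess at what a count-based approach would compute.

```go
package main

import "fmt"

// shardsToRemove returns deployed shard names that are absent from desired,
// preserving the deployed order.
func shardsToRemove(deployed, desired []string) []string {
	keep := make(map[string]struct{}, len(desired))
	for _, n := range desired {
		keep[n] = struct{}{}
	}
	var removed []string
	for _, n := range deployed {
		if _, ok := keep[n]; !ok {
			removed = append(removed, n)
		}
	}
	return removed
}

// tailBasedRemoval mimics a count-based approach: it assumes the shards
// to delete are the synthesised names at the tail indices.
func tailBasedRemoval(mdbName string, deployedCount, desiredCount int) []string {
	var removed []string
	for i := desiredCount; i < deployedCount; i++ {
		removed = append(removed, fmt.Sprintf("%s-%d", mdbName, i))
	}
	return removed
}

func main() {
	deployed := []string{"mdbs-0", "mdbs-1", "mdbs-2", "extra-shard-alpha"}
	desired := []string{"mdbs-0", "mdbs-2", "extra-shard-alpha"}

	fmt.Println(shardsToRemove(deployed, desired)) // [mdbs-1]
	fmt.Println(tailBasedRemoval("mdbs", 4, 3))    // [mdbs-3]
}
```

The count-based variant targets mdbs-3, a name that was never deployed, while the name-based diff targets mdbs-1.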
The existing shardIdentityImmutable validator caught shardId rewrites (same shardName, different shardId) but not renames (same shardId, different shardName). Case 2 now builds a map by shardId as well and rejects a new shard whose shardId matches an old entry with a different shardName, with an explicit "in-place renames are not supported" message. Tests cover both the explicit-shardId rename case and the implicit case (old side defaults shardId from shardName, new side pins the old shardName as shardId while the shardName itself changes).
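A rough sketch of the two-map check described above. The `shard` type, `effectiveID`, and `validateShardIdentity` are illustrative names, not the validator's real signature; the error strings are likewise invented apart from the "in-place renames are not supported" wording.

```go
package main

import "fmt"

// shard is a hypothetical stand-in for one spec.shards entry.
type shard struct {
	Name string
	ID   string // optional; defaults to Name
}

// effectiveID applies the "shardId defaults from shardName" rule.
func effectiveID(s shard) string {
	if s.ID == "" {
		return s.Name
	}
	return s.ID
}

// validateShardIdentity rejects shardId rewrites (same name, new id)
// and in-place renames (same id, new name).
func validateShardIdentity(oldShards, newShards []shard) error {
	idByName := make(map[string]string, len(oldShards))
	nameByID := make(map[string]string, len(oldShards))
	for _, o := range oldShards {
		idByName[o.Name] = effectiveID(o)
		nameByID[effectiveID(o)] = o.Name
	}
	for _, n := range newShards {
		id := effectiveID(n)
		if oldID, ok := idByName[n.Name]; ok && oldID != id {
			return fmt.Errorf("shardId of shard %q is immutable (was %q, got %q)", n.Name, oldID, id)
		}
		if oldName, ok := nameByID[id]; ok && oldName != n.Name {
			return fmt.Errorf("shardId %q belongs to shard %q, not %q: in-place renames are not supported", id, oldName, n.Name)
		}
	}
	return nil
}

func main() {
	// Implicit rename: old shard "a" had id "a" by default; the new side
	// pins shardId "a" while changing the shardName to "b".
	fmt.Println(validateShardIdentity([]shard{{Name: "a"}}, []shard{{Name: "b", ID: "a"}}))
}
```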
Summary
Introduces `spec.shards: [{shardName, shardId?}]` as an alternative to `spec.shardCount` for declaring shards with stable, explicit identities. The two forms are mutually exclusive and collapsed internally via a new `ResolvedShards()` helper, so existing `shardCount`-based clusters are unaffected and no yaml change is required after upgrade.

Webhook validation guards migrations: going from `spec.shardCount` to `spec.shards` must preserve identity (each `shardName` at position `i` equals the previously implicit `<mdb-name>-<i>`) or the update is rejected. Subsequent `shards -> shards` updates enforce `shardId` immutability per `shardName`. The deprecated `spec.shardSpecificPodSpec` is forbidden in named mode.

Opens the door to two follow-up features not implemented here: VM-to-k8s migration (via `shardId != shardName`) and shard-removal-by-name (drain state machine).

Proof of Work
- `api/v1/mdb/named_shards_validation_test.go`: validation cases covering mutex, DNS-1123, uniqueness, immutability, migration-typo rejection, reorder rejection
- `controllers/operator/mongodbshardedcluster_controller_named_shards_test.go`: full-reconcile tests against the fake client + mocked OM proving the core invariant: flipping `shardCount` -> `spec.shards` with identity-preserving names produces byte-identical StatefulSet specs and OM sharded-cluster configuration
- `docker/mongodb-kubernetes-tests/tests/shardedcluster/sharded_cluster_named_shards.py`: e2e task `e2e_sharded_cluster_named_shards` covering: create with shardCount, scale up, rejected migrations (typo + reorder), fixed identity-preserving migration (asserts STS `.metadata.generation` and AC `version` unchanged), appending a custom-named shard, removing an index-based shard

Checklist