Backport: MySQL version-aware reparent for PRS & ERS #866
Draft
ejortegau wants to merge 11 commits into
Signed-off-by: Eduardo Ortega <5791035+ejortegau@users.noreply.github.com>
Patch version differences within the same MySQL release do not affect replication compatibility, so the version-aware candidate election now ignores them. Only major.minor boundaries trigger the preference for a lower-version primary. Co-Authored-By: Claude <svc-devxp-claude@slack-corp.com> Signed-off-by: Eduardo Ortega <5791035+ejortegau@users.noreply.github.com>
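A minimal sketch of a release comparison that ignores the patch component, as described in the commit above. The `release` type and the `parseRelease` and `lowerThan` names are hypothetical and are not taken from the Vitess code base; the sketch only assumes version strings shaped like `8.0.36` or `8.4.0-log`.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// release is a hypothetical type that keeps only the major.minor part of a
// MySQL version string.
type release struct {
	Major, Minor int
}

// parseRelease extracts major.minor from strings such as "8.0.36" or
// "8.4.0-log". The patch component and any suffix are deliberately ignored.
func parseRelease(version string) (release, error) {
	parts := strings.SplitN(version, ".", 3)
	if len(parts) < 2 {
		return release{}, fmt.Errorf("cannot parse MySQL version %q", version)
	}
	major, err := strconv.Atoi(parts[0])
	if err != nil {
		return release{}, fmt.Errorf("bad major component in %q: %w", version, err)
	}
	minor, err := strconv.Atoi(parts[1])
	if err != nil {
		return release{}, fmt.Errorf("bad minor component in %q: %w", version, err)
	}
	return release{Major: major, Minor: minor}, nil
}

// lowerThan reports whether r is an older release than other. Two versions
// that differ only in patch level are never "lower" than each other.
func (r release) lowerThan(other release) bool {
	if r.Major != other.Major {
		return r.Major < other.Major
	}
	return r.Minor < other.Minor
}

func main() {
	a, _ := parseRelease("8.0.36")
	b, _ := parseRelease("8.0.40")
	c, _ := parseRelease("8.4.0")
	fmt.Println(a.lowerThan(b), b.lowerThan(a)) // same release, so neither is lower
	fmt.Println(a.lowerThan(c))                 // 8.0 is a lower release than 8.4
}
```

With this shape, 8.0.36 and 8.0.40 tie on version, so the election falls through to the next criterion instead of preferring the older patch level.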
PRS always catches the elected tablet up to the old primary's exact demotion position, so replication position head-start is irrelevant for data safety. This change introduces SortMode (SortForPRS/SortForERS) to give PRS a distinct sort order: promotion rules > MySQL version > position > buffer pool > alias. ERS retains position-first ordering to minimize data loss. Co-Authored-By: Claude <svc-devxp-claude@slack-corp.com> Signed-off-by: Eduardo Ortega <5791035+ejortegau@users.noreply.github.com>
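The two sort orders can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual `sortTabletsForReparent` code: the `candidate` struct, its integer fields, and the helper names are hypothetical. The PRS key order follows the commit message, but the commit only says that ERS stays position-first, so the order of the remaining ERS keys is assumed here, as is the choice that a larger buffer pool value is preferred.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

type sortMode int

const (
	sortForPRS sortMode = iota
	sortForERS
)

// candidate is a hypothetical, flattened view of a tablet.
type candidate struct {
	Alias      string
	RuleRank   int   // promotion rule rank; lower is preferred
	Release    int   // major*100 + minor; lower is preferred
	Position   int64 // stand-in for a replication position; higher is more advanced
	BufferPool int64 // assumed: higher is preferred
}

func cmpInt64(a, b int64) int {
	switch {
	case a < b:
		return -1
	case a > b:
		return 1
	}
	return 0
}

// less reports whether a should be tried before b under the given mode.
func less(mode sortMode, a, b candidate) bool {
	rule := cmpInt64(int64(a.RuleRank), int64(b.RuleRank))
	release := cmpInt64(int64(a.Release), int64(b.Release))
	position := cmpInt64(b.Position, a.Position)       // most advanced first
	bufferPool := cmpInt64(b.BufferPool, a.BufferPool) // largest first (assumed)
	alias := strings.Compare(a.Alias, b.Alias)

	// PRS: promotion rules > MySQL version > position > buffer pool > alias.
	order := []int{rule, release, position, bufferPool, alias}
	if mode == sortForERS {
		// ERS: position first; the order of the remaining keys is assumed.
		order = []int{position, rule, release, bufferPool, alias}
	}
	for _, c := range order {
		if c != 0 {
			return c < 0
		}
	}
	return false
}

func sortCandidates(mode sortMode, cs []candidate) {
	sort.SliceStable(cs, func(i, j int) bool { return less(mode, cs[i], cs[j]) })
}

func main() {
	cs := []candidate{
		{Alias: "zone1-100", Release: 804, Position: 10},
		{Alias: "zone1-101", Release: 800, Position: 5},
	}
	sortCandidates(sortForPRS, cs)
	fmt.Println("PRS picks:", cs[0].Alias) // the lower release wins
	sortCandidates(sortForERS, cs)
	fmt.Println("ERS picks:", cs[0].Alias) // the more advanced position wins
}
```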
- Remove unused `after.ServerVersion` assignment in StopReplicationAndGetStatus
- Log warning when MySQL version string fails to parse
- Clarify findCandidate post-loop comment explaining two-phase logic
- Add v25 changelog entry with deployment note and cross-cell limitation
Co-Authored-By: Claude <svc-devxp-claude@slack-corp.com> Signed-off-by: Eduardo Ortega <5791035+ejortegau@users.noreply.github.com>
- Add `server_version` field to `PrimaryStatus` proto so that DemotePrimary and PrimaryStatus RPCs report the MySQL version
- Populate version in PrimaryStatus, DemotePrimary, and StopReplicationAndGetStatus RPCs
- Move GetVersionString call after replication stop in StopReplicationAndGetStatus to avoid delaying the critical path
- Read PrimaryStatus.ServerVersion in ERS ERNotReplica path so demoted primary-status candidates no longer get unknownVersion
- Extract getMySQLVersion helper to deduplicate the fetch-and-warn pattern across 4 call sites
- Add test coverage for version propagation through both paths
Co-Authored-By: Claude <svc-devxp-claude@slack-corp.com> Signed-off-by: Eduardo Ortega <5791035+ejortegau@users.noreply.github.com>
ReleaseAtLeast already covers the same-release case (Minor >=), so the redundant IsSameRelease check can be removed. Co-Authored-By: Claude <svc-devxp-claude@slack-corp.com> Signed-off-by: Eduardo Ortega <5791035+ejortegau@users.noreply.github.com>
StopReplicationAndGetStatus had several early-return paths (IO thread already stopped, replication not healthy, stop failures, after-status failure) that returned Before without populating ServerVersion. ERS builds its version map from Before.ServerVersion, so tablets hitting these paths became unknownVersion and could lose version-aware election to newer tablets. Fix: call getMySQLVersion(ctx) before every return that includes Before. Add a table-driven test covering all return paths (success and error) for both IOTHREADONLY and IOANDSQLTHREAD modes. Ref: vitessio#20211 (review) Co-Authored-By: Claude <svc-devxp-claude@slack-corp.com> Signed-off-by: Eduardo Ortega <5791035+ejortegau@users.noreply.github.com>
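The shape of that fix, populating the version on every path that returns `Before`, might look like the reduced sketch below. It is not the real RPC: `status`, `response`, `stopAndGetStatus`, `errStopFailed`, and the injected `getVersion` and `stop` functions are all hypothetical stand-ins, and only three of the return paths are modelled.

```go
package main

import (
	"errors"
	"fmt"
)

// status and response are simplified stand-ins for the replication status
// messages; only the fields needed for the illustration are kept.
type status struct {
	IOThreadRunning bool
	ServerVersion   string
}

type response struct {
	Before *status
	After  *status
}

// errStopFailed is a hypothetical error used to exercise the failure path.
var errStopFailed = errors.New("stop failed")

// stopAndGetStatus is a hypothetical, heavily reduced version of the RPC.
// getVersion and stop are injected so each path can be exercised in isolation.
func stopAndGetStatus(before *status, getVersion func() string, stop func() error) (*response, error) {
	if !before.IOThreadRunning {
		// Early return: nothing to stop, but Before must still carry the version.
		before.ServerVersion = getVersion()
		return &response{Before: before, After: before}, nil
	}
	if err := stop(); err != nil {
		// Error return: Before is still handed back, so populate it here too.
		before.ServerVersion = getVersion()
		return &response{Before: before}, err
	}
	// Success path: the version is fetched only after replication has stopped.
	before.ServerVersion = getVersion()
	return &response{Before: before, After: &status{}}, nil
}

func main() {
	version := func() string { return "8.0.36" }
	cases := []struct {
		name    string
		running bool
		stopErr error
	}{
		{"io thread already stopped", false, nil},
		{"stop fails", true, errStopFailed},
		{"success", true, nil},
	}
	for _, tc := range cases {
		stop := func() error { return tc.stopErr }
		resp, err := stopAndGetStatus(&status{IOThreadRunning: tc.running}, version, stop)
		fmt.Printf("%s: version=%q err=%v\n", tc.name, resp.Before.ServerVersion, err)
	}
}
```

The table in `main` mirrors the idea of the table-driven test mentioned above: every case, including the error case, asserts on `Before.ServerVersion`.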
The cherry-picked test case references a primaryAlias struct field that doesn't exist on the slack-22.0 branch. Co-Authored-By: Claude <svc-devxp-claude@slack-corp.com>
Trace version detection and candidate selection so we can verify the lowest-version preference is working as intended in production. Co-Authored-By: Claude <svc-devxp-claude@slack-corp.com>
The successful stop path set Before.ServerVersion but left After.ServerVersion empty, while the no-op paths return After: before and thus include it. Set after.ServerVersion = before.ServerVersion so the common success path is consistent, and extend the test to assert After.ServerVersion when present. Co-Authored-By: Claude <svc-devxp-claude@slack-corp.com> Signed-off-by: Eduardo Ortega <5791035+ejortegau@users.noreply.github.com>
This reverts commit 4bfe9c5.
What's this?
Backport of the MySQL version-aware primary selection feature for PRS and ERS from upstream PR vitessio#20211.
When selecting a new primary during reparenting, the candidate election now prefers tablets running a lower MySQL release (major.minor). This maintains replication compatibility, because replicas must run the same or a higher version than the primary.
How it works
- Adds a `server_version` field to the replication status proto, populated during `StopReplicationAndGetStatus`
- Updates `sortTabletsForReparent` to consider MySQL version after promotion rules
- In `identifyPrimaryCandidate`, among same-tier candidates, prefers the lowest MySQL release
Differences from upstream
- Uses `log.Warningf` instead of structured `log.Warn` (slack-22.0 doesn't have the new logging)
Cherry-picked and adapted by Claude Code from upstream commits.
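The same-tier preference described under "How it works" can be sketched as below. This is a simplified illustration, not the `identifyPrimaryCandidate` implementation: the `tablet` struct, the `Tier` field, and the `major*100 + minor` encoding are hypothetical. Ranking an unknown version above every real release is also an assumption, inferred from the commit note that tablets with `unknownVersion` could lose the election.

```go
package main

import (
	"fmt"
	"math"
)

// unknownRelease marks a tablet whose MySQL version could not be determined.
// Ranking it above every real release is an assumption made for this sketch.
const unknownRelease = math.MaxInt32

// tablet is a hypothetical, flattened view of a reparent candidate.
type tablet struct {
	Alias   string
	Tier    int // hypothetical promotion tier; lower is preferred
	Release int // major*100 + minor, or unknownRelease
}

// pickCandidate returns the preferred tablet: best tier first, then the lowest
// release within that tier. ok is false when there are no candidates.
func pickCandidate(tablets []tablet) (best tablet, ok bool) {
	for _, t := range tablets {
		better := t.Tier < best.Tier || (t.Tier == best.Tier && t.Release < best.Release)
		if !ok || better {
			best, ok = t, true
		}
	}
	return best, ok
}

func main() {
	tablets := []tablet{
		{Alias: "zone1-100", Tier: 0, Release: 804},
		{Alias: "zone1-101", Tier: 0, Release: 800},
		{Alias: "zone1-102", Tier: 0, Release: unknownRelease},
		{Alias: "zone1-103", Tier: 1, Release: 507},
	}
	if best, ok := pickCandidate(tablets); ok {
		fmt.Println("elected:", best.Alias)
	}
}
```

The tier is compared before the release, so a lower-version tablet in a worse tier (zone1-103 here) does not displace a same-tier candidate.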