pkg/util/schemacmp: allow utf8/latin1 < utf8mb4 charset/collation join#67778
pkg/util/schemacmp: allow utf8/latin1 < utf8mb4 charset/collation join#67778lance6716 wants to merge 10 commits intopingcap:masterfrom
Conversation
|
@lance6716 I've received your pull request and will start the review. I'll conduct a thorough review covering code quality, potential issues, and implementation details. ⏳ This process typically takes 10-30 minutes depending on the complexity of the changes. ℹ️ Learn more details on Pantheon AI. |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
Hi @lance6716. Thanks for your PR. PRs from untrusted users cannot be marked as trusted with I understand the commands that are listed here. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds charset and collation lattice types and tests; removes explicit charset singletons from schema lattices in favor of deriving charset from stored collations; introduces incompatibility error constructors; updates encoding/decoding, tests, and Bazel build files. Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 golangci-lint (2.11.4)Command failed Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
/check-issue-triage-complete |
|
(legit failure in |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@pkg/util/schemacmp/charset_collation.go`:
- Around line 72-75: The comparison treats unknown charset values in
charsetFamilyOther as case-sensitive (a.value != b.value) causing GBK vs gbk to
be considered different; update the equality check inside the branch that
handles charsetFamilyOther to use a case-insensitive comparison (e.g., normalize
both values or use strings.EqualFold) instead of direct !=, and add the
suggested regression test in pkg/util/schemacmp/charset_collation_test.go
asserting Charset("GBK").Compare(Charset("gbk")) returns no error and cmp == 0
to prevent future regressions.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 40f5d86c-81b2-4038-afc6-f0e66be34cf0
📒 Files selected for processing (9)
pkg/importsdk/BUILD.bazelpkg/util/schemacmp/BUILD.bazelpkg/util/schemacmp/charset_collation.gopkg/util/schemacmp/charset_collation_test.gopkg/util/schemacmp/lattice.gopkg/util/schemacmp/table.gopkg/util/schemacmp/table_test.gopkg/util/schemacmp/type.gopkg/util/schemacmp/type_test.go
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## master #67778 +/- ##
================================================
- Coverage 77.6075% 77.3623% -0.2453%
================================================
Files 1981 1967 -14
Lines 548811 550879 +2068
================================================
+ Hits 425919 426173 +254
- Misses 122082 124703 +2621
+ Partials 810 3 -807
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
Signed-off-by: lance6716 <lance6716@gmail.com>
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@pkg/util/schemacmp/charset_collation.go`:
- Around line 63-76: The Join method for charsetLattice (and the analogous
collation lattice at 131-144) must stop treating a Compare() error as a hard
failure and instead, when Compare reports incomparable, compute the least upper
bound by scanning known lattice members: implement Join to first try cmp :=
a.Compare(b) and return a or b when comparable, but if Compare returns an
incomparable error, iterate over a canonical list of all charset lattice members
(use or add a helper like allCharsets()) and pick the minimal candidate c such
that c.Compare(a) >= 0 and c.Compare(b) >= 0 (use Compare to test), returning
the smallest matching candidate; if none found return the original error. Apply
the same pattern for collationLattice, and add regression tests asserting
Charset("utf8").Join(Charset("latin1")) succeeds and
Collation("utf8_bin").Join(Collation("latin1_bin")) succeeds.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: c7909230-f861-42c9-913e-bdd80be2c9c2
📒 Files selected for processing (5)
pkg/util/schemacmp/charset_collation.gopkg/util/schemacmp/charset_collation_test.gopkg/util/schemacmp/table.gopkg/util/schemacmp/table_test.gopkg/util/schemacmp/type.go
✅ Files skipped from review due to trivial changes (1)
- pkg/util/schemacmp/charset_collation_test.go
🚧 Files skipped from review as they are similar to previous changes (2)
- pkg/util/schemacmp/table.go
- pkg/util/schemacmp/table_test.go
| func (a charsetLattice) Join(other Lattice) (Lattice, error) { | ||
| b, ok := other.(charsetLattice) | ||
| if !ok { | ||
| return nil, typeMismatchError(a, other) | ||
| } | ||
|
|
||
| cmp, err := a.Compare(b) | ||
| if err != nil { | ||
| return nil, err | ||
| } | ||
| if cmp >= 0 { | ||
| return a, nil | ||
| } | ||
| return b, nil |
There was a problem hiding this comment.
Handle incomparable-but-joinable pairs in Join().
Both Join() implementations currently just call Compare(), so utf8 and latin1 still fail to join even though the new ordering makes both of them lower than utf8mb4. That breaks the lattice contract and makes multi-way joins order-dependent, e.g. (utf8 JOIN latin1) JOIN utf8mb4 fails at the first step even though the least upper bound exists.
🔧 Proposed fix
func (a charsetLattice) Join(other Lattice) (Lattice, error) {
b, ok := other.(charsetLattice)
if !ok {
return nil, typeMismatchError(a, other)
}
- cmp, err := a.Compare(b)
- if err != nil {
- return nil, err
- }
- if cmp >= 0 {
+ switch {
+ case a.value == b.value:
return a, nil
+ case a.value == tidbcharset.CharsetUTF8MB4 &&
+ (b.value == tidbcharset.CharsetUTF8 || b.value == tidbcharset.CharsetLatin1):
+ return a, nil
+ case b.value == tidbcharset.CharsetUTF8MB4 &&
+ (a.value == tidbcharset.CharsetUTF8 || a.value == tidbcharset.CharsetLatin1):
+ return b, nil
+ case (a.value == tidbcharset.CharsetUTF8 && b.value == tidbcharset.CharsetLatin1) ||
+ (a.value == tidbcharset.CharsetLatin1 && b.value == tidbcharset.CharsetUTF8):
+ return Charset(tidbcharset.CharsetUTF8MB4), nil
+ default:
+ return nil, incompatibleCharsetError(a.value, b.value)
}
- return b, nil
}
func (a collationLattice) Join(other Lattice) (Lattice, error) {
b, ok := other.(collationLattice)
if !ok {
return nil, typeMismatchError(a, other)
}
- cmp, err := a.Compare(b)
- if err != nil {
- return nil, err
+ if a.suffix != b.suffix {
+ return nil, incompatibleCollationError(a.unwrapString(), b.unwrapString())
}
- if cmp >= 0 {
- return a, nil
+
+ joinedCharset, err := a.charset.Join(b.charset)
+ if err != nil {
+ return nil, err
}
- return b, nil
+ return collationLattice{
+ charset: joinedCharset.(charsetLattice),
+ suffix: a.suffix,
+ }, nil
}Please also add regression coverage for Charset("utf8").Join(Charset("latin1")) and Collation("utf8_bin").Join(Collation("latin1_bin")).
Also applies to: 131-144
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@pkg/util/schemacmp/charset_collation.go` around lines 63 - 76, The Join
method for charsetLattice (and the analogous collation lattice at 131-144) must
stop treating a Compare() error as a hard failure and instead, when Compare
reports incomparable, compute the least upper bound by scanning known lattice
members: implement Join to first try cmp := a.Compare(b) and return a or b when
comparable, but if Compare returns an incomparable error, iterate over a
canonical list of all charset lattice members (use or add a helper like
allCharsets()) and pick the minimal candidate c such that c.Compare(a) >= 0 and
c.Compare(b) >= 0 (use Compare to test), returning the smallest matching
candidate; if none found return the original error. Apply the same pattern for
collationLattice, and add regression tests asserting
Charset("utf8").Join(Charset("latin1")) succeeds and
Collation("utf8_bin").Join(Collation("latin1_bin")) succeeds.
There was a problem hiding this comment.
Oh, very complicated 😂 not sure we should support it
There was a problem hiding this comment.
@lance6716 Totally fair — the utf8 JOIN latin1 case is genuinely unusual as an upgrade path and isn't part of the scope in issue #67777. The two intended paths (utf8 → utf8mb4 and latin1 → utf8mb4) work correctly with the current Join() implementation since those are directly comparable via Compare().
The edge case only matters if a DM task ever encounters a table where one side has utf8 and the other has latin1 (with neither being utf8mb4), which seems unlikely in practice. If that scenario comes up later it can be addressed separately. Happy to drop the suggestion.
🧠 Learnings used
Learnt from: guo-shaoge
Repo: pingcap/tidb PR: 66349
File: tests/integrationtest/r/tpch.result:447-447
Timestamp: 2026-03-11T03:48:51.420Z
Learning: In `pkg/planner/core/joinorder/` (TiDB join reorder), cartesian edges should only be considered in two specific scenarios:
1. After the entire join reorder is complete and there are still disjoint join groups that need to be connected.
2. When a `leading` hint explicitly requires a cartesian join.
PR `#66213` attempted to fix case 2 but accidentally enabled case 1 as well. The correct logic ensures cartesian edges are gated to only these two situations.
Learnt from: Reminiscent
Repo: pingcap/tidb PR: 65367
File: pkg/planner/core/rule_join_reorder_dp.go:336-364
Timestamp: 2026-04-15T02:49:15.099Z
Learning: In `pkg/planner/core/rule_join_reorder_dp.go` (pingcap/tidb PR `#65367`), `findNodeIndexForColumns` intentionally returns `plannererrors.ErrInternal` when substituted equality-edge columns span multiple join-group leaves. This error is NOT user-visible: `optimizeRecursive` in `rule_join_reorder.go` wraps the DP call with a `fallbackOnErr` closure that, when `TiDBOptJoinReorderThroughProj` is enabled, catches any error from the DP solver and reruns join reorder with `TiDBOptJoinReorderThroughProj` disabled, treating the blocking projection as an atomic leaf. The hard-fail path is a deliberate retry sentinel, not a user-facing planner error. Do not flag this pattern as incorrect in future reviews.
Learnt from: guo-shaoge
Repo: pingcap/tidb PR: 67284
File: pkg/planner/core/joinorder/join_order.go:386-406
Timestamp: 2026-03-26T09:18:03.097Z
Learning: In `pkg/planner/core/joinorder/join_order.go` (pingcap/tidb), the DP join reorder `optimizeWithDetector` function has a sanity guard `nodeCount > 63` before allocating `bestPlan := make([]*Node, 1<<nodeCount)`. This guard is off-by-one: when `nodeCount == 63`, `1<<63` overflows a signed int on 64-bit Go causing a `makeslice` panic. The correct guard should be `nodeCount >= 63`. A follow-up to reduce the effective DP threshold below 63 is tracked in issue `#67325`.
Learnt from: cryo-zd
Repo: pingcap/tidb PR: 65532
File: pkg/planner/core/preprocess.go:1887-1898
Timestamp: 2026-03-12T08:41:44.509Z
Learning: In pkg/planner/core/preprocess.go (pingcap/tidb PR `#65532`), the `checkLockClauseTables` function intentionally accepts `FOR UPDATE OF <base_table>` even when the FROM item was introduced with an alias (`matched.alias.L != "" && !matchedByAlias`). Rather than returning an error (strict MySQL behavior), TiDB emits a warning to guide users toward the alias form. This backward-compatibility fallback is a deliberate acceptance criterion for the change — do not flag it as a MySQL-incompatibility bug in future reviews.
Learnt from: zimulala
Repo: pingcap/tidb PR: 67265
File: pkg/util/topsql/reporter/ru_datamodel_test.go:259-308
Timestamp: 2026-03-25T03:46:10.574Z
Learning: In `pkg/util/topsql/reporter/ru_datamodel_test.go` (pingcap/tidb PR `#67265`), `TestRUCollectingOthersWireLabelNoCollisionWithRuntimeUserShape` intentionally uses `"app127.0.0.1"` (not `othersUserWireLabel`) as the runtime user. The regression is scoped to runtime user shapes (`userhost` / empty string); broadening to arbitrary raw user strings matching the wire label is out of contract. The empty-user and merge-path regressions (`TestRUCollectingEmptyUserAndGlobalOthersRemainDistinct`, `TestRUCollectingMergeFromKeepsEmptyUserDistinctFromGlobalOthers`) plus aggregator-side assertion hardening collectively cover the fix.
Learnt from: joechenrh
Repo: pingcap/tidb PR: 66878
File: tests/realtikvtest/importintotest/import_into_test.go:253-267
Timestamp: 2026-03-11T06:26:14.065Z
Learning: In pingcap/tidb, PR `#66878` (cherry-pick of `#58401` to release-7.5) also bundles the follow-up nil-check fix from PR `#63146`. `TestOnUpdateColumn` in `tests/realtikvtest/importintotest/import_into_test.go` was cherry-picked from PR `#63146` and tests that `IMPORT INTO` on a table with an `ON UPDATE CURRENT_TIMESTAMP` column does NOT panic due to a nil `onDup` map in `resolveGeneratedColumns`. It is NOT intended to cover the transitive generated-column / index-inconsistency fix from `#58401/`#58400.
Learnt from: cryo-zd
Repo: pingcap/tidb PR: 65532
File: pkg/planner/core/preprocess.go:1906-1910
Timestamp: 2026-03-12T08:03:56.097Z
Learning: In pingcap/tidb (PR `#65532`), `buildSelect` in `pkg/planner/core/logical_plan_builder.go` derives the privilege-check DB for `FOR UPDATE OF` lock targets using `getLowerDB(tName.Schema, ...)`. For alias or unqualified lock targets whose `TableName.Schema` is empty after preprocessing, this falls back to `currentDB` and authorizes against the wrong database. The correct fix is to use `tNameW.DBInfo.Name.L` (with a nil guard) directly, since `tNameW` is already fully resolved by `checkLockClauseTables` in preprocess.go. This is consistent with how privilege checks for INSERT/DELETE use `tnW.DBInfo.Name.L` elsewhere in the codebase.
Learnt from: joechenrh
Repo: pingcap/tidb PR: 66878
File: pkg/planner/core/logical_plan_builder.go:6455-6472
Timestamp: 2026-03-11T06:29:00.122Z
Learning: Ensure code reviews verify the VirtualAssignmentsOffset semantics: the planner sets Update.VirtualAssignmentsOffset = len(update.List). The executor should only apply OrderedList[:VirtualAssignmentsOffset] when composing new rows, and only after the 'changed' check should it group/evaluate OrderedList[VirtualAssignmentsOffset:] per table. This pattern applies to files under pkg/planner/core and pkg/executor (e.g., common_plans.go and update.go). Reviewers should check that updates respect slicing behavior, that the offset is consistently derived from the planner, and that downstream code does not bypass the offset when creating new rows. Add tests validating both branches: the slice before the offset for new rows, and the per-table handling of the slice after the offset.
There was a problem hiding this comment.
@kennytm Should I follow this comment? Which means utf8 JOIN latin1 = utf8mb4. I think it's reasonable.
There was a problem hiding this comment.
Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!
There was a problem hiding this comment.
@lance6716 I think it is fine. The implication is that though, if we have 2 shards and the user goes
alter table t1 add column c text collate latin1_bin;
alter table t2 add column c text collate utf8mb3_bin;it will end up in downstream TiDB as (if i remembered what optimistic sharding works correctly):
alter table t add column c text collate latin1_bin;
alter table t modify column c text collate utf8mb4_bin; # ← result of joinThere was a problem hiding this comment.
My coding agent pushed a commit 67b337b for this comment but I haven't reviewed it 😂 I'll do a self review first
| collate := table[tableInfoTupleIndexCollate].(string) | ||
| if collate != "" { | ||
| cs, _, hasSuffix := strings.Cut(collate, "_") | ||
| ctx.WriteKeyWord(" CHARSET ") | ||
| ctx.WriteKeyWord(charset) | ||
| } | ||
| if collate := table[tableInfoTupleIndexCollate].(string); collate != "" { | ||
| ctx.WriteKeyWord(" COLLATE ") | ||
| ctx.WriteKeyWord(collate) | ||
| ctx.WriteKeyWord(cs) | ||
| if hasSuffix { | ||
| ctx.WriteKeyWord(" COLLATE ") | ||
| ctx.WriteKeyWord(collate) | ||
| } |
Signed-off-by: lance6716 <lance6716@gmail.com>
Signed-off-by: lance6716 <lance6716@gmail.com>
There was a problem hiding this comment.
♻️ Duplicate comments (1)
pkg/util/schemacmp/charset_collation.go (1)
63-76:⚠️ Potential issue | 🟠 MajorReturn the least upper bound for
utf8/latin1joins.Both
Join()implementations still treatutf8vslatin1as a hard incompatibility because they only forward toCompare(). That meansCharset("utf8").Join(Charset("latin1"))andCollation("utf8_bin").Join(Collation("latin1_bin"))still fail even though the lattice defined above them has a valid LUB:utf8mb4/utf8mb4_bin. In practice, table joins remain order-dependent for the mixed-shard upgrade path this PR is meant to unblock.💡 Proposed fix
func (a charsetLattice) Join(other Lattice) (Lattice, error) { b, ok := other.(charsetLattice) if !ok { return nil, typeMismatchError(a, other) } - cmp, err := a.Compare(b) - if err != nil { - return nil, err - } - if cmp >= 0 { + switch { + case a.value == b.value: return a, nil + case a.value == tidbcharset.CharsetUTF8MB4 && + (b.value == tidbcharset.CharsetUTF8 || b.value == tidbcharset.CharsetLatin1): + return a, nil + case b.value == tidbcharset.CharsetUTF8MB4 && + (a.value == tidbcharset.CharsetUTF8 || a.value == tidbcharset.CharsetLatin1): + return b, nil + case (a.value == tidbcharset.CharsetUTF8 && b.value == tidbcharset.CharsetLatin1) || + (a.value == tidbcharset.CharsetLatin1 && b.value == tidbcharset.CharsetUTF8): + return Charset(tidbcharset.CharsetUTF8MB4), nil + default: + return nil, incompatibleCharsetError(a.value, b.value) } - return b, nil } func (a collationLattice) Join(other Lattice) (Lattice, error) { b, ok := other.(collationLattice) if !ok { return nil, typeMismatchError(a, other) } - cmp, err := a.Compare(b) - if err != nil { - return nil, err + if a.suffix != b.suffix { + return nil, incompatibleCollationError(a.unwrapString(), b.unwrapString()) } - if cmp >= 0 { - return a, nil + + joinedCharset, err := a.charset.Join(b.charset) + if err != nil { + return nil, err } - return b, nil + return collationLattice{ + charset: joinedCharset.(charsetLattice), + suffix: a.suffix, + }, nil }Please also add a direct regression case for
utf8↔latin1andutf8_bin↔latin1_bin.Also applies to: 128-141
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/util/schemacmp/charset_collation.go` around lines 63 - 76, The Join implementations for charsetLattice (and the analogous collation lattice) currently treat utf8 vs latin1 as incompatible because they simply forward to Compare; change Join in charsetLattice (and the corresponding collation join) to detect the utf8↔latin1 pair and return the least upper bound (utf8mb4 for charset, utf8mb4_bin for collation) instead of erroring: after type-asserting other to charsetLattice and before returning errors from Compare, add logic to recognize when one lattice value is Charset("utf8") and the other is Charset("latin1") (and similarly for Collation("utf8_bin") vs Collation("latin1_bin")) and return the appropriate LUB instance; update the Compare-based fallback to remain for other cases. Also add regression tests that call Charset("utf8").Join(Charset("latin1")) and Collation("utf8_bin").Join(Collation("latin1_bin")) and assert the returned LUB equals utf8mb4 / utf8mb4_bin.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@pkg/util/schemacmp/charset_collation.go`:
- Around line 63-76: The Join implementations for charsetLattice (and the
analogous collation lattice) currently treat utf8 vs latin1 as incompatible
because they simply forward to Compare; change Join in charsetLattice (and the
corresponding collation join) to detect the utf8↔latin1 pair and return the
least upper bound (utf8mb4 for charset, utf8mb4_bin for collation) instead of
erroring: after type-asserting other to charsetLattice and before returning
errors from Compare, add logic to recognize when one lattice value is
Charset("utf8") and the other is Charset("latin1") (and similarly for
Collation("utf8_bin") vs Collation("latin1_bin")) and return the appropriate LUB
instance; update the Compare-based fallback to remain for other cases. Also add
regression tests that call Charset("utf8").Join(Charset("latin1")) and
Collation("utf8_bin").Join(Collation("latin1_bin")) and assert the returned LUB
equals utf8mb4 / utf8mb4_bin.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: a932c3a1-e67f-455d-a25c-102070788827
📒 Files selected for processing (7)
pkg/util/schemacmp/BUILD.bazelpkg/util/schemacmp/charset_collation.gopkg/util/schemacmp/charset_collation_test.gopkg/util/schemacmp/lattice.gopkg/util/schemacmp/table.gopkg/util/schemacmp/table_test.gopkg/util/schemacmp/type_test.go
🚧 Files skipped from review as they are similar to previous changes (2)
- pkg/util/schemacmp/type_test.go
- pkg/util/schemacmp/BUILD.bazel
There was a problem hiding this comment.
🧹 Nitpick comments (1)
pkg/util/schemacmp/charset_collation_test.go (1)
108-116: Add one negativeJoincase for suffix mismatch.
TestCollationJoincurrently covers only successful joins. Since the contract requires matching suffixes, adding one failing-join assertion will harden regression coverage for the join path itself (not just compare).Suggested test addition
func TestCollationJoin(t *testing.T) { join, err := Collation("utf8_bin").Join(Collation("latin1_bin")) require.NoError(t, err) require.Equal(t, "utf8mb4_bin", join.Unwrap()) join, err = Collation("latin1_bin").Join(Collation("utf8_bin")) require.NoError(t, err) require.Equal(t, "utf8mb4_bin", join.Unwrap()) + + _, err = Collation("utf8mb4_general_ci").Join(Collation("utf8mb4_0900_ai_ci")) + require.ErrorContains(t, err, "incompatible collation (utf8mb4_general_ci vs utf8mb4_0900_ai_ci)") }Based on learnings:
**/*.go : MUST add a regression test and verify it fails before fix and passes after fix for bug fixes.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/util/schemacmp/charset_collation_test.go` around lines 108 - 116, Add a negative test case to TestCollationJoin to assert that Collation.Join returns an error when suffixes mismatch: call Collation("utf8_general_ci").Join(Collation("utf8_bin")) (or another pair with differing suffixes), require an error (require.Error/require.NotNil) and optionally assert the error message contains the suffix-mismatch hint; this targets the Collation.Join behavior (and the TestCollationJoin test) so the test fails before the fix and passes after.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@pkg/util/schemacmp/charset_collation_test.go`:
- Around line 108-116: Add a negative test case to TestCollationJoin to assert
that Collation.Join returns an error when suffixes mismatch: call
Collation("utf8_general_ci").Join(Collation("utf8_bin")) (or another pair with
differing suffixes), require an error (require.Error/require.NotNil) and
optionally assert the error message contains the suffix-mismatch hint; this
targets the Collation.Join behavior (and the TestCollationJoin test) so the test
fails before the fix and passes after.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 96243436-ea7b-488c-98bd-1ec62c37aa98
📒 Files selected for processing (3)
pkg/util/schemacmp/charset_collation.gopkg/util/schemacmp/charset_collation_test.gopkg/util/schemacmp/type_test.go
✅ Files skipped from review due to trivial changes (1)
- pkg/util/schemacmp/type_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
- pkg/util/schemacmp/charset_collation.go
Signed-off-by: lance6716 <lance6716@gmail.com>
Signed-off-by: lance6716 <lance6716@gmail.com>
There was a problem hiding this comment.
🧹 Nitpick comments (1)
pkg/util/schemacmp/charset_collation.go (1)
155-158: ReturnjoinErrdirectly in the collation join failure path.Line 157 currently returns the earlier
errfroma.Compare(b). ReturningjoinErris clearer and future-proof if charset join starts producing richer errors.♻️ Proposed refactor
joinCharset, joinErr := a.charset.Join(b.charset) if joinErr != nil { - return nil, err + return nil, joinErr }As per coding guidelines, "Keep error handling actionable and contextual; avoid silently swallowing errors."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/util/schemacmp/charset_collation.go` around lines 155 - 158, The collation join error path returns the wrong variable; when calling a.charset.Join(b.charset) and assigning joinCharset, joinErr, replace the erroneous return of the earlier err (from a.Compare(b)) with returning joinErr so the actual charset join error is propagated; update the failure branch in the function that calls a.Compare(b) and a.charset.Join(b.charset) to return joinErr instead of err.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@pkg/util/schemacmp/charset_collation.go`:
- Around line 155-158: The collation join error path returns the wrong variable;
when calling a.charset.Join(b.charset) and assigning joinCharset, joinErr,
replace the erroneous return of the earlier err (from a.Compare(b)) with
returning joinErr so the actual charset join error is propagated; update the
failure branch in the function that calls a.Compare(b) and
a.charset.Join(b.charset) to return joinErr instead of err.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: e1a27928-23b5-422c-b143-82dc039ca920
📒 Files selected for processing (3)
pkg/util/schemacmp/charset_collation.gopkg/util/schemacmp/charset_collation_test.gopkg/util/schemacmp/table.go
🚧 Files skipped from review as they are similar to previous changes (2)
- pkg/util/schemacmp/table.go
- pkg/util/schemacmp/charset_collation_test.go
What problem does this PR solve?
Issue Number: close #67777
Problem Summary:
pkg/util/schemacmppreviously treatedcharset/collationas strict singletons, which makes schema compare/join fail for common upgrade paths likeutf8 (utf8mb3) -> utf8mb4orlatin1 -> utf8mb4.What changed and how does it work?
CharsetandCollationlattices with ordering:utf8 (utf8mb3) < utf8mb4latin1 < utf8mb4utf8_bin < utf8mb4_bin).types.FieldTypeandmodel.TableInfoinpkg/util/schemacmp.Check List
Tests
Side effects
Documentation
Release note
Please refer to Release Notes Language Style Guide to write a quality release note.
Summary by CodeRabbit
New Features
Bug Fixes
Tests