
feat(clickhouse): sync.mode: mirror — differential delete (#340 Step 3)#598

Merged: masukai merged 1 commit into main from feat/340-clickhouse-mirror-mode on May 31, 2026


@masukai commented May 31, 2026

Contributor

Summary

  • sync.mode: mirror for ClickHouse — Step 3 of #340, follow-up to #596 (Postgres) + #597 (MySQL)
  • Same application-side diff semantics: accumulate upsert_key tuples in load(), issue a single mutation from finalize_sync()
  • All safety paths preserved (empty-source short-circuit, state reset, row-error exclusion, missing-upsert_key ValueError)
  • 12 new unit tests in tests/unit/test_clickhouse_mirror_mode.py
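
The accumulate-then-delete flow described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `MirrorKeyTracker`, `record`, `finalize`, and the `delete_fn` callback are hypothetical names, and only the key accumulation, the empty-source short-circuit, and the state reset are modelled.

```python
from typing import Any, Callable


class MirrorKeyTracker:
    """Hypothetical helper: collects upsert_key tuples seen during one sync run."""

    def __init__(self, upsert_key: list[str]) -> None:
        self.upsert_key = upsert_key
        self.seen: set[tuple[str, ...]] = set()

    def record(self, rows: list[dict[str, Any]]) -> None:
        # Called once per batch (the role load() plays in the PR description).
        # A set dedupes keys that appear in overlapping batches.
        for row in rows:
            self.seen.add(tuple(str(row[col]) for col in self.upsert_key))

    def finalize(self, delete_fn: Callable[[list[tuple[str, ...]]], None]) -> bool:
        # Called once at the end of the run (the role of finalize_sync()).
        try:
            if not self.seen:
                # Empty-source short-circuit: with no collected keys,
                # "NOT IN (nothing)" would match every row and wipe the table.
                return False
            delete_fn(sorted(self.seen))  # one DELETE for the whole run
            return True
        finally:
            self.seen = set()  # state reset so the next run starts clean
```

Keeping the reset in a `finally` block is one way to make sure a failed DELETE does not leak keys into the next run; whether the real implementation resets state this way is an assumption.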

clickhouse_connect translation notes

ClickHouse-specific differences from the Postgres / MySQL pattern:

| Aspect | Postgres / MySQL | ClickHouse |
| --- | --- | --- |
| DELETE form | plain DELETE FROM ... WHERE | ALTER TABLE ... DELETE WHERE mutation |
| Sync semantics | implicit (single statement) | requires mutations_sync=1 in settings= |
| Parameter binding | psycopg2 / pymysql %s placeholders | clickhouse_connect native {name:Type} with Array(...) types |
| Index usage | direct column comparison hits index | toString() cast skips column index |

The toString() cast on both sides of the comparison lets the same code path work for any column type, but the comparison can no longer use an index on the key columns. Mirror mode is intended for small/medium reference tables, not high-volume fact tables; the docstring and CHANGELOG entry call this out explicitly to discourage misuse.
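
As an illustration of the statement shape discussed above, the sketch below builds the mutation text and its parameter dict for single and composite keys. The function name `build_mirror_delete`, the parameter name `keys`, and the exact SQL layout are assumptions for illustration; the PR's actual statement may differ.

```python
def build_mirror_delete(
    table: str, upsert_key: list[str], keys: list[tuple]
) -> tuple[str, dict]:
    """Hypothetical builder for the mirror-mode DELETE mutation."""
    if not upsert_key or not keys:
        raise ValueError("upsert_key and keys must both be non-empty")

    if len(upsert_key) == 1:
        # Single column: compare one stringified column against Array(String).
        lhs = f"toString(`{upsert_key[0]}`)"
        param_type = "Array(String)"
        values = [str(k[0]) for k in keys]
    else:
        # Composite key: compare a tuple of stringified columns against
        # Array(Tuple(String, ...)) with one String per key column.
        cols = ", ".join(f"toString(`{c}`)" for c in upsert_key)
        lhs = f"({cols})"
        inner = ", ".join(["String"] * len(upsert_key))
        param_type = f"Array(Tuple({inner}))"
        values = [tuple(str(v) for v in k) for k in keys]

    # Doubled braces emit the literal {name:Type} placeholder.
    sql = f"ALTER TABLE {table} DELETE WHERE {lhs} NOT IN {{keys:{param_type}}}"
    return sql, {"keys": values}


sql, params = build_mirror_delete("`db`.`users`", ["id"], [(1,), (2,)])
# With clickhouse_connect, the pair would presumably be sent along the lines of
#   client.command(sql, parameters=params, settings={"mutations_sync": 1})
# so that the call blocks until the mutation has finished.
```

Stringifying the values on the Python side keeps the parameter types fixed at `String` regardless of the source column type, which mirrors the trade-off in the table: one code path, no index use.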

upsert_key handling

ClickHouseDestinationConfig.upsert_key is list[str] | None (informational only for the INSERT path, where dedup is handled by ReplacingMergeTree at merge time). Unlike Postgres / MySQL where the field is required at the config layer, ClickHouse mirror mode does its own runtime guard in load() that raises ValueError early — before any INSERT touches the table — when mirror mode is requested without a populated key.
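
A fail-fast guard of this kind might look like the sketch below. `DestConfig` is a hypothetical stand-in for ClickHouseDestinationConfig, and `ensure_mirror_key` and the error message are illustrative, not taken from the PR.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class DestConfig:
    """Hypothetical stand-in for ClickHouseDestinationConfig."""

    table: str
    upsert_key: list[str] | None = None  # optional at the config layer


def ensure_mirror_key(sync_mode: str, config: DestConfig) -> list[str]:
    """Raise before any INSERT if mirror mode has no usable key."""
    if sync_mode == "mirror" and not config.upsert_key:
        # Both None and an empty list count as "not populated".
        raise ValueError("sync.mode: mirror requires a non-empty upsert_key")
    return config.upsert_key or []
```

Running the check at the top of load() means a misconfigured sync fails before any row is written, so the table is never left in a state where inserts succeeded but the differential delete cannot run.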

Test plan

  • tests/unit/test_clickhouse_mirror_mode.py — 12 tests pass locally
  • tests/unit/test_clickhouse_destination.py — 38 existing tests still pass (swap path coexistence verified)
  • tests/unit/test_mysql_mirror_mode.py + test_postgres_mirror_mode.py — Step 1 + 2 unchanged, all 23 tests pass
  • ruff check — clean on drt/destinations/clickhouse.py + new test file
  • mypy — clean on drt/destinations/clickhouse.py
  • CI green (full matrix incl. clickhouse extras unlocked by #596)
  • codecov patch coverage

Follow-ups (this issue)

🤖 Generated with Claude Code

ClickHouse counterpart to the Postgres / MySQL mirror mode shipped in
#596 / #597. Same application-side diff semantics: accumulate
upsert_key tuples across batches in load(), then issue a single ALTER
TABLE ... DELETE WHERE key NOT IN (collected) mutation from
finalize_sync() with mutations_sync=1 so the call blocks until the
DELETE completes.

clickhouse_connect's native {name:Type} parameter substitution accepts
Array(String) (single column) and Array(Tuple(String, ...)) (composite)
directly — so unlike Postgres / MySQL where we assembled placeholders
manually, the call site is one parameter dict. Both column references
and parameter values are coerced via toString() so the comparison
works regardless of source column type — at the cost of not using any
index on upsert_key. Mirror mode is intended for small/medium
reference tables; high-volume fact tables should use the temp-table
strategy planned as a follow-up to #340.

ClickHouseDestinationConfig.upsert_key is list[str] | None (it's
informational only for the existing INSERT path where dedup is handled
by ReplacingMergeTree at merge time), so the runtime guard in load()
raises ValueError early when mirror mode is requested without a
populated key — fail-fast before any INSERT touches the table.

New _quote_ident helper added for db-qualified table identifiers
(`db`.`table`), matching the v0.7.4-hardened MySQL pattern.
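
A helper in this spirit could be sketched as below. This is an assumption-laden illustration rather than the PR's _quote_ident: the name `quote_ident` and the choice to reject backticks and backslashes outright (instead of escaping them) are made up for this example.

```python
def quote_ident(name: str) -> str:
    """Hypothetical sketch: backtick-quote `table` or `db.table`."""
    parts = name.split(".")
    if len(parts) > 2:
        raise ValueError(f"expected 'table' or 'db.table', got {name!r}")
    for part in parts:
        # Reject rather than escape: an empty part or an embedded
        # backtick/backslash is treated as unsafe in this sketch.
        if not part or "`" in part or "\\" in part:
            raise ValueError(f"unsafe identifier: {name!r}")
    # Quote each part separately so the dot stays outside the backticks.
    return ".".join(f"`{part}`" for part in parts)
```

Quoting each part separately is what produces `db`.`table` instead of the single identifier `db.table`, which would refer to a table literally named with a dot.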

12 unit tests in tests/unit/test_clickhouse_mirror_mode.py cover key
accumulation, dedupe across overlapping batches, database-qualified
DELETE shape, single + composite key DELETE structure, the empty-source
safety path, state reset, the missing-upsert_key ValueError, row-error
skip path, and coexistence with the existing EXCHANGE TABLES swap-
finalize path.

Snowflake follows in the next PR.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

codecov Bot commented May 31, 2026


Codecov Report

✅ All modified and coverable lines are covered by tests.


@masukai masukai merged commit 4412a2b into main May 31, 2026
8 checks passed
@masukai masukai deleted the feat/340-clickhouse-mirror-mode branch May 31, 2026 14:44
@github-actions github-actions Bot locked and limited conversation to collaborators May 31, 2026