feat(snowflake): sync.mode: mirror — differential delete (#340 Step 4)#599

Merged
masukai merged 1 commit into main from feat/340-snowflake-mirror-mode
May 31, 2026

@masukai masukai commented May 31, 2026

Summary

  • sync.mode: mirror for Snowflake — Step 4 of #340, follow-up to #596 (Postgres) + #597 (MySQL) + #598 (ClickHouse)
  • Final SQL destination in the #340 set (feat: sync.mode: mirror, differential delete for stale rows). Closes the issue for the SQL family. BigQuery is a separate destination tracked via contributor PR #584 (feat: add BigQuery destination); the temp-table strategy for high-cardinality tables remains a future follow-up.
  • Same application-side diff semantics: accumulate upsert_key tuples in load(), issue a single DELETE FROM <db>.<schema>.<table> WHERE key NOT IN (collected) from finalize_sync()
  • 12 new unit tests in tests/unit/test_snowflake_mirror_mode.py
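
To make the diff semantics in the summary concrete, here is a minimal pure-Python sketch of what "accumulate keys, then delete everything else" means. The helper names `collect_keys` and `stale_keys` and the sample rows are illustrative assumptions, not code from this PR; the real destination issues a SQL DELETE rather than computing a set difference in memory.

```python
from typing import Dict, Iterable, List, Sequence, Set, Tuple

Key = Tuple[object, ...]


def collect_keys(batches: Iterable[List[Dict]], upsert_key: Sequence[str]) -> Set[Key]:
    """Accumulate one key tuple per source row across all batches (deduplicated)."""
    seen: Set[Key] = set()
    for batch in batches:
        for row in batch:
            seen.add(tuple(row[col] for col in upsert_key))
    return seen


def stale_keys(destination_keys: Iterable[Key], source_keys: Set[Key]) -> Set[Key]:
    """Keys present in the destination but absent from the source: the rows mirror mode removes."""
    return set(destination_keys) - source_keys


# Hypothetical data: two source batches, and a destination that still holds id=3.
batches = [
    [{"id": 1, "score": 10}, {"id": 2, "score": 20}],
    [{"id": 2, "score": 25}, {"id": 4, "score": 40}],
]
source = collect_keys(batches, ["id"])
destination = {(1,), (2,), (3,)}
to_delete = stale_keys(destination, source)
```

In this example only the row with `id = 3` is stale, which is the row a `WHERE key NOT IN (collected)` delete would target.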

Snowflake-specific wrinkles

| Aspect | Behaviour |
| --- | --- |
| Pre-existing config.mode field | Snowflake's destination has its own config.mode: insert \| merge (orthogonal to sync_options.mode). sync.mode: mirror forces the MERGE write path regardless of config.mode; users only set destination.upsert_key + sync.mode: mirror. |
| No prior finalize_sync | Snowflake had no swap-replace path. This PR adds the first finalize_sync method; it returns None for any non-mirror mode so the engine dispatch is unchanged. |
| %s placeholders | Same family as psycopg2 / pymysql. DELETE shape is identical to MySQL Step 2: explicit placeholder list (single column flat / composite row-major). |
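
The mode-forcing rule in the first row can be read as a small resolution function. This is only a sketch under assumptions: the function signature is hypothetical (the PR mentions an `effective_mode` refactor but does not show it), and `"other"` stands in for any non-mirror sync mode.

```python
def effective_mode(config_mode: str, sync_mode: str) -> str:
    """Resolve which write path to use (hypothetical signature).

    Mirror semantics require upsert, so sync.mode: mirror overrides
    config.mode and always selects the MERGE path.
    """
    if sync_mode == "mirror":
        return "merge"
    return config_mode


# Mirror wins even when the destination is configured for plain INSERT.
forced = effective_mode("insert", "mirror")
# Any non-mirror sync mode leaves config.mode untouched ("other" is a placeholder value).
untouched = effective_mode("insert", "other")
```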

Test plan

  • tests/unit/test_snowflake_mirror_mode.py — 12 tests pass locally
  • tests/unit/test_snowflake_destination.py — 27 existing tests still pass (no regression from the effective_mode refactor)
  • All prior mirror suites — Postgres 11 + MySQL 12 + ClickHouse 12 = 35 tests, all pass
  • ruff check — clean on drt/destinations/snowflake.py + new test file
  • mypy — clean on drt/destinations/snowflake.py
  • CI green (uses sys.modules injection — no snowflake-connector-python extras needed for tests)
  • codecov patch coverage

Coverage of #340 after this PR

| Destination | PR | Status |
| --- | --- | --- |
| Postgres | #596 | ✅ shipped |
| MySQL | #597 | ✅ shipped |
| ClickHouse | #598 | ✅ shipped |
| Snowflake | this PR | 🟢 ready |
| BigQuery | #584 (contributor) | ⏳ contributor cycle |

Follow-ups (this issue)

🤖 Generated with Claude Code

Final SQL destination in the #340 set. Same application-side diff
semantics as Postgres / MySQL / ClickHouse: accumulate upsert_key
tuples across batches during load(), then issue a single
DELETE FROM <db>.<schema>.<table> WHERE key NOT IN (collected)
from finalize_sync().

Snowflake-specific wrinkles:

1. The destination has a pre-existing `config.mode: insert | merge`
   field (orthogonal to `sync_options.mode`) controlling whether the
   write path is plain INSERT or staging-table-plus-MERGE. Since mirror
   semantics intrinsically require upsert, sync.mode: mirror forces the
   MERGE write path regardless of config.mode — users only need to set
   destination.upsert_key + sync.mode: mirror.

2. Snowflake had no `finalize_sync` method previously (no swap-replace
   path). This PR adds it; it returns None for any non-mirror mode so
   the engine's existing dispatch is unchanged.

3. The snowflake-connector-python driver uses %s placeholders (same
   family as psycopg2 / pymysql) and does not auto-expand a tuple-of-
   tuples, so the DELETE placeholder shape is built explicitly:

   - single column: WHERE col NOT IN (%s, %s, ...)
   - composite:     WHERE (c1, c2) NOT IN ((%s, %s), (%s, %s), ...)
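
A minimal sketch of how the two placeholder shapes above could be built, assuming keys arrive as tuples. The function name `build_mirror_delete` and its signature are illustrative, not taken from drt/destinations/snowflake.py. Identifiers are interpolated as-is here; a real implementation would need to validate or quote them.

```python
from typing import List, Sequence, Tuple


def build_mirror_delete(
    table: str, key_columns: Sequence[str], keys: Sequence[Tuple]
) -> Tuple[str, List]:
    """Return (sql, params) for a NOT IN delete using explicit %s placeholders."""
    if not key_columns:
        raise ValueError("key_columns must not be empty")
    if not keys:
        raise ValueError("keys must not be empty (an empty NOT IN list is invalid SQL)")

    if len(key_columns) == 1:
        # Single column: flat list, WHERE col NOT IN (%s, %s, ...)
        placeholders = ", ".join(["%s"] * len(keys))
        where = f"{key_columns[0]} NOT IN ({placeholders})"
    else:
        # Composite: one (%s, %s) group per key, WHERE (c1, c2) NOT IN ((%s, %s), ...)
        group = "(" + ", ".join(["%s"] * len(key_columns)) + ")"
        placeholders = ", ".join([group] * len(keys))
        where = f"({', '.join(key_columns)}) NOT IN ({placeholders})"

    # Row-major flattening so parameters line up with the placeholders.
    params = [value for key in keys for value in key]
    return f"DELETE FROM {table} WHERE {where}", params


single_sql, single_params = build_mirror_delete(
    "ANALYTICS.PUBLIC.USER_SCORES", ["id"], [(1,), (2,)]
)
composite_sql, composite_params = build_mirror_delete(
    "ANALYTICS.PUBLIC.USER_SCORES", ["c1", "c2"], [(1, "a"), (2, "b")]
)
```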

Safety paths preserved from prior steps:
- empty source → _mirror_keys stays None → finalize returns None
  (transient empty source can't wipe the destination)
- finalize_sync resets _mirror_keys so re-runs start fresh
- rows that failed during the staging INSERT are excluded from the key
  set (only successfully-staged keys count as "source state")
- ValueError at load time, before any Snowflake write, when upsert_key
  is empty
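
The safety paths listed above can be sketched as a small state holder. This is an illustration under assumptions, not the PR's code: the class name `MirrorKeyTracker`, the `failed_indexes` parameter, and the choice to return the collected keys from `finalize_sync` are all hypothetical (the real method issues the DELETE and its return value is not shown here).

```python
from typing import Dict, Iterable, List, Optional, Sequence, Set, Tuple


class MirrorKeyTracker:
    """Hypothetical stand-in for the destination's mirror-mode bookkeeping."""

    def __init__(self, upsert_key: Sequence[str], sync_mode: str) -> None:
        self.upsert_key = list(upsert_key)
        self.sync_mode = sync_mode
        # None means "no source rows seen"; this is distinct from an empty set.
        self._mirror_keys: Optional[Set[Tuple]] = None

    def load(self, rows: List[Dict], failed_indexes: Iterable[int] = ()) -> None:
        if self.sync_mode != "mirror":
            return
        if not self.upsert_key:
            # Fail before any write would happen.
            raise ValueError("sync.mode: mirror requires destination.upsert_key")
        failed = set(failed_indexes)
        for index, row in enumerate(rows):
            if index in failed:
                continue  # rows that failed staging do not count as source state
            if self._mirror_keys is None:
                self._mirror_keys = set()
            self._mirror_keys.add(tuple(row[col] for col in self.upsert_key))

    def finalize_sync(self) -> Optional[Set[Tuple]]:
        keys, self._mirror_keys = self._mirror_keys, None  # reset for the next run
        if self.sync_mode != "mirror" or keys is None:
            return None  # non-mirror mode or empty source: delete nothing
        return keys  # the keys a real finalize would keep via NOT IN


tracker = MirrorKeyTracker(["id"], "mirror")
empty_result = tracker.finalize_sync()  # no load() calls yet
tracker.load([{"id": 1}, {"id": 2}], failed_indexes=[1])
kept = tracker.finalize_sync()
after_reset = tracker.finalize_sync()
```

Keeping `None` separate from an empty set is what prevents a transiently empty source from turning into "delete every row".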

12 unit tests in tests/unit/test_snowflake_mirror_mode.py cover key
accumulation, the merge-path forcing (CREATE TEMP TABLE + MERGE INTO
fired even with config.mode: insert), dedupe, schema-qualified DELETE
shape against ANALYTICS.PUBLIC.USER_SCORES, single + composite key
forms, empty-source safety, state reset, missing-upsert_key ValueError,
row-error skip path, and the non-mirror finalize short-circuit. Tests
use sys.modules injection (matching test_snowflake_destination.py
pattern), so no snowflake-connector-python install is required.
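
For readers unfamiliar with the sys.modules injection pattern mentioned above, a minimal sketch follows. It assumes only the standard library; the helper name `make_fake_snowflake` and the `account="example"` argument are illustrative, and the actual fixtures in test_snowflake_destination.py may be structured differently.

```python
import sys
import types
from unittest.mock import MagicMock, patch


def make_fake_snowflake():
    """Build stand-in `snowflake` and `snowflake.connector` modules (hypothetical helper)."""
    pkg = types.ModuleType("snowflake")
    connector = types.ModuleType("snowflake.connector")
    connector.connect = MagicMock(name="connect")
    # Attribute access `snowflake.connector` needs this link set explicitly,
    # because the import system only sets it when it loads the submodule itself.
    pkg.connector = connector
    return pkg, connector


pkg, connector = make_fake_snowflake()

# patch.dict restores sys.modules on exit, so the fake never leaks into other tests.
with patch.dict(sys.modules, {"snowflake": pkg, "snowflake.connector": connector}):
    import snowflake.connector as sf  # resolves to the fake, no real driver needed

    conn = sf.connect(account="example")
    cursor = conn.cursor()
    cursor.execute("DELETE FROM T WHERE id NOT IN (%s, %s)", [1, 2])
    executed_sql, executed_params = cursor.execute.call_args.args
```

Because `connect` is a `MagicMock`, the test can inspect exactly which SQL and parameters the code under test passed to the cursor.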

Closes #340 for the SQL destination set; temp-table strategy for
high-cardinality tables remains a future follow-up.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

codecov Bot commented May 31, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

@masukai masukai merged commit 2e96354 into main May 31, 2026
8 checks passed
@masukai masukai deleted the feat/340-snowflake-mirror-mode branch May 31, 2026 23:28
@github-actions github-actions Bot locked and limited conversation to collaborators May 31, 2026