feat(snowflake): sync.mode: mirror - differential delete (#340 Step 4) #599
Merged
Final SQL destination in the #340 set. Same application-side diff semantics as Postgres / MySQL / ClickHouse: accumulate upsert_key tuples across batches during load(), then issue a single DELETE FROM <db>.<schema>.<table> WHERE key NOT IN (collected) from finalize_sync().

Snowflake-specific wrinkles:

1. The destination has a pre-existing `config.mode: insert | merge` field (orthogonal to `sync_options.mode`) controlling whether the write path is plain INSERT or staging-table-plus-MERGE. Since mirror semantics intrinsically require upsert, sync.mode: mirror forces the MERGE write path regardless of config.mode; users only need to set destination.upsert_key + sync.mode: mirror.
2. Snowflake had no `finalize_sync` method previously (no swap-replace path). This PR adds it; it returns None for any non-mirror mode so the engine's existing dispatch is unchanged.
3. The snowflake-connector-python driver uses %s placeholders (same family as psycopg2 / pymysql) and does not auto-expand a tuple-of-tuples, so the DELETE placeholder shape is built explicitly:
   - single column: WHERE col NOT IN (%s, %s, ...)
   - composite: WHERE (c1, c2) NOT IN ((%s, %s), (%s, %s), ...)
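The placeholder shapes in wrinkle 3 can be sketched roughly as below. This is an illustration only, not the code merged in this PR: the function name `build_mirror_delete` and its signature are hypothetical, and it assumes the table and column names have already been validated or quoted by the caller.

```python
from typing import Any, List, Sequence, Tuple


def build_mirror_delete(
    table: str,
    key_columns: Sequence[str],
    keys: Sequence[Tuple[Any, ...]],
) -> Tuple[str, List[Any]]:
    """Build a DELETE ... NOT IN statement with explicit %s placeholders.

    Hypothetical helper: identifiers are assumed to be trusted/quoted upstream.
    """
    if not key_columns:
        raise ValueError("upsert_key must not be empty")
    if not keys:
        # Refuse to build a statement that would match every row.
        raise ValueError("no keys collected; refusing to build a DELETE")

    if len(key_columns) == 1:
        # Single column: WHERE col NOT IN (%s, %s, ...)
        placeholders = ", ".join(["%s"] * len(keys))
        where = f"{key_columns[0]} NOT IN ({placeholders})"
    else:
        # Composite: WHERE (c1, c2) NOT IN ((%s, %s), (%s, %s), ...)
        row = "(" + ", ".join(["%s"] * len(key_columns)) + ")"
        placeholders = ", ".join([row] * len(keys))
        where = f"({', '.join(key_columns)}) NOT IN ({placeholders})"

    # Flatten the key tuples so the parameters line up with the placeholders.
    params = [value for key in keys for value in key]
    return f"DELETE FROM {table} WHERE {where}", params
```

The returned SQL string and flat parameter list are what one would pass together to a DB-API style `cursor.execute(sql, params)` call.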
Safety paths preserved from prior steps:

- empty source → _mirror_keys stays None → finalize returns None (transient empty source can't wipe the destination)
- finalize_sync resets _mirror_keys so re-runs start fresh
- rows that failed during the staging INSERT are excluded from the key set (only successfully-staged keys count as "source state")
- ValueError at load time, before any Snowflake write, when upsert_key is empty

12 unit tests in tests/unit/test_snowflake_mirror_mode.py cover key accumulation, the merge-path forcing (CREATE TEMP TABLE + MERGE INTO fired even with config.mode: insert), dedupe, schema-qualified DELETE shape against ANALYTICS.PUBLIC.USER_SCORES, single + composite key forms, empty-source safety, state reset, missing-upsert_key ValueError, row-error skip path, and the non-mirror finalize short-circuit. Tests use sys.modules injection (matching test_snowflake_destination.py pattern), so no snowflake-connector-python install is required.

Closes #340 for the SQL destination set; temp-table strategy for high-cardinality tables remains a future follow-up.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
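The safety paths listed in the commit message above can be illustrated with a small stand-alone sketch. The class `MirrorKeyTracker` and its method names are hypothetical and exist only for this example; only the `_mirror_keys` attribute name and the described behaviour come from the commit message, and the real destination class may be structured differently.

```python
from typing import Any, Dict, Iterable, Optional, Sequence, Set, Tuple


class MirrorKeyTracker:
    """Hypothetical stand-in for the key bookkeeping described above."""

    def __init__(self, upsert_key: Sequence[str]) -> None:
        self.upsert_key = list(upsert_key)
        # None means "no source rows seen yet", which is distinct from an empty set.
        self._mirror_keys: Optional[Set[Tuple[Any, ...]]] = None

    def record_batch(self, staged_rows: Iterable[Dict[str, Any]]) -> None:
        """Record keys of rows that were staged successfully (failed rows are not passed in)."""
        if not self.upsert_key:
            # Fail before any write would happen.
            raise ValueError("sync.mode: mirror requires a non-empty upsert_key")
        for row in staged_rows:
            if self._mirror_keys is None:
                self._mirror_keys = set()
            # A set deduplicates keys repeated across batches.
            self._mirror_keys.add(tuple(row[col] for col in self.upsert_key))

    def finalize(self) -> Optional[Set[Tuple[Any, ...]]]:
        """Return the collected keys, or None if the source was empty; always reset state."""
        keys = self._mirror_keys
        self._mirror_keys = None  # re-runs start fresh
        return keys
```

In this sketch a `None` result from `finalize()` signals that no DELETE should be issued, which mirrors the "transient empty source can't wipe the destination" guarantee.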
Codecov Report
✅ All modified and coverable lines are covered by tests.
Summary
- `sync.mode: mirror` for Snowflake: Step 4 of #340, follow-up to #596 (Postgres) + #597 (MySQL) + #598 (ClickHouse)
- Accumulate `upsert_key` tuples in `load()`, issue a single `DELETE FROM <db>.<schema>.<table> WHERE key NOT IN (collected)` from `finalize_sync()`
- Unit tests in `tests/unit/test_snowflake_mirror_mode.py`

Snowflake-specific wrinkles
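- `config.mode` field: pre-existing `config.mode: insert | merge` (orthogonal to `sync_options.mode`). `sync.mode: mirror` forces the MERGE write path regardless of `config.mode`; users only set `destination.upsert_key` + `sync.mode: mirror`.
- `finalize_sync`: adds a `finalize_sync` method; returns `None` for any non-mirror mode so the engine dispatch is unchanged.
- `%s` placeholders: the driver does not auto-expand a tuple-of-tuples, so the DELETE placeholder shape is built explicitly for single-column and composite keys.

Test plan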
- `tests/unit/test_snowflake_mirror_mode.py`: 12 tests pass locally
- `tests/unit/test_snowflake_destination.py`: 27 existing tests still pass (no regression from the `effective_mode` refactor)
- `ruff check`: clean on `drt/destinations/snowflake.py` + new test file
- `mypy`: clean on `drt/destinations/snowflake.py`
- `sys.modules` injection: no `snowflake-connector-python` extras needed for tests

Coverage of #340 after this PR
Follow-ups (this issue)
🤖 Generated with Claude Code