feat(tests): file destination contract tests (Step 2a of #364 follow-up)#594
Merged
Merged
Conversation
…low-up) Mirrors PR #593's HTTP destination contract suite for destinations that write to disk. Same three universal invariants, same parametrised shape — only the "no side effects on empty input" check differs (a directory snapshot instead of an httpserver request log). What ships: - tests/contracts/test_destination_file_empty_batch.py - Parametrised over FileDestination in CSV / JSON / JSONL modes (9 tests = 3 formats × 3 contracts). - Parquet destination is appended conditionally — guarded by a `try / import pandas, pyarrow / append` block so the suite still collects when the [parquet] extras aren't installed. Adds 3 more tests when those deps are present. Demonstrates the extension pattern for destinations that depend on optional extras (relevant for the SQL suite in Step 2b: bigquery / snowflake / etc.). The three contracts: 1. isinstance(dest, Destination) — Protocol satisfaction. 2. load([]) returns SyncResult(success=0, failed=0, skipped=0). 3. load([]) leaves the filesystem untouched — no new files in tmp_path, not even a 0-byte placeholder. A 0-byte output would still indicate the destination opened a file handle before checking the record count, which is the bug the contract is designed to catch. Verified: pytest tests/contracts/ → 21 passed (12 from #593 HTTP + 9 from this PR file). make lint clean (115 source files). The Step 2b SQL suite (Postgres / MySQL / ClickHouse / Snowflake) needs a DB mock harness and is deferred to its own PR — those destinations use psycopg2 / pymysql / clickhouse-connect specifically, so a single in-memory substitute isn't viable. Likely path is per-destination connection mocking via unittest.mock. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Codecov Report✅ All modified and coverable lines are covered by tests. 📢 Thoughts on this report? Let us know! |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to subscribe to this conversation on GitHub.
Already have an account?
Sign in.
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Step 2a of the destination contract framework started in #593. Mirrors the HTTP suite for destinations that write to disk — same three universal invariants, same parametrised shape; only the "no side effects on empty input" check differs (filesystem snapshot vs HTTP request log).
What ships
tests/contracts/test_destination_file_empty_batch.py— newFileDestinationin CSV / JSON / JSONL modes × 3 contractsParquetDestinationwhen[parquet]extras are present (pandas+pyarrow)The three contracts (same as #593)
isinstance(dest, Destination)— Protocol satisfaction.load([])returnsSyncResult(success=0, failed=0, skipped=0).load([])leaves the filesystem untouched — no new files intmp_path, not even a 0-byte placeholder. A 0-byte output would still indicate the destination opened a file handle before checking the record count — exactly the bug this contract catches.The conditional pattern (for Step 2b prep)
The parquet entry shows the pattern that the SQL suite will use for
[bigquery]/[snowflake]/ etc.:CI's minimal install (
[dev,mcp,duckdb]) skips it; a maintainer with[parquet]installed sees the extra coverage.Verification
pytest tests/contracts/→ 21 passed in 0.93s (12 HTTP from feat(tests): destination contract test framework + empty-batch invariant (Step 1) #593 + 9 file from this PR; parquet skipped on this venv)make lintclean (115 source files; ruff + mypy)### Addedexercises the chore(ci): warn on code-touching PRs without a CHANGELOG entry #590check-changelog-requiredguardWhy "no 0-byte file" matters
Several destinations call
os.makedirs(...)or open a file handle as the first I/O step insideload(). If they don't short-circuit on empty input first, you'd see a 0-byte CSV (or worse, a header-only one) appear on disk for what should be a no-op. Engine sync windows with no rows would silently corrupt downstream pipelines that watch for any new file. The currentFileDestination/ParquetDestinationimplementations do short-circuit (if not records: return SyncResult()) — this PR locks that behaviour so it can't regress.What's deferred (Step 2b)
psycopg2/pymysql/clickhouse-connect/snowflake-connector-python), so a single in-memory substitute isn't viable. The path is per-destination connection mocking viaunittest.mock. Own PR.StagedDestinationProtocol (Salesforce Bulk, Amazon Marketing Cloud) — different load shape (stage+finalize) so different contract surface.🤖 Generated with Claude Code