fix: strip trailing slash from FilesystemClient.dataset_path and get_table_dir#3867
Open
mattiasthalen wants to merge 5 commits intodlt-hub:develfrom
Open
fix: strip trailing slash from FilesystemClient.dataset_path and get_table_dir#3867mattiasthalen wants to merge 5 commits intodlt-hub:develfrom
mattiasthalen wants to merge 5 commits intodlt-hub:develfrom
Conversation
OneLake (Microsoft Fabric) responds with 403 ClientAuthenticationError when BlobClient.exists targets a blob name ending in /. That kills FilesystemClient.initialize_storage at the very first fs.isdir call on self.dataset_path. Non-OneLake backends silently treat it as False and hit the same latent defect, just non-fatally. Strip the empty segment from the pathlib.join so dataset_path never ends in /. Refs dlt-hub#3866
Black wants the multi-line assert in test_dataset_path_has_no_trailing_separator reformatted into a single-line assert. Apply the formatter's output so `make format-check` passes in CI. Refs dlt-hub#3866
Same OneLake 403 root cause as the previous commit on dataset_path, one level deeper. FilesystemClient.truncate_tables calls fs.exists(table_dir) for each entry from get_table_dirs(...), which on OneLake 403s on every table once dataset_path is already fixed. Drop the trailing pathlib.sep so get_table_dir returns a path shape that BlobClient.exists accepts. Refs dlt-hub#3866
…nt paths Tasks 2 and 3 of this PR (dlt-hub#3867) stripped the trailing separator from `FilesystemClient.dataset_path` and `FilesystemClient.get_table_dir`. The pre-existing `test_trailing_separators` hardcoded the old shape (trailing /) in its parameterized assertions. Flip those seven assertions to the corrected shape. Also drop the stale "ending with separator" phrase from `get_table_dir`'s docstring — same invariant flip, land together. `get_table_prefix` is untouched and still preserves its trailing separator for folder-style layouts; the two assertions on that method stay as-is. Refs dlt-hub#3866
Task 2 of this PR (dlt-hub#3867) stripped the trailing separator from `FilesystemClient.dataset_path`. The pre-existing `test_destination_config_in_name` assertion at line 218 was `endswith(dataset_name + pathlib.sep)`, which encoded the old shape. Replace with `endswith(dataset_name)` and drop the now-unused `pathlib` local variable (and its `type: ignore` comment). Caught by `make test-common-p`, not surfaced by the filesystem test module run in Task 4 because this test lives under `tests/destinations/`. Refs dlt-hub#3866
Contributor
Author
Verification summaryAll local verification complete on
Ready for review. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Description
FilesystemClient.dataset_pathandFilesystemClient.get_table_dirboth return paths ending with a trailing separator. This is benign on most backends (404 fromBlobClient.exists, normalized toFalse), but OneLake (Microsoft Fabric) returns403 ClientAuthenticationErrorfor the same request, which fatally kills the load atinitialize_storageandtruncate_tables.This PR strips the trailing separator from both methods and adds regression tests that assert neither returned path ends with
/.Related Issues
FilesystemClienttrailing slash causes OneLake 403 atinitialize_storageandtruncate_tables#3866Additional Context
fabricdestination (Fabric Warehouse + OneLake staging) #3865 (Fabric notebook user-identity auth). That feature PR will rebase on this bugfix once merged.