Add ignore_path option to exclude Zotero papers by collection path #219
…lection paths Agent-Logs-Url: https://github.com/TideDra/zotero-arxiv-daily/sessions/6de92dfe-9725-45bb-be70-34e26f62adc8 Co-authored-by: TideDra <92413813+TideDra@users.noreply.github.com>
        return list(include_path)


    def normalize_ignore_path_patterns(ignore_path: list[str] | ListConfig | None) -> list[str] | None:
This method can be merged with normalize_include_path_patterns because their code is almost the same.
Done in f84f10c. Introduced a shared normalize_path_patterns(patterns, config_key) helper; normalize_include_path_patterns and normalize_ignore_path_patterns are now thin one-line wrappers that pass the appropriate config key name for error messages.
…path_patterns helper Agent-Logs-Url: https://github.com/TideDra/zotero-arxiv-daily/sessions/4fa8ac6d-72d2-41e8-95a2-fc6ae60850e5 Co-authored-by: TideDra <92413813+TideDra@users.noreply.github.com>
        return list(patterns)


    def normalize_include_path_patterns(include_path: list[str] | ListConfig | None) -> list[str] | None:
There is no need to define normalize_include_path_patterns and normalize_ignore_path_patterns; normalize_path_patterns alone is enough.
Done in 64f381f. Removed both wrapper functions — all call sites in executor.py and the tests now call normalize_path_patterns directly with the appropriate config_key argument.
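For readers following the refactor, here is a minimal sketch of what a single shared helper along these lines might look like. It is not the code from the PR: only the function name, the `config_key` argument, and the error message text are taken from the conversation and the quoted test. To stay dependency-free, the sketch accepts any non-string sequence as a stand-in for OmegaConf's `ListConfig`, which is an assumption.

```python
from __future__ import annotations

from collections.abc import Sequence


def normalize_path_patterns(patterns, config_key: str) -> list[str] | None:
    """Validate glob patterns read from config.zotero.<config_key> (sketch)."""
    if patterns is None:
        # Option left as null: the caller applies no filtering for this key.
        return None
    # A bare string is itself a Sequence, so it has to be rejected explicitly.
    if isinstance(patterns, str) or not isinstance(patterns, Sequence):
        raise TypeError(
            f"config.zotero.{config_key} must be a list of glob patterns or null"
        )
    # Copy into a plain list so later code never depends on the config type.
    return list(patterns)


include_patterns = normalize_path_patterns(["2026/survey/**"], "include_path")
ignore_patterns = normalize_path_patterns(None, "ignore_path")
```

Passing the key name as an argument is what lets one function produce an error message naming the right option, which is the reason the two wrappers became unnecessary.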
…_path_patterns directly Agent-Logs-Url: https://github.com/TideDra/zotero-arxiv-daily/sessions/da2848b7-2e90-4e06-a3ae-4faf74735134 Co-authored-by: TideDra <92413813+TideDra@users.noreply.github.com>
Pull request overview
Adds a new Zotero config option (ignore_path) to exclude papers by collection path (glob patterns), acting as an inverse/override to include_path.
Changes:
- Extend config with zotero.ignore_path (default null) alongside include_path.
- Generalize path-pattern normalization into normalize_path_patterns(...) and apply ignore_path filtering after include_path (so ignore wins).
- Add tests covering ignore-path normalization, exclusion behavior, and precedence.
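The precedence rule in the list above (include first, then ignore, so ignore wins) can be sketched as follows. This is an illustration under stated assumptions, not the PR's `filter_corpus()`: the `filter_by_paths` and `matches_any` names, the dict-based paper records, and the use of `fnmatch.fnmatchcase` for glob matching are all hypothetical. The sketch also assumes that a paper is excluded if any one of its collection paths matches an ignore pattern, and that an empty `include_path` list matches nothing.

```python
from __future__ import annotations

from fnmatch import fnmatchcase


def matches_any(paths: list[str], patterns: list[str]) -> bool:
    """True if at least one collection path matches at least one glob pattern."""
    return any(fnmatchcase(path, pattern) for path in paths for pattern in patterns)


def filter_by_paths(
    corpus: list[dict],
    include_path: list[str] | None,
    ignore_path: list[str] | None,
) -> list[dict]:
    """Apply inclusion first, then exclusion, so ignore_path takes precedence."""
    kept = corpus
    if include_path is not None:
        kept = [p for p in kept if matches_any(p["paths"], include_path)]
    if ignore_path is not None:
        # Runs on the already-included papers, so a match here always removes them.
        kept = [p for p in kept if not matches_any(p["paths"], ignore_path)]
    return kept


corpus = [
    {"title": "A", "paths": ["2026/survey/llm"]},
    {"title": "B", "paths": ["2026/archive/old"]},
    {"title": "C", "paths": ["2025/misc"]},
]
kept = filter_by_paths(corpus, ["2026/**"], ["2026/archive/**"])
titles = [paper["title"] for paper in kept]
```

Here paper B matches the include pattern but is still dropped because it also matches the ignore pattern, while paper C never passes inclusion.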
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| config/base.yaml | Adds ignore_path: null with inline documentation/example. |
| src/zotero_arxiv_daily/executor.py | Introduces generic pattern normalization and applies ignore_path exclusion logic in filter_corpus(). |
| tests/test_include_path.py | Adds normalization and filtering tests for ignore_path and precedence behavior. |
     def test_normalize_include_path_patterns_rejects_single_string():
         with pytest.raises(TypeError, match="config.zotero.include_path must be a list of glob patterns or null"):
    -        normalize_include_path_patterns("2026/survey/**")
    +        normalize_path_patterns("2026/survey/**", "include_path")


     def test_normalize_include_path_patterns_accepts_list_config():
         include_path = OmegaConf.create(["2026/survey/**", "2026/reading-group/**"])
The test function names still reference normalize_include_path_patterns, but the implementation under test was renamed to the generic normalize_path_patterns(...). Renaming these test functions to match the current API will make failures easier to interpret and avoid confusion while scanning the test output.
Fixed in 768c98b. All five test functions that referenced the old wrapper names have been renamed to reflect the current normalize_path_patterns API (e.g. test_normalize_path_patterns_rejects_single_string_for_include_path, test_normalize_path_patterns_accepts_list_config_for_ignore_path, etc.).
Agent-Logs-Url: https://github.com/TideDra/zotero-arxiv-daily/sessions/05a07462-1470-4b1f-8911-bb5719d1eba3 Co-authored-by: TideDra <92413813+TideDra@users.noreply.github.com>
Pull request overview
Copilot reviewed 3 out of 3 changed files in this pull request and generated no new comments.
Adds ignore_path as the inverse counterpart to include_path: it allows filtering out papers from specified Zotero collection paths using glob patterns.
Changes
- config/base.yaml: New ignore_path: null field alongside include_path
- executor.py: normalize_ignore_path_patterns() for config validation; filter_corpus() applies ignore_path exclusion after include_path inclusion, so ignore_path takes precedence when both are set
- tests/test_include_path.py: Tests for normalization, exclusion logic, precedence over include_path, and empty-list behavior
Example
Papers in 2026/archive/ or 2026/misc/ are excluded even though they match include_path.