Add configurable arXiv cross-list support #195
Pull request overview
Adds a configuration flag to optionally include arXiv cross-listed papers when retrieving papers for subscribed arXiv categories, preserving the existing default behavior.
Changes:
- Add `source.arxiv.include_cross_list` to the base configuration (default `false`)
- Update arXiv RSS filtering logic to optionally include both `new` and `cross` announce types
- Update README `CUSTOM_CONFIG` example and guidance to mention the new option
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| `src/zotero_arxiv_daily/retriever/arxiv_retriever.py` | Adds configurable filtering to include cross-listed (`cross`) arXiv RSS entries when enabled. |
| `config/base.yaml` | Introduces the `include_cross_list` config key with a default of `false`. |
| `README.md` | Documents where/how to set `include_cross_list` in `CUSTOM_CONFIG`. |
    allowed_announce_types = {"new", "cross"} if include_cross_list else {"new"}
    all_paper_ids = [
        i.id.removeprefix("oai:arXiv.org:")
        for i in feed.entries
        if i.get("arxiv_announce_type", "new") in allowed_announce_types
    ]
The new include_cross_list behavior isn’t covered by tests. There is an existing tests/retriever/test_arxiv_retriever.py that asserts only announce_type == "new" entries are included; please add a companion test that sets config.source.arxiv.include_cross_list = True and asserts announce_type == "cross" entries are included as well (the existing RSS fixture already contains cross entries).
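A companion test along these lines might look like the sketch below. It does not use the project's retriever class or its RSS fixture: `filter_paper_ids` and `SAMPLE_ENTRIES` are hypothetical stand-ins that mirror the filtering expression from the diff, and plain dicts stand in for the parsed feed entries. The paper IDs are made up.

```python
def filter_paper_ids(entries, include_cross_list=False):
    """Hypothetical stand-in for the filter in the diff (not the project's API)."""
    allowed_announce_types = {"new", "cross"} if include_cross_list else {"new"}
    return [
        # str.removeprefix requires Python 3.9+
        entry["id"].removeprefix("oai:arXiv.org:")
        for entry in entries
        # Entries without an announce type are treated as "new", as in the diff
        if entry.get("arxiv_announce_type", "new") in allowed_announce_types
    ]


# Made-up entries; plain dicts are an assumption standing in for feed entries
SAMPLE_ENTRIES = [
    {"id": "oai:arXiv.org:0000.00001", "arxiv_announce_type": "new"},
    {"id": "oai:arXiv.org:0000.00002", "arxiv_announce_type": "cross"},
    {"id": "oai:arXiv.org:0000.00003", "arxiv_announce_type": "replace"},
    {"id": "oai:arXiv.org:0000.00004"},  # no announce type present
]


def test_default_excludes_cross_listed():
    # Default behavior: only "new" (or untyped) entries survive
    assert filter_paper_ids(SAMPLE_ENTRIES) == ["0000.00001", "0000.00004"]


def test_flag_includes_cross_listed():
    # With the flag on, "cross" entries are kept; "replace" is still dropped
    ids = filter_paper_ids(SAMPLE_ENTRIES, include_cross_list=True)
    assert ids == ["0000.00001", "0000.00002", "0000.00004"]
```

In the real test, the same two assertions would presumably run against the existing RSS fixture with `config.source.arxiv.include_cross_list` toggled, rather than against hand-built dicts.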
@copilot open a new pull request to apply changes based on this feedback
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Thanks!
Summary
This PR adds a config switch to control whether arXiv cross-listed papers should be included when retrieving papers from subscribed categories.
Previously, only papers with `arxiv_announce_type == "new"` were included. This caused some relevant papers to be missed, even when they appeared in the official arXiv email for a subscribed category.

Changes
- Add `source.arxiv.include_cross_list` to the base config
- Keep the default as `include_cross_list: false`
- Update arXiv retrieval to include both `new` and `cross` entries when the switch is enabled
- Update the README `CUSTOM_CONFIG` to show where this option should be added

Example
Users can enable this in `CUSTOM_CONFIG` by setting `source.arxiv.include_cross_list` to `true`.

Why
Some papers are not submitted primarily to a subscribed category, but are cross-listed there. These papers can still be highly relevant, so making cross-list inclusion configurable helps users choose between: