Skip to content
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
24 changes: 21 additions & 3 deletions src/zotero_arxiv_daily/executor.py
Original file line number Diff line number Diff line change
Expand Up @@ -29,9 +29,24 @@ def normalize_include_path_patterns(include_path: list[str] | ListConfig | None)
return list(include_path)


def normalize_zotero_settings(zotero_config: DictConfig) -> tuple[str, str, str]:
user_id = str(zotero_config.get("user_id") or "").strip()
api_key = str(zotero_config.get("api_key") or "").strip()
library_type = str(zotero_config.get("library_type", "user")).strip().lower()

if not user_id or not api_key:
raise ValueError("config.zotero.user_id and config.zotero.api_key must be set.")

if library_type not in {"user", "group"}:
raise ValueError('config.zotero.library_type must be either "user" or "group".')
Comment on lines +35 to +41

Copilot AI Mar 26, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

library_type normalization treats an explicit null/None value as the string "none" (because str(None).lower()), which then fails validation. If users wire this to an env var with a null default (common with Hydra/OmegaConf), the config will error unexpectedly. Consider defaulting falsy values to "user" (e.g., apply or "user" before lowercasing) so missing/empty values behave like the intended default.

Copilot uses AI. Check for mistakes.
Comment on lines +40 to +41

Copilot AI Mar 26, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This adds a new config.zotero.library_type setting, but the repository’s shipped Hydra configs (config/base.yaml / config/custom.yaml) currently don’t mention it, so users won’t discover how to configure group libraries. Please update the default/custom config docs (and/or README) to document the new field and its expected values ("user"|"group").

Copilot uses AI. Check for mistakes.

return user_id, api_key, library_type
Comment on lines +32 to +43

Copilot AI Mar 26, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

normalize_zotero_settings introduces new validation behavior (required user_id/api_key, allowed library_type values) but there are no unit tests covering these cases. Since normalize_include_path_patterns has dedicated tests, adding similar tests for Zotero settings would prevent regressions (e.g., empty env vars, invalid library_type, default behavior).

Copilot uses AI. Check for mistakes.


class Executor:
def __init__(self, config:DictConfig):
self.config = config
self.zotero_user_id, self.zotero_api_key, self.zotero_library_type = normalize_zotero_settings(config.zotero)
self.include_path_patterns = normalize_include_path_patterns(config.zotero.include_path)
self.retrievers = {
source: get_retriever_cls(source)(config) for source in config.executor.source
Expand All @@ -40,10 +55,13 @@ def __init__(self, config:DictConfig):
self.openai_client = OpenAI(api_key=config.llm.api.key, base_url=config.llm.api.base_url)
def fetch_zotero_corpus(self) -> list[CorpusPaper]:
logger.info("Fetching zotero corpus")
zot = zotero.Zotero(self.config.zotero.user_id, 'user', self.config.zotero.api_key)
collections = zot.everything(zot.collections())
zot = zotero.Zotero(self.zotero_user_id, self.zotero_library_type, self.zotero_api_key)
try:
collections = zot.everything(zot.collections())
corpus = zot.everything(zot.items(itemType='conferencePaper || journalArticle || preprint'))
except Exception as e:
raise RuntimeError("Failed to access Zotero library. Please check user_id, api_key, and library_type.") from e

Copilot AI Mar 26, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The except Exception wrapper always raises a RuntimeError suggesting credentials/library type are wrong, but the underlying failure could also be network issues, Zotero downtime, rate limiting, or an unexpected response shape. This can be misleading when debugging. Consider either narrowing the caught exceptions to Zotero/HTTP-related ones, or including the original exception type/message in the raised error so operators can distinguish credential problems from transient failures.

Suggested change
raise RuntimeError("Failed to access Zotero library. Please check user_id, api_key, and library_type.") from e
raise RuntimeError(
"Failed to access Zotero library. This may be due to invalid user_id/api_key/library_type "
f"or a transient Zotero/network issue. Original error: {type(e).__name__}: {e}"
) from e

Copilot uses AI. Check for mistakes.
collections = {c['key']:c for c in collections}
corpus = zot.everything(zot.items(itemType='conferencePaper || journalArticle || preprint'))
corpus = [c for c in corpus if c['data']['abstractNote'] != '']
def get_collection_path(col_key:str) -> str:
if p := collections[col_key]['data']['parentCollection']:
Expand Down
Loading