feat(solr): near-realtime loan availability updater #12689
Open
mekarpeles wants to merge 4 commits into master
Adds a loan_availability_updater script that polls IA's loan changes API and atomically updates two new Solr fields on work documents:

- ebook_availability: "available" | "unavailable"
- ebook_becomes_available: ISO-8601 UTC loan expiry timestamp

On startup (or after a full Solr re-index), the script binary-searches for the uid approximately 14 days ago so that all currently-active loans are reflected without replaying the entire history. Once caught up it polls every 30s; a full batch (1000 rows) is processed without sleeping. Expired loans are evicted each cycle via a Solr range query on ebook_becomes_available, so missed return/expire events self-heal.

Files changed:

- openlibrary/core/lending.py: add get_loan_changes() S3 GET helper
- conf/solr/conf/managed-schema.xml: add ebook_availability / ebook_becomes_available fields
- openlibrary/solr/solr_types.py: type-stub the two new fields
- scripts/solr_updater/loan_availability_updater.py: main polling script
- scripts/solr_updater/tests/test_loan_availability_updater.py: 20 unit tests

Closes #7450
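The startup search described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the script's actual code: `find_start_uid` here takes a hypothetical `timestamp_at` callback standing in for a loan changes API lookup, `fake_log` is made-up data, and uids are assumed to increase monotonically with time. The 40-iteration cap echoes the `range(40)` mentioned in this PR's cleanup commit.

```python
from datetime import datetime, timedelta, timezone
from typing import Callable, Optional


def find_start_uid(
    timestamp_at: Callable[[int], Optional[datetime]],
    cutoff: datetime,
    max_uid: int,
) -> int:
    """Return the smallest uid whose timestamp is at or after `cutoff`.

    `timestamp_at(uid)` is a hypothetical lookup that returns the change
    timestamp for `uid`, or None when `uid` lies beyond the end of the log.
    """
    lo, hi = 0, max_uid
    for _ in range(40):  # 2**40 steps cover any plausible uid range
        if lo >= hi:
            break
        mid = (lo + hi) // 2
        ts = timestamp_at(mid)
        if ts is None or ts >= cutoff:
            hi = mid  # mid is at/after the cutoff, or past the end of the log
        else:
            lo = mid + 1  # mid is older than the cutoff
    return lo


# Made-up log: uid N was recorded N hours after `base`.
base = datetime(2024, 1, 1, tzinfo=timezone.utc)
fake_log = {uid: base + timedelta(hours=uid) for uid in range(1000)}

start = find_start_uid(fake_log.get, base + timedelta(hours=500), max_uid=1_000_000)
print(start)
```

Each iteration costs one lookup, so the number of API calls grows with the logarithm of the uid range rather than with the length of the loan history.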
Pull request overview
Implements near‑realtime propagation of IA loan state into the OL Solr index by polling IA’s loans/loan/?action=changes endpoint and atomically updating new work-level fields indicating ebook availability and (when applicable) the time the loan becomes available again.
Changes:
- Add `openlibrary.core.lending.get_loan_changes()` wrapper for the IA loan changes API (S3-authenticated).
- Add two new Solr fields (`ebook_availability`, `ebook_becomes_available`) plus Solr type stubs.
- Introduce `loan_availability_updater` standalone script (with unit tests) to poll changes, update Solr, and evict expired loans.
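For readers unfamiliar with Solr atomic updates, the per-work payload might look something like the sketch below. It relies on Solr's standard `{"set": ...}` modifier, where setting a field to null removes it. The helper name `build_atomic_update`, the use of `key` as the unique-key field, and the example work key are assumptions for illustration and are not taken from this PR's code.

```python
import json
from typing import Any, Dict, Optional


def build_atomic_update(
    work_key: str,
    available: bool,
    becomes_available: Optional[str] = None,
) -> Dict[str, Any]:
    """Build one Solr atomic-update document for a work (hypothetical helper)."""
    return {
        # Assumption: the work document's unique key field is called "key".
        "key": work_key,
        "ebook_availability": {"set": "available" if available else "unavailable"},
        # {"set": None} serializes to {"set": null}, which removes the field.
        "ebook_becomes_available": {"set": None if available else becomes_available},
    }


docs = [
    build_atomic_update("/works/OL1W", available=False, becomes_available="2024-01-15T00:00:00Z"),
    build_atomic_update("/works/OL2W", available=True),
]
payload = json.dumps(docs)
print(payload)
```

Because only the two named fields carry modifiers, the rest of each work document is left untouched, which is what makes this cheaper than a full re-index of the work.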
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| `openlibrary/core/lending.py` | Adds `get_loan_changes()` helper for IA loan changes polling with optional S3 auth. |
| `conf/solr/conf/managed-schema.xml` | Adds Solr schema fields to store near‑realtime loan availability state. |
| `openlibrary/solr/solr_types.py` | Extends Solr document TypedDict with new fields. |
| `scripts/solr_updater/loan_availability_updater.py` | New polling/processing loop, Solr atomic updates, startup UID search, and eviction safety net. |
| `scripts/solr_updater/tests/test_loan_availability_updater.py` | Unit tests for state handling, parsing, batch reduction, and binary-search startup logic. |
Fixes the python_tests CI failure: solr_types.py was manually edited with a Literal type and inline comment that the auto-generator doesn't produce; regenerate to match (string → Optional[str]).

Cleanup pass:

- Remove 35-line module docstring → 8 lines
- Remove section-banner comments throughout
- Remove BINARY_SEARCH_ITERS module constant → inline range(40)
- Remove redundant local type annotations
- Collapse new_uid/last_uid into a single variable computed once
- Use contextlib.suppress (ruff SIM105)
- DRY resolve_work_keys into a dict comprehension
- Trim all function docstrings to one paragraph
- Remove unused _make_loan_changes_response test helper
- Fix test call-count assertion to match inline range(40)
for more information, see https://pre-commit.ci
- Add lending.setup(infogami.config) in main() so config globals are populated
- Wrap find_start_uid() API calls in try/except for network resilience
- Quote identifiers in resolve_work_keys() Solr query (ia:("a" "b" "c"))
- Only set ebook_becomes_available for active loans with a parseable until value
- Move write_state() to after successful Solr commit to keep state consistent
- Add resolve_work_keys unit tests (empty input + basic mapping + query quoting)
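Two of the fixes listed above, quoting identifiers in the Solr query and only emitting `ebook_becomes_available` for a parseable `until` value, could look roughly like this. The helper names `build_ia_query` and `parse_until` are hypothetical, and the accepted `until` formats (anything `datetime.fromisoformat` understands, with naive values treated as UTC) are assumptions rather than the documented shape of the IA API.

```python
from datetime import datetime, timezone
from typing import Iterable, Optional


def build_ia_query(identifiers: Iterable[str]) -> str:
    """Build a query such as ia:("a" "b" "c"), escaping backslashes and quotes."""
    quoted = " ".join(
        '"' + ident.replace("\\", "\\\\").replace('"', '\\"') + '"'
        for ident in identifiers
    )
    return f"ia:({quoted})"


def parse_until(value: Optional[str]) -> Optional[str]:
    """Return an ISO-8601 UTC timestamp, or None when `value` cannot be parsed."""
    if not value:
        return None
    try:
        parsed = datetime.fromisoformat(value)
    except ValueError:
        return None
    if parsed.tzinfo is None:
        # Assumption: timestamps without an offset are already in UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


print(build_ia_query(["a", "b", "c"]))
print(parse_until("2024-01-15 12:30:00"))
print(parse_until("not a date"))
```

Returning None for an unparseable value lets the caller skip the field entirely instead of writing a malformed date into Solr.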
Summary
Implements the near-realtime loan availability polling described in #7450. Adds a standalone `loan_availability_updater` script that polls IA's loan changes API and atomically updates two new Solr fields so the search index reflects borrowing status within ~1 minute.

New Solr fields (on work documents)

| Field | Type | Value |
|---|---|---|
| `ebook_availability` | `string` (docValues) | `"available"` / `"unavailable"` |
| `ebook_becomes_available` | `pdate` | ISO-8601 UTC loan expiry timestamp |

Outage / re-index recovery
On first run (no state file) or when called with `--reset`, the script binary-searches the loan changes API for the uid ~14 days ago (the maximum loan lifetime). O(log N) API calls reconstruct the full picture of currently-active loans without replaying the entire history.

Steady-state operation
- […] `ebook_becomes_available:[* TO NOW]` and clears any stale unavailability markers (safety net for missed return/expire events)

Files changed
- `openlibrary/core/lending.py`: `get_loan_changes(after_uid, limit, s3_keys)`, an S3-authenticated GET to `services/loans/loan/?action=changes`
- `conf/solr/conf/managed-schema.xml`: two new fields
- `openlibrary/solr/solr_types.py`: type stubs
- `scripts/solr_updater/loan_availability_updater.py`: main script
- `scripts/solr_updater/tests/test_loan_availability_updater.py`: 20 unit tests (pure functions + mocked Solr/API)

Usage
What's left for @benbdeitch / maintainers
- […] `ebook_availability` / `ebook_becomes_available` in search result display
- […] `docker-compose` service entry (similar to `trending_updater`) to keep it running in production
- […] (`ia_ol_metadata_write_s3`) has access to `services/loans/loan/?action=changes`, or add a dedicated config key
- […] `ebook_availability` should be surfaced as a search facet
- […] `openlibrary/solr/solr_types.py` via `python openlibrary/solr/types_generator.py` after schema is reviewed (file was manually updated here)

Closes #7450
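As a companion to the steady-state description in this summary, one polling cycle might be structured as in the sketch below. The 30-second interval, the 1000-row batch size, and the `ebook_becomes_available:[* TO NOW]` query come from the PR text; `run_cycle`, its callback parameters, and the `{"uid": ...}` row shape are hypothetical stand-ins for the real script's functions and the API's response format.

```python
from typing import Callable, Dict, List, Tuple

BATCH_LIMIT = 1000  # a full batch means more changes are probably waiting
POLL_INTERVAL_S = 30
EVICTION_QUERY = "ebook_becomes_available:[* TO NOW]"

Row = Dict[str, int]  # assumed row shape: at least a numeric "uid"


def run_cycle(
    last_uid: int,
    fetch_changes: Callable[[int, int], List[Row]],
    apply_changes: Callable[[List[Row]], None],
    evict_expired: Callable[[str], None],
) -> Tuple[int, int]:
    """Process one batch and return (new_last_uid, seconds_to_sleep)."""
    rows = fetch_changes(last_uid, BATCH_LIMIT)
    if rows:
        apply_changes(rows)
        last_uid = max(row["uid"] for row in rows)
    # Safety net: clear markers for loans whose expiry time has passed.
    evict_expired(EVICTION_QUERY)
    # Skip the sleep after a full batch so a backlog drains quickly.
    delay = 0 if len(rows) >= BATCH_LIMIT else POLL_INTERVAL_S
    return last_uid, delay


def fake_fetch(after_uid: int, limit: int) -> List[Row]:
    """Stand-in for the API: pretend uids 1..1500 exist."""
    return [{"uid": u} for u in range(after_uid + 1, min(after_uid + 1 + limit, 1501))]


applied: List[Row] = []
evicted: List[str] = []
uid, delay = run_cycle(0, fake_fetch, applied.extend, evicted.append)
print(uid, delay)
```

A caller would sleep for the returned delay and then invoke `run_cycle` again with the returned uid, persisting that uid only after the Solr commit has succeeded.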