feat(solr): near-realtime loan availability updater #12689

Open
mekarpeles wants to merge 4 commits into master from 7450/loan-availability-updater

@mekarpeles
Member

Summary

Implements the near-realtime loan availability polling described in #7450. Adds a standalone loan_availability_updater script that polls IA's loan changes API and atomically updates two new Solr fields so the search index reflects borrowing status within ~1 minute.

New Solr fields (on work documents)

| Field | Type | Values |
| --- | --- | --- |
| ebook_availability | string (docValues) | "available" / "unavailable" |
| ebook_becomes_available | pdate | ISO-8601 UTC loan expiry, or null |
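
As a rough illustration of what an atomic update to these two fields could look like, here is a minimal sketch that builds the JSON document for Solr's `set` modifier. The helper name `build_availability_update` is hypothetical, and the use of `key` as the document's unique key is an assumption about the schema; the script in this PR may construct its payloads differently.

```python
from datetime import datetime, timezone
from typing import Optional


def build_availability_update(work_key: str, until: Optional[datetime]) -> dict:
    """Build one Solr atomic-update document for a work (hypothetical helper).

    `until` is the timezone-aware loan expiry; None means the work is
    available again.
    """
    if until is None:
        return {
            "key": work_key,  # assumed unique key field
            "ebook_availability": {"set": "available"},
            # In a Solr atomic update, setting a field to null removes it.
            "ebook_becomes_available": {"set": None},
        }
    # pdate fields expect ISO-8601 in UTC with a trailing "Z".
    expiry = until.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return {
        "key": work_key,
        "ebook_availability": {"set": "unavailable"},
        "ebook_becomes_available": {"set": expiry},
    }
```

A list of such documents would be posted to the Solr update handler as JSON; fields not mentioned in the document are left untouched.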

Outage / re-index recovery

On first run (no state file) or when called with --reset, the script binary-searches the loan changes API for the uid of the event from ~14 days ago (the maximum loan lifetime). Finding that starting uid takes O(log N) API calls; replaying from there reconstructs the full picture of currently-active loans without replaying the entire history.
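
The startup search can be sketched as an ordinary lower-bound binary search. This is not the PR's `find_start_uid()`; it is a simplified illustration that assumes event timestamps never decrease as the uid grows, and it takes a hypothetical `timestamp_of(uid)` callable in place of the real API call.

```python
from datetime import datetime, timedelta, timezone
from typing import Callable


def find_first_uid_at_or_after(
    cutoff: datetime,
    max_uid: int,
    timestamp_of: Callable[[int], datetime],
) -> int:
    """Return the smallest uid in [0, max_uid] whose timestamp is >= cutoff.

    Returns max_uid + 1 if every event is older than the cutoff.
    `timestamp_of` is a hypothetical stand-in for one API lookup per call.
    """
    lo, hi = 0, max_uid + 1
    while lo < hi:
        mid = (lo + hi) // 2
        if timestamp_of(mid) < cutoff:
            lo = mid + 1  # event is too old, search the upper half
        else:
            hi = mid  # event is recent enough, keep it as a candidate
    return lo


# Synthetic example: one event per hour starting at `base`.
base = datetime(2026, 1, 1, tzinfo=timezone.utc)
start = find_first_uid_at_or_after(
    cutoff=base + timedelta(hours=100, minutes=30),
    max_uid=1000,
    timestamp_of=lambda uid: base + timedelta(hours=uid),
)
```

Each loop iteration halves the search range, so the number of `timestamp_of` calls grows with the logarithm of `max_uid`.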

Steady-state operation

  • Fetches up to 1000 events per cycle; processes immediately without sleeping if the batch is full (catching up fast)
  • Sleeps 30 s when caught up
  • Once per cycle: queries Solr for ebook_becomes_available:[* TO NOW] and clears any stale unavailability markers (safety net for missed return/expire events)
  • State file stores a single integer (last processed uid)
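
The bullets above can be sketched with a few small helpers. The names below are illustrative and may not match the script; the batch size, poll interval, and range query are taken from the description, while the error handling for a missing or corrupt state file is an assumption.

```python
from pathlib import Path
from typing import Optional

BATCH_LIMIT = 1000  # events fetched per cycle (from the PR description)
POLL_INTERVAL = 30.0  # seconds to sleep once caught up
EXPIRED_LOANS_QUERY = "ebook_becomes_available:[* TO NOW]"


def read_state(path: Path) -> Optional[int]:
    """Return the last processed uid, or None if there is no usable state."""
    try:
        return int(path.read_text().strip())
    except (FileNotFoundError, ValueError):
        return None


def write_state(path: Path, uid: int) -> None:
    """Persist the last processed uid as a single integer."""
    path.write_text(f"{uid}\n")


def seconds_to_sleep(
    batch_size: int,
    limit: int = BATCH_LIMIT,
    poll_interval: float = POLL_INTERVAL,
) -> float:
    """A full batch means more events are probably waiting, so do not sleep."""
    return 0.0 if batch_size >= limit else poll_interval
```

A `None` result from `read_state` would correspond to the first-run case, where the script falls back to the startup binary search.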

Files changed

  • openlibrary/core/lending.py: get_loan_changes(after_uid, limit, s3_keys), an S3-authenticated GET to services/loans/loan/?action=changes
  • conf/solr/conf/managed-schema.xml: two new fields
  • openlibrary/solr/solr_types.py: type stubs
  • scripts/solr_updater/loan_availability_updater.py: main script
  • scripts/solr_updater/tests/test_loan_availability_updater.py: 20 unit tests (pure functions + mocked Solr/API)

Usage

python scripts/solr_updater/loan_availability_updater.py \
  --ol-config /opt/openlibrary/conf/openlibrary.yml \
  [--state-file loan-availability-update.state] \
  [--poll-interval 30] \
  [--dry-run] \
  [--reset]
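
For readers who want to see how these flags map onto parsed options, here is a minimal `argparse` sketch. It is an assumption that the bracketed values in the usage line are the defaults, and the real script may use a different CLI helper; `build_parser` is a hypothetical name.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags in the usage line above."""
    parser = argparse.ArgumentParser(prog="loan_availability_updater.py")
    parser.add_argument("--ol-config", required=True,
                        help="path to openlibrary.yml")
    parser.add_argument("--state-file", default="loan-availability-update.state",
                        help="file holding the last processed uid")
    parser.add_argument("--poll-interval", type=int, default=30,
                        help="seconds to sleep when caught up")
    parser.add_argument("--dry-run", action="store_true",
                        help="compute updates without writing to Solr")
    parser.add_argument("--reset", action="store_true",
                        help="ignore the state file and redo the startup search")
    return parser


args = build_parser().parse_args(
    ["--ol-config", "/opt/openlibrary/conf/openlibrary.yml", "--reset"]
)
```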

What's left for @benbdeitch / maintainers

  • Wire up ebook_availability / ebook_becomes_available in search result display
  • Add a docker-compose service entry (similar to trending_updater) to keep it running in production
  • Confirm the S3 key config (ia_ol_metadata_write_s3) has access to services/loans/loan/?action=changes, or add a dedicated config key
  • Decide whether ebook_availability should be surfaced as a search facet
  • Regenerate openlibrary/solr/solr_types.py via python openlibrary/solr/types_generator.py after schema is reviewed (file was manually updated here)

Closes #7450

Adds a loan_availability_updater script that polls IA's loan changes API
and atomically updates two new Solr fields on work documents:

- ebook_availability: "available" | "unavailable"
- ebook_becomes_available: ISO-8601 UTC loan expiry timestamp

On startup (or after a full Solr re-index), the script binary-searches
for the uid approximately 14 days ago so that all currently-active loans
are reflected without replaying the entire history. Once caught up it
polls every 30s; a full batch (1000 rows) is processed without sleeping.
Expired loans are evicted each cycle via a Solr range query on
ebook_becomes_available, so missed return/expire events self-heal.

Files changed:
- openlibrary/core/lending.py: add get_loan_changes() S3 GET helper
- conf/solr/conf/managed-schema.xml: add ebook_availability / ebook_becomes_available fields
- openlibrary/solr/solr_types.py: type-stub the two new fields
- scripts/solr_updater/loan_availability_updater.py: main polling script
- scripts/solr_updater/tests/test_loan_availability_updater.py: 20 unit tests

Closes #7450
Copilot AI review requested due to automatic review settings May 9, 2026 04:58
@github-actions github-actions Bot added the Priority: 2 Important, as time permits. [managed] label May 9, 2026
Contributor

Copilot AI left a comment

Pull request overview

Implements near-realtime propagation of IA loan state into the OL Solr index by polling IA's loans/loan/?action=changes endpoint and atomically updating new work-level fields indicating ebook availability and (when applicable) the time the ebook becomes available again.

Changes:

  • Add openlibrary.core.lending.get_loan_changes() wrapper for the IA loan changes API (S3-authenticated).
  • Add two new Solr fields (ebook_availability, ebook_becomes_available) plus Solr type stubs.
  • Introduce loan_availability_updater standalone script (with unit tests) to poll changes, update Solr, and evict expired loans.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| openlibrary/core/lending.py | Adds get_loan_changes() helper for IA loan changes polling with optional S3 auth. |
| conf/solr/conf/managed-schema.xml | Adds Solr schema fields to store near-realtime loan availability state. |
| openlibrary/solr/solr_types.py | Extends Solr document TypedDict with new fields. |
| scripts/solr_updater/loan_availability_updater.py | New polling/processing loop, Solr atomic updates, startup UID search, and eviction safety net. |
| scripts/solr_updater/tests/test_loan_availability_updater.py | Unit tests for state handling, parsing, batch reduction, and binary-search startup logic. |

mekarpeles and others added 3 commits May 8, 2026 23:07
Fixes the python_tests CI failure: solr_types.py was manually edited
with a Literal type and inline comment that the auto-generator doesn't
produce; regenerate to match (string → Optional[str]).

Cleanup pass:
- Trim 35-line module docstring to 8 lines
- Remove section-banner comments throughout
- Replace BINARY_SEARCH_ITERS module constant with inline range(40)
- Remove redundant local type annotations
- Collapse new_uid/last_uid into a single variable computed once
- Use contextlib.suppress (ruff SIM105)
- DRY resolve_work_keys into a dict comprehension
- Trim all function docstrings to one paragraph
- Remove unused _make_loan_changes_response test helper
- Fix test call-count assertion to match inline range(40)
- Add lending.setup(infogami.config) in main() so config globals are populated
- Wrap find_start_uid() API calls in try/except for network resilience
- Quote identifiers in resolve_work_keys() Solr query (ia:("a" "b" "c"))
- Only set ebook_becomes_available for active loans with a parseable until value
- Move write_state() to after successful Solr commit to keep state consistent
- Add resolve_work_keys unit tests (empty input + basic mapping + query quoting)
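
The identifier quoting mentioned in the list above can be illustrated with a short sketch. The helper names are hypothetical and this is not the code from `resolve_work_keys()`; it only shows one way to produce a query of the form `ia:("a" "b" "c")`, escaping backslashes and double quotes inside each phrase. Rejecting an empty list is an assumption made for the example.

```python
from typing import Iterable


def quote_solr_phrase(value: str) -> str:
    """Wrap a value in double quotes, escaping backslashes and quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def build_ia_query(identifiers: Iterable[str]) -> str:
    """Build a Solr query matching any of the given IA identifiers."""
    quoted = [quote_solr_phrase(i) for i in identifiers]
    if not quoted:
        # Assumption: callers skip the Solr lookup when there is nothing to resolve.
        raise ValueError("at least one identifier is required")
    return f"ia:({' '.join(quoted)})"


query = build_ia_query(["a", "b", "c"])
```

Quoting each identifier keeps characters such as colons or hyphens from being interpreted as query syntax.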