What is your idea? Provide a use case.
Add optional partial indexes … WHERE spent_at IS NULL, mirroring the existing match indexes so GET /matches/{pattern}?unspent is index-backed on non-pruned instances. Opt-in (e.g. an --index-unspent flag), off by default.
Use case: we run a non-pruned Kupo that serves two needs at once:
- Historical input resolution: resolving already-spent UTxOs by tx/slot, which needs the full spent history retained.
- A live "current UTxO set" view:
?unspent queries over specific scripts/credentials.
The second is fast for low-churn patterns but pathological for busy ones — e.g. the unspent UTxOs at a busy script's payment credential, where that credential has ~4,000,000 rows total but only ~10,000 currently unspent.
Why is it a good idea?
It gives prune-like speed for current-UTxO queries while keeping full history — otherwise mutually exclusive (see the alternatives below).
Measured on mainnet (~325 GB, ~350M rows). EXPLAIN QUERY PLAN for WHERE payment_credential = ? AND spent_at IS NULL today:
SEARCH inputs USING INDEX inputsByPaymentCredential (payment_credential=?)
It seeks all ~4M rows at the credential and filters spent_at IS NULL row-by-row → ~52 GiB read, ~170 s (the credential-wildcard form also tends to drop the connection). spent_at IS NULL matches the whole current UTxO set.
With a partial index:
CREATE INDEX inputsByPaymentCredentialUnspent
ON inputs(payment_credential COLLATE NOCASE) WHERE spent_at IS NULL;
(the COLLATE NOCASE has to match the column collation or the planner won't choose it), the same query becomes a direct seek to only the unspent rows:
SEARCH inputs USING INDEX inputsByPaymentCredentialUnspent (payment_credential=?)
→ ~170 s → ~1.3 s for us.
The cost is modest and opt-in:
- Disk proportional to the unspent set, not history: ~1 GB for us (≈11 M unspent rows) vs the 325 GB table.
- Write path: one index insert on UTxO creation, one delete on spend — the same class of churn Kupo already pays maintaining
inputsBySpentAt, just one more index.
- Off by default, so the "non-pruned + heavy ?unspent" niche pays for it and nobody else does.
What is the current alternative and why is it not good enough?
--prune-utxo makes ?unspent trivially fast, but only by discarding the spent set, which our historical-resolution need depends on. The two are mutually exclusive, so it isn't an option for users of both use cases.
- Creating the index out-of-band on the SQLite file (what we do now) works, but it lives outside Kupo's migrations: lost on a resync and liable to collide with a future schema change. Hence asking for native support.
What is your idea? Provide a use case.
Add optional partial indexes
… WHERE spent_at IS NULL, mirroring the existing match indexes soGET /matches/{pattern}?unspentis index-backed on non-pruned instances. Opt-in (e.g. an --index-unspent flag), off by default.Use case: we run a non-pruned Kupo that serves two needs at once:
?unspentqueries over specific scripts/credentials.The second is fast for low-churn patterns but pathological for busy ones — e.g. the unspent UTxOs at a busy script's payment credential, where that credential has ~4,000,000 rows total but only ~10,000 currently unspent.
Why is it a good idea?
It gives prune-like speed for current-UTxO queries while keeping full history — otherwise mutually exclusive (see the alternatives below).
Measured on mainnet (~325 GB, ~350M rows).
EXPLAIN QUERY PLAN for WHERE payment_credential = ? AND spent_at IS NULLtoday:SEARCH inputs USING INDEX inputsByPaymentCredential (payment_credential=?)It seeks all ~4M rows at the credential and filters
spent_at IS NULLrow-by-row → ~52 GiB read, ~170 s (the credential-wildcard form also tends to drop the connection).spent_at IS NULLmatches the whole current UTxO set.With a partial index:
(the
COLLATE NOCASEhas to match the column collation or the planner won't choose it), the same query becomes a direct seek to only the unspent rows:SEARCH inputs USING INDEX inputsByPaymentCredentialUnspent (payment_credential=?)→ ~170 s → ~1.3 s for us.
The cost is modest and opt-in:
inputsBySpentAt, just one more index.What is the current alternative and why is it not good enough?
--prune-utxomakes?unspenttrivially fast, but only by discarding the spent set, which our historical-resolution need depends on. The two are mutually exclusive, so it isn't an option for users of both use cases.