Optional partial indexes to make ?unspent matching fast on non-pruned instances

### What is your idea? Provide a use case.

Add optional partial indexes `… WHERE spent_at IS NULL`, mirroring the existing match indexes so `GET /matches/{pattern}?unspent` is index-backed on non-pruned instances. Opt-in (e.g. an --index-unspent flag), off by default.

*Use case:* we run a non-pruned Kupo that serves two needs at once:

- Historical input resolution: resolving already-spent UTxOs by tx/slot, which needs the full spent history retained.
- A live "current UTxO set" view: `?unspent` queries over specific scripts/credentials.

The second is fast for low-churn patterns but pathological for busy ones — e.g. the unspent UTxOs at a busy script's payment credential, where that credential has ~4,000,000 rows total but only ~10,000 currently unspent.

### Why is it a good idea?

It gives prune-like speed for current-UTxO queries while keeping full history — otherwise mutually exclusive (see the alternatives below).

Measured on mainnet (~325 GB, ~350M rows). `EXPLAIN QUERY PLAN for WHERE payment_credential = ? AND spent_at IS NULL` today:

```sql
SEARCH inputs USING INDEX inputsByPaymentCredential (payment_credential=?)
```

It seeks all ~4M rows at the credential and filters `spent_at IS NULL` row-by-row → ~52 GiB read, ~170 s (the credential-wildcard form also tends to drop the connection). `spent_at IS NULL` matches the whole current UTxO set.

With a partial index:

```sql
CREATE INDEX inputsByPaymentCredentialUnspent
  ON inputs(payment_credential COLLATE NOCASE) WHERE spent_at IS NULL;
```

(the `COLLATE NOCASE` has to match the column collation or the planner won't choose it), the same query becomes a direct seek to only the unspent rows:

```sql
SEARCH inputs USING INDEX inputsByPaymentCredentialUnspent (payment_credential=?)
```

→ ~170 s → ~1.3 s for us.

The cost is modest and opt-in:

- Disk proportional to the unspent set, not history: ~1 GB for us (≈11 M unspent rows) vs the 325 GB table.
- Write path: one index insert on UTxO creation, one delete on spend — the same class of churn Kupo already pays maintaining `inputsBySpentAt`, just one more index.
- Off by default, so the "non-pruned + heavy ?unspent" niche pays for it and nobody else does.

### What is the current alternative and why is it not good enough?

- `--prune-utxo` makes `?unspent` trivially fast, but only by discarding the spent set, which our historical-resolution need depends on. The two are mutually exclusive, so it isn't an option for users of both use cases.
- Creating the index out-of-band on the SQLite file (what we do now) works, but it lives outside Kupo's migrations: lost on a resync and liable to collide with a future schema change. Hence asking for native support.



Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Optional partial indexes to make ?unspent matching fast on non-pruned instances #206

What is your idea? Provide a use case.

Why is it a good idea?

What is the current alternative and why is it not good enough?

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Uh oh!

Optional partial indexes to make ?unspent matching fast on non-pruned instances #206

Description

What is your idea? Provide a use case.

Why is it a good idea?

What is the current alternative and why is it not good enough?

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions