feat(hooks): opt-in citation cache for source-driven-development#80

Merged
addyosmani merged 2 commits into addyosmani:main from federicobartoli:feat/sdd-cache-hooks on Apr 19, 2026

Conversation

@federicobartoli

Contributor

Summary

Re-opens #74 with a completely different approach after @addyosmani's
feedback. The citation cache now lives in two optional Claude Code
hooks, not inside the skill.

  • hooks/sdd-cache-pre.sh — PreToolUse on WebFetch. If a cached
    entry exists, it issues a HEAD with If-None-Match /
    If-Modified-Since. On 304 Not Modified it blocks the fetch
    (exit 2) and returns the cached content via stderr. Otherwise it
    lets the fetch through.
  • hooks/sdd-cache-post.sh — PostToolUse on WebFetch. Captures the
    response together with the origin's current ETag /
    Last-Modified. Entries without a validator are never stored.

The source-driven-development skill itself is untouched.
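A minimal sketch of the pre hook's decision step, assuming the revalidation status has already been obtained. The function name `pre_hook_outcome` and its arguments are hypothetical and not taken from `hooks/sdd-cache-pre.sh`; only the exit-2-plus-stderr behavior comes from the description above.

```sh
# Decide the pre hook's outcome from the revalidation status.
# $1 = HTTP status of the revalidation request, $2 = cached content.
# (Hypothetical helper; names are illustrative only.)
pre_hook_outcome() {
  if [ "$1" = "304" ]; then
    # Origin confirmed the copy is current: hand the cached text back on
    # stderr and block the WebFetch with exit status 2.
    printf '%s\n' "$2" >&2
    return 2
  fi
  # Any other status lets the fetch go through untouched.
  return 0
}
```

In a real hook the return value would become the script's exit status, for example by ending the script with `pre_hook_outcome "$status" "$cached"; exit $?`.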


I’m considering moving this to draft because I still have a few concerns around correctness and long-term behavior.

  • TTL as safety net
    I understand the 24h TTL is meant to mitigate misbehaving origins, but it can also introduce unnecessary invalidations. In practice, many documentation sites are relatively stable, so entries may be evicted even when the origin would still return 304 Not Modified.

    This means we might lose valid cache hits and fall back to a full fetch despite the content being unchanged, even though freshness is already delegated to the origin via HTTP validators. I’m wondering if relying entirely on validators (and making TTL optional or configurable) would lead to more consistent behavior.

  • Prompt stability as part of the cache key
    Since the cache key depends on (url + normalized_prompt), the hit rate relies on the agent reusing the same prompt across sessions. Given that prompt phrasing is often not stable (even for the same intent), I’m wondering how reliable this is in real workflows, especially over time or across different agents or model versions.

    In my tests (using Claude Haiku), this worked quite well in practice. Once a page had been fetched (e.g. useState), subsequent requests with similar prompts consistently hit the cache, so the agent effectively read from the cached content instead of triggering a new fetch.

    However, I’m not sure how stable this behavior is under prompt drift, or when switching agents, models, or longer-lived sessions.

  • Use of HEAD for revalidation
    This is the part I’m least confident about. Many servers don’t implement HEAD consistently with GET (headers, caching behavior, etc.), which could lead to incorrect 304s or missed invalidations. Maybe a conditional GET would be more robust in practice, even if slightly more expensive, since it guarantees consistency with the actual fetch path.
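To make the HEAD versus conditional GET trade-off concrete, here is a rough sketch of a revalidation request that can use either method. The function name, argument order, and the `CURL` override (used only so the command can be stubbed) are assumptions, not part of the hooks in this PR. The curl flags themselves (`-I`, `-H`, `-o`, `-w '%{http_code}'`) are standard.

```sh
# Print the HTTP status of a conditional request.
# $1 = HEAD or GET, $2 = url, $3 = stored ETag, $4 = stored Last-Modified.
# (Hypothetical helper; CURL is an illustrative override for testing.)
revalidate_status() {
  rv_method=$1 rv_url=$2 rv_etag=$3 rv_modified=$4
  # Discard headers/body and print only the status code.
  set -- -s -o /dev/null -w '%{http_code}'
  # -I turns the request into a HEAD; without it curl sends a GET.
  if [ "$rv_method" = HEAD ]; then set -- "$@" -I; fi
  if [ -n "$rv_etag" ]; then set -- "$@" -H "If-None-Match: $rv_etag"; fi
  if [ -n "$rv_modified" ]; then set -- "$@" -H "If-Modified-Since: $rv_modified"; fi
  ${CURL:-curl} "$@" "$rv_url"
}
```

With `GET`, a 304 carries no body, so the extra cost only shows up when the page has changed and the discarded body is downloaded once more before WebFetch runs.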

That said, since this lives entirely in hooks and doesn’t affect the skill itself, I agree the risk surface is relatively contained and it’s easy to experiment with.

More generally: should HTTP validators be treated as the single source of truth for freshness, or is it acceptable to layer additional heuristics (like TTL) on top?


How this addresses the previous feedback on #74

| Concern | Resolution |
| --- | --- |
| "Cache contradicts SDD's 'verify against current docs' rule" | Every reuse is verified by the origin via HTTP 304. A hit is a just-completed verification, not a memory read. |
| "Skill complexity budget" | The skill is byte-for-byte unchanged. The cache is opt-in infrastructure, not a skill instruction. |
| "Cache invalidation is unsolved" | Invalidation is delegated to the server's ETag / Last-Modified. A hard 24h TTL is the safety net for misbehaving origins. |
| "This belongs at the tool level" | Implemented as Claude Code pre/post-tool-use hooks on WebFetch, the canonical tool-level extension point. |

How it works

Cache key: sha256(url + normalized_prompt) (lowercase + whitespace
collapse). WebFetch output is prompt-dependent, so different prompts
on the same URL are separate entries. Stylistic variants
("Extract the signature" vs "extract the\nsignature") hit the
same key; semantically different prompts still miss.
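The key derivation described above could look roughly like this. It is a sketch under assumptions: the function names are made up, the URL and prompt are concatenated without a separator, and leading or trailing whitespace is not trimmed, any of which may differ from the actual hook.

```sh
# Lowercase the prompt and collapse every whitespace run to a single space.
normalize_prompt() {
  printf '%s' "$1" | tr '[:upper:]' '[:lower:]' | tr -s '[:space:]' ' '
}

# Hex SHA-256 of stdin; falls back to shasum where sha256sum is missing.
sha256_hex() {
  if command -v sha256sum >/dev/null 2>&1; then
    sha256sum | cut -d' ' -f1
  else
    shasum -a 256 | cut -d' ' -f1
  fi
}

# $1 = url, $2 = prompt. Prints sha256(url + normalized_prompt).
cache_key() {
  printf '%s%s' "$1" "$(normalize_prompt "$2")" | sha256_hex
}
```

Because only case and whitespace are normalized, two prompts that ask the same thing in different words still produce different keys.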

Full setup, testing, and debugging docs:
hooks/SDD-CACHE.md.


Opt-in

Nothing happens until users explicitly register the hooks in
.claude/settings.json. The plugin manifest is not modified.
Follows the same opt-in pattern used by simplify-ignore.sh in this
repo.


Test plan

  • Post hook writes one JSON file per (url, prompt) on cache
    miss, stores ETag / Last-Modified.
  • Pre hook returns exit 2 with stderr payload on 304.
  • Pre hook returns exit 0 silently on non-304 (cache miss or
    server change).
  • Prompt normalization: case + whitespace variants hit the same
    cache entry; semantically different prompts miss.
  • Servers without validators are never cached (entries without
    ETag/Last-Modified are skipped on write and removed on next
    revalidation attempt).
  • 24h TTL bypass verified.
  • End-to-end in a real Claude Code session (react.dev):
    WebFetch → cached → same prompt re-fetch → 304 → cached content
    returned to the agent.
  • Reviewer smoke test — follow steps in hooks/SDD-CACHE.md.
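The "servers without validators are never cached" item in the list above boils down to a small gate in the post hook. A sketch, with a hypothetical function name:

```sh
# Succeed only when the origin supplied at least one validator.
# $1 = ETag header value, $2 = Last-Modified header value (either may be empty).
should_store() {
  [ -n "$1" ] || [ -n "$2" ]
}
```

A caller would skip the write when this fails, e.g. `should_store "$etag" "$last_modified" || exit 0`.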

Closes #74 (once this is accepted or rejected).

Adds a pair of optional Claude Code hooks that cache WebFetch output
on disk but revalidate every reuse against the origin. Content is
served only when the server returns 304 Not Modified, so
source-driven-development's "verify against current docs" guarantee
still holds across sessions.

- hooks/sdd-cache-pre.sh: PreToolUse hook. For a cached entry, issues
  a HEAD with If-None-Match / If-Modified-Since. On 304, blocks the
  WebFetch (exit 2) and returns cached content via stderr; otherwise
  allows the fetch through.
- hooks/sdd-cache-post.sh: PostToolUse hook. Captures response plus
  current ETag / Last-Modified. Entries without a validator are
  never stored — without one, the pre hook cannot verify freshness
  and caching would amount to trusting memory.
- Cache key: sha256(url + normalized_prompt). Prompt is lowercased
  and whitespace-collapsed so stylistic variants hit the same entry;
  semantically different prompts still miss.
- Hard 24h TTL as a safety net against misbehaving origins.
- hooks/SDD-CACHE.md: opt-in setup, end-to-end testing, debugging.
- .gitignore: ignore the .claude/sdd-cache/ directory.

Hooks are opt-in: users register them in .claude/settings.json. The
source-driven-development skill itself is unchanged.

@addyosmani addyosmani left a comment

Owner

Awesome! Much nicer shape than #74! :) Moving the caching into hooks keeps the skill itself untouched and makes the whole thing opt-in, which addresses most of what I was hesitant about before. Using HTTP validators (ETag / Last-Modified) is also a clever move, since a 304 response is basically the server telling you "my copy hasn't changed," which is more defensible than trusting a local timestamp.

Your three self-flagged concerns make sense.

The TTL on top of validators does feel a little redundant to me - if the origin says 304, it's fresh by definition, and the TTL mostly just adds a ceiling for servers that misbehave. I'd consider dropping it, or at least making it really long (a week?) so it's truly just a safety net.

The prompt-normalization key - WebFetch runs the response through a model with that prompt, so two agents asking slightly different questions will get different outputs from the same URL, and your lowercase/whitespace normalization won't catch that. Might be worth keying on URL only and accepting that the cache hits on "the same page" rather than "the same question about the page."

HEAD reliability is real but probably fine in practice - most doc sites (MDN, react.dev, caniuse) afaik behave correctly, and a missing validator just means no caching, which is the safe failure mode.

Drop the 24h TTL: HTTP validators are the whole freshness contract.
Key cache on URL alone; prompt-aware keying with normalization gave
false safety (semantic differences slipped through). Prompt is kept
as metadata and surfaced in the hit message so the next agent can
judge whether the earlier reading applies. Reframe docs around
"HTTP resource cache, not prompt cache".
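A sketch of what URL-only keying with the prompt kept as metadata might look like. The function names and the wording of the hit message are hypothetical; the commit only states that the prompt is surfaced in the hit message.

```sh
# Hex SHA-256 of stdin; falls back to shasum where sha256sum is missing.
sha256_hex() {
  if command -v sha256sum >/dev/null 2>&1; then
    sha256sum | cut -d' ' -f1
  else
    shasum -a 256 | cut -d' ' -f1
  fi
}

# The key now depends on the URL alone.
url_key() {
  printf '%s' "$1" | sha256_hex
}

# $1 = url, $2 = prompt recorded when the entry was stored, $3 = cached content.
# The wording below is illustrative, not the hook's actual message.
hit_message() {
  printf 'Cache hit for %s (origin returned 304 Not Modified).\n' "$1"
  printf 'Content was originally extracted with prompt: %s\n' "$2"
  printf '%s\n' "$3"
}
```

Showing the stored prompt lets the next agent decide whether the earlier reading answers its own question or whether it should fetch again.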

While here, fix two latent bugs:
- Replace the unquoted heredoc in the pre-hook with printf. The
  heredoc expanded $vars and backticks inside cached content, so a
  compromised doc page could trigger command substitution on cache
  hit.
- Strip CR before awk paragraph-mode parsing of curl -I -L output
  so blank separators between response blocks on a redirect chain
  are recognised (was silently picking intermediate headers).

Remove dead -v IGNORECASE=1 (gawk-only; tolower() already handles it).
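For reference, a sketch of the CR-stripping fix: reading a header from the last response block of `curl -I -L` output. The function name is hypothetical and the parsing is a reconstruction from the commit message, not the hook's actual code.

```sh
# Print the value of one header from the final response block.
# $1 = header name in lowercase; stdin = raw `curl -I -L` output.
final_header() {
  # Strip CRs first: a "\r\n" separator line is not blank to awk, so
  # paragraph mode (RS = "") would otherwise see a single block.
  tr -d '\r' | awk -v want="$1" '
    BEGIN { RS = "" }
    { last = $0 }   # keep only the most recent (final) block
    END {
      n = split(last, lines, "\n")
      for (k = 1; k <= n; k++) {
        i = index(lines[k], ":")
        # tolower() gives a case-insensitive match in any awk.
        if (i > 0 && tolower(substr(lines[k], 1, i - 1)) == want) {
          v = substr(lines[k], i + 1)
          sub(/^[ \t]+/, "", v)
          print v
        }
      }
    }'
}
```

Usage would be along the lines of `curl -sIL "$url" | final_header etag`.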
@federicobartoli

Contributor Author

Both right, thanks. Dropped the TTL and took prompt out of the key - the normalize trick was giving false safety anyway. Prompt now lives as metadata and shows up in the hit message so the next agent can judge if the earlier reading fits. Docs rewritten around "HTTP resource cache, not prompt cache".

Two things I hit while in there:

  • pre-hook was writing cached content into an unquoted heredoc, so backticks and $vars in code examples would expand on hit (swapped for printf)
  • curl -I -L returns CRLF, which was breaking awk paragraph-mode on redirect chains (tr -d '\r' before awk)
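For the first bullet, a minimal sketch of the printf-based replacement, assuming the cached content is already held in a shell variable (the function name is hypothetical). Passing the content as an argument to a fixed `%s` format keeps backticks and `$` sequences literal.

```sh
# Write cached content to stderr exactly as stored.
# $1 = cached content. The format string is fixed, so nothing inside the
# content is treated as a format directive or as shell syntax.
emit_cached() {
  printf '%s\n' "$1" >&2
}
```

For example, `emit_cached 'run `npm test` with $HOME set'` prints that text unchanged.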

Pushed as 4743df9.

@addyosmani

Owner

Thanks for the updates, @federicobartoli! Just to check, is the PR still considered in draft/WIP or are you at the stage where you'd like a final pass review for merge consideration?

@federicobartoli federicobartoli marked this pull request as ready for review April 18, 2026 21:00
@federicobartoli

Contributor Author

Sorry @addyosmani, my bad, I forgot to flip it out of draft after addressing your comments. Just marked it ready for review. Final pass whenever you have time 🙏

@addyosmani addyosmani left a comment

Owner

LGTM overall after doing another pass. Thanks, @federicobartoli!

@addyosmani addyosmani merged commit 44dac80 into addyosmani:main Apr 19, 2026
2 checks passed