Skip to content

fix: skip tool_result messages in save hook exchange count#550

Open
sha2fiddy wants to merge 5 commits intoMemPalace:developfrom
sha2fiddy:fix/549-skip-tool-result-in-save-hook
Open

fix: skip tool_result messages in save hook exchange count#550
sha2fiddy wants to merge 5 commits intoMemPalace:developfrom
sha2fiddy:fix/549-skip-tool-result-in-save-hook

Conversation

@sha2fiddy
Copy link
Copy Markdown
Contributor

Closes #549

Summary

  • Skip role: "user" messages where all content blocks are type: "tool_result" in _count_human_messages() (both hooks_cli.py and the inline Python in mempal_save_hook.sh)
  • Claude Code sends tool results as role: "user", inflating the exchange count ~2.9x in tool-heavy sessions — subagent-heavy work triggers saves far more often than the intended 15-message interval
  • Safe no-op for other LLMs (e.g. OpenAI-compatible) that use role: "tool" for tool results

Related

Test plan

  • Added test_count_skips_tool_results — verifies single and multi-block tool_result messages are skipped
  • Added test_count_mixed_content_not_skipped — verifies messages with both tool_result and text blocks still count
  • All 34 tests pass

@sha2fiddy sha2fiddy marked this pull request as ready for review April 10, 2026 16:24
@web3guru888
Copy link
Copy Markdown

This is a well-diagnosed fix. The 2.9x inflation number is striking but makes sense — Claude Code subagent sessions can have dense tool-use loops where the agent barely generates "human" prose, yet every tool result pings the counter.

The predicate is correct. Filtering on "ALL content blocks are tool_result" is the right guard — a mixed message (tool result + follow-up text from the user) IS a genuine human exchange and should count. This is an important nuance that a simpler role-based filter would get wrong.

Dual-location fix matters. hooks_cli.py and the inline Python in mempal_save_hook.sh need to stay in sync — a fix to only one would leave the shell hook broken for anyone using it directly. Good that both are covered here.

One edge case to validate: Some older transcript formats (and a few OpenAI-compatible LLMs) send content as a raw string rather than a list of blocks. Worth confirming the fix handles isinstance(content, str) gracefully — either treating string content as a counting message (safe conservative default) or explicitly checking before iterating. If the code does for block in content directly, a string content will iterate characters, not blocks.

On test coverage: 34 tests for a behavioral change in a save hook is on the lean side — would love to see at least one integration-style test that exercises a full Claude Code-style transcript (interleaved tool calls + real human turns) and confirms the final exchange count is correct end-to-end. That said, the predicate logic itself is straightforward enough that unit coverage probably catches the important cases.

Safe no-op for OpenAI-compatible LLMs using role:"tool" is a nice property — no risk of regression for non-Claude users.


[MemPalace-AGI integration — production stats at https://milla-jovovich.github.io/mempalace/integrations/mempalace-agi/]

Copy link
Copy Markdown

@web3guru888 web3guru888 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good fix for a real problem. The 2.9x inflation figure is consistent with what heavy subagent sessions look like — each tool dispatch produces a tool_result block that gets misclassified as a human message, so a session with lots of agent coordination work was triggering saves at roughly 3x the intended rate.

The all(b.get('type') == 'tool_result' for b in content) guard is the right predicate. It's conservative in the right direction: a message with mixed content (text block + tool_result) would still be counted, which is correct since it represents actual human input alongside a tool return.

The dual-site fix (both hooks_cli.py and mempal_save_hook.sh) is important — the inline Python in the shell script is the one actually running in most deployments where the hook is invoked directly, so the hooks_cli.py fix alone wouldn't have been sufficient.

Test coverage is solid. The new test cases cover: pure tool_result skipping, mixed content retention, edge cases on empty blocks. That's exactly the right surface to test.

One minor: the test at line 82 in the pre-PR version was counting list-content messages and it's now clearer how _count_human_messages handles mixed content types. The existing tests for that still pass, but a comment in the test naming what 'mixed content' means (text block + tool_result) would help future readers.

Closes #549 cleanly. LGTM.

@sha2fiddy
Copy link
Copy Markdown
Contributor Author

sha2fiddy commented Apr 13, 2026

Yeah this is handled already — isinstance(content, str) is checked first (line ~59 in _count_human_messages):

if isinstance(content, str):
    if "<command-message>" in content:
        continue
    # counts as human message
elif isinstance(content, list):
    if all(b.get("type") == "tool_result" for b in content):
        continue

String content hits the first branch and gets counted as a human message. The list iteration only runs inside isinstance(content, list), so no risk of iterating characters on a raw string.

@igorls igorls added area/hooks Claude Code hook scripts (Stop, PreCompact, SessionStart) bug Something isn't working labels Apr 14, 2026
@sha2fiddy sha2fiddy force-pushed the fix/549-skip-tool-result-in-save-hook branch 3 times, most recently from 287a955 to e52566a Compare April 20, 2026 13:47
@sha2fiddy sha2fiddy force-pushed the fix/549-skip-tool-result-in-save-hook branch from e52566a to 3f2ac0c Compare April 22, 2026 13:21
@sha2fiddy
Copy link
Copy Markdown
Contributor Author

Rebased on develop (9b35d9f). Conflicts from upstream #1021 resolved — took silent-save shape. 1068/1068 tests pass.

@sha2fiddy sha2fiddy force-pushed the fix/549-skip-tool-result-in-save-hook branch from 3f2ac0c to 4ba3829 Compare April 26, 2026 19:33
@Reebz
Copy link
Copy Markdown

Reebz commented Apr 27, 2026

Overall inflation is 600%+ for me, as the problem scales with tool and agent use. This PR greatly helps to fix this.

I tested the default hook behavior ("OLD") vs PR #550 ("NEW") counter logic over my parent-session Claude Code transcripts only (depth 2, 66 sessions across 31 projects). Note that Claude Code's Stop hook only ever passes the parent transcript path and the recursive mempalace mine walk is a separate downstream step the PR doesn't touch.

Testing:

  • Overall inflation: 6.78x (OLD=19,114 vs. NEW=2,818)
  • Per-session: mean 5.85x, median 5.33x
  • 70% of sessions greater than 2x, 52% greater than 5x

I run a multi-agent dispatch workflow, so my tool_result density is at the high end. The 2.9x in the original issue report is plausibly closer to the median user. My number suggests the bug scales with tool-use intensity (and therefore teams of agents using tools). Either way the structural fix in #550 looks right patched locally and the firing rate now lines up with actual conversation cadence.

The documented "hook triggers every 15 human messages" is not working as intended, its catching lots of agent and tool use messages. I'm glad to have found this PR and related issue and I was wrangling with this the past week.

@sha2fiddy sha2fiddy force-pushed the fix/549-skip-tool-result-in-save-hook branch from 4ba3829 to b841a28 Compare April 29, 2026 21:26
…#549)

Claude Code sends tool results as role: "user" messages with content
blocks of type "tool_result". The save hook was counting these as human
messages, inflating the exchange count ~2.9x in tool-heavy sessions and
triggering saves far more often than the intended 15-message interval.

Filter out entries where content is a list of exclusively tool_result
blocks. Safe no-op for LLMs that use role: "tool" for tool results.
@sha2fiddy sha2fiddy force-pushed the fix/549-skip-tool-result-in-save-hook branch from b841a28 to 9f35b71 Compare April 29, 2026 23:02
@Qodo-Free-For-OSS
Copy link
Copy Markdown

Hi, _count_human_messages() will incorrectly skip messages where content is an empty list, because all(...) returns True on empty iterables; this undercounts human exchanges and can delay/disable save triggers.

Severity: action required | Category: correctness

How to fix: Require non-empty content list

Agent prompt to fix - you can give this to your LLM of choice:

Issue description

all(...) is vacuously true for empty iterables, so a user message with content: [] is incorrectly skipped as tool-only, undercounting human exchanges.

Issue Context

This affects both implementations of exchange counting: the Python hook (mempalace/hooks_cli.py) and the inline Python in the shell hook (hooks/mempal_save_hook.sh).

Fix Focus Areas

  • mempalace/hooks_cli.py[118-130]
  • hooks/mempal_save_hook.sh[147-161]

Suggested change

Guard the all(...) check with a non-empty condition, e.g.:

  • if content and all(...): continue
    (or equivalently if len(content) > 0 and all(...): continue).

We noticed a couple of other issues in this PR as well - happy to share if helpful.


Qodo code review - free for open-source.

@sha2fiddy
Copy link
Copy Markdown
Contributor Author

Fixed in ac65254. Both mempalace/hooks_cli.py and the inline Python in hooks/mempal_save_hook.sh now check content directly before falling into the all(...) branch, so content=[] is skipped explicitly instead of by accident.

Regression test: tests/test_hooks_cli.py::test_count_skips_empty_content_list.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/hooks Claude Code hook scripts (Stop, PreCompact, SessionStart) bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

fix: save hook counts tool_result messages as human messages, inflating exchange count

5 participants