fix: fast-fail on_load_session when thread ID is missing or wrong #8543
Open
When the UI sends a legacy session ID (e.g. "20260415_1") instead of a thread UUID, on_load_session previously hung indefinitely because get_thread returned an error that was silently swallowed or never reached the client. This caused a 30-second frontend timeout and a blank conversation view.

Now the handler checks whether the requested ID corresponds to a known session in the sessions table, and returns an immediate, diagnostic error explaining what went wrong:

- "Session X has linked thread Y, but was sent as the session_id instead of the thread UUID": tells the engineer the UI needs to send the thread UUID instead.
- "Session X exists but has no linked thread": the session was not fully created via ACP (thread creation failed or was skipped).
- "Session not found": the ID doesn't match anything.

Also bumps tauri-plugin-dialog to "2" to fix a version mismatch warning with @tauri-apps/plugin-dialog v2.7.

Signed-off-by: morgmart <98432065+morgmart@users.noreply.github.com>
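The fallback described in the commit message above can be sketched roughly as follows. This is a minimal illustration, not the code in server.rs: the `Store`, `SessionRow`, `LoadError`, `diagnose`, `load_session`, and `sample_store` names are hypothetical, an in-memory `HashMap` stands in for the real thread and session lookups, and the thread UUID and title are placeholders. Only the three error messages and the two session IDs come from this PR.

```rust
use std::collections::HashMap;
use std::fmt;

/// Hypothetical stand-in for one row of the sessions table.
pub struct SessionRow {
    /// Linked thread UUID; `None` models a NULL thread_id column.
    pub thread_id: Option<String>,
}

/// Hypothetical in-memory store; the real handler queries a database.
pub struct Store {
    /// Thread UUID -> thread title.
    pub threads: HashMap<String, String>,
    /// Session ID -> session row.
    pub sessions: HashMap<String, SessionRow>,
}

/// The three diagnostic outcomes listed in the commit message.
#[derive(Debug, PartialEq)]
pub enum LoadError {
    WrongIdSent { session_id: String, thread_id: String },
    NoLinkedThread { session_id: String },
    NotFound,
}

impl fmt::Display for LoadError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            LoadError::WrongIdSent { session_id, thread_id } => write!(
                f,
                "Session {session_id} has linked thread {thread_id}, but was sent as the session_id instead of the thread UUID"
            ),
            LoadError::NoLinkedThread { session_id } => {
                write!(f, "Session {session_id} exists but has no linked thread")
            }
            LoadError::NotFound => write!(f, "Session not found"),
        }
    }
}

/// Classify a failed thread lookup by consulting the sessions table.
pub fn diagnose(store: &Store, requested_id: &str) -> LoadError {
    match store.sessions.get(requested_id) {
        Some(SessionRow { thread_id: Some(thread_id) }) => LoadError::WrongIdSent {
            session_id: requested_id.to_string(),
            thread_id: thread_id.clone(),
        },
        Some(SessionRow { thread_id: None }) => LoadError::NoLinkedThread {
            session_id: requested_id.to_string(),
        },
        None => LoadError::NotFound,
    }
}

/// Look up the thread; on a miss, return a diagnostic error right away
/// instead of letting the failure go unreported.
pub fn load_session<'a>(store: &'a Store, requested_id: &str) -> Result<&'a str, LoadError> {
    match store.threads.get(requested_id) {
        Some(title) => Ok(title.as_str()),
        None => Err(diagnose(store, requested_id)),
    }
}

/// Sample data: the thread UUID and title are placeholders, not real values.
pub fn sample_store() -> Store {
    let uuid = "00000000-0000-0000-0000-000000000001".to_string();
    let mut threads = HashMap::new();
    threads.insert(uuid.clone(), "demo thread".to_string());
    let mut sessions = HashMap::new();
    sessions.insert("20260412_8".to_string(), SessionRow { thread_id: Some(uuid) });
    sessions.insert("20260415_1".to_string(), SessionRow { thread_id: None });
    Store { threads, sessions }
}
```

With the sample data, calling `load_session(&sample_store(), "20260415_1")` returns the `NoLinkedThread` variant, and passing "20260412_8" returns `WrongIdSent` carrying the placeholder UUID. Keeping the classification in a separate function means the success path stays a single lookup, and the extra sessions-table query only runs after the thread lookup has already failed.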
Overview
Category: fix
User Impact: Sessions that previously hung for 30 seconds and showed a blank screen now fail immediately with a clear error message.
Problem: When clicking on certain sessions in the Goose2 app, the conversation view hangs for 30 seconds, then shows a blank screen. This happens because the UI sometimes sends a legacy session ID (like 20260415_1) instead of a thread UUID to the load_session RPC. The goose binary's on_load_session handler tries to look up a thread by that ID and fails; the request then either hangs or the error is silently swallowed, so the user never sees what went wrong.

Solution: Replace the thread lookup's silent failure with a diagnostic check. When get_thread fails, we now check whether the requested ID is a known session in the sessions table, and return an immediate error with a specific message explaining the mismatch. This turns a 30-second hang into an instant, actionable error.

Root Cause Investigation
This fix addresses the symptom. The investigation uncovered three deeper issues that this PR does not fix but documents here for the ACP engineer:
1. UI sends date-based session IDs instead of thread UUIDs
The loadSessions() function in chatSessionStore.ts has a fallback path (catch block) that loads sessions from localStorage overlays when acpListSessions() fails. These overlays use date-based session IDs (20260415_1) as their key instead of thread UUIDs. Once loaded this way, subsequent session clicks send the wrong ID to the backend.

Evidence: Timing logs showed session=20260412_8 goose=20260412_8, i.e. the date-based ID being sent for sessions that have valid thread UUIDs in the database.

2. Some ACP sessions have no linked thread
Session 20260415_1 ("Removing Left Nav Divider Lines") was created with session_type=acp and provider_name=claude-acp, but its thread_id column is NULL. Out of 89 ACP sessions in the database, 14 have no thread ID. The thread creation either failed silently or was skipped during session creation.

Database evidence:
3. Session loading RPC takes 18-38 seconds (even for valid sessions)
Even sessions with valid threads take 18-38 seconds to load via the RPC. The message replay itself takes under 2 ms; the time is spent elsewhere in the goose binary (likely provider initialization). The installed binary may differ from the current source (which defers agent setup via spawn_agent_setup).

Timing evidence:
File changes
crates/goose-acp/src/server.rs
Replaced the thread lookup in on_load_session with a match block that provides diagnostic fallback. When get_thread fails, it checks the sessions table to determine if the ID is a legacy session ID, and returns a specific error message explaining whether the session has a linked thread (wrong ID was sent) or has no thread at all (incomplete ACP creation).

ui/goose2/src-tauri/Cargo.toml
Relaxed the tauri-plugin-dialog version constraint from ">=2,<2.7" to "2" to fix a recurring version mismatch warning with @tauri-apps/plugin-dialog v2.7.

Reproduction Steps
Next Steps for the ACP Engineer
- Ensure that acpSessionId (the thread UUID) is preserved through the localStorage fallback path in chatSessionStore.ts
- The create_internal_session flow should be audited for silent failures during thread creation
- on_load_session: the 18-38s RPC time suggests the installed binary may not have the async spawn_agent_setup optimization that exists in the current source