feat: implement multi-process LSP server architecture (#364)
Add src/server/protocol.h with all worker message structs (CompileParams, HoverParams, SemanticTokensParams, etc.) and their RequestTraits/ NotificationTraits specializations. Feature request results use RawValue for transparent JSON forwarding. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add stateless worker that handles one-shot compilation tasks (BuildPCH, BuildPCM, Completion, SignatureHelp, Index) via bincode IPC over stdio. Each task is dispatched to the thread pool via eventide::queue(). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add stateful worker that holds compiled ASTs and handles feature requests (hover, semantic tokens, inlay hints, folding ranges, document symbols, document links). Uses per-document mutex for AST access serialization and LRU-based document eviction. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add WorkerPool that spawns and manages stateless/stateful worker processes. Stateless workers use round-robin dispatch; stateful workers use path_id affinity routing with LRU tracking and least-loaded worker selection. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add MasterServer with LSP lifecycle state machine (initialize, shutdown, exit), ServerPathPool for path interning, DocumentState management, and incremental text sync. Wire up clice.cc mode dispatch for pipe, stateless-worker, and stateful-worker modes. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Manages PCM/PCH compilation with cascade invalidation and cancellation support via eventide's cancellation_source/token. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- schedule_build with 200ms debounce timer per document
- run_build_drain coroutine with generation-checked diagnostics
- Feature requests routed to stateful workers via RawValue passthrough
- Completion/signatureHelp routed to stateless workers
- didChange triggers compile_graph cascade invalidation

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- load_workspace searches for compile_commands.json and scans all files
- Bidirectional include graph (forward/backward) built via scan_fuzzy
- didOpen registers compile units with dependency edges
- didChange rescans files to update include graph
- fill_compile_args helper populates worker params from CDB

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- 3-second idle timer triggers background indexing
- Index queue populated from all CDB entries after workspace load
- Sends IndexParams to stateless workers and receives IndexResult
- TUIndex merge into ProjectIndex left as TODO pending data flow

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- 30s request timeout for all worker IPC requests
- compile_graph.cancel_all() for shutdown cleanup
- Graceful exit: cancel compilations, stop workers, then stop loop
- stderr log collection from workers with [SL-N]/[SF-N] prefixes
- Build drain clears diagnostics on compile failure
- Background indexer skips files with failed dependencies

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- CliceConfig struct with auto-computed defaults from CPU count
- Loads from clice.toml or .clice/config.toml in workspace
- Uses eventide serde::toml for deserialization
- Configurable debounce, idle timeout, worker counts, memory limit
- Creates cache directory on startup

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Python integration tests validate LSP lifecycle (initialize, shutdown, didOpen/didChange, hover, diagnostics) via the existing LSPClient fixture. C++ unit tests validate worker subprocess spawning and BincodePeer IPC round-trips for both stateless (BuildPCH, Index) and stateful (Compile, Hover, DocumentUpdate, CodeAction, GoToDefinition) workers. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add SIGPIPE signal ignore guard so broken pipes from exited worker processes don't kill the test binary
- Make CompileThenHover and DocumentUpdate tests resilient to compile failures (files don't exist on disk, only test IPC round-trip)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ests

- Run pixi run format across all modified source files
- Fix integration test infrastructure: send initialized notification, add rootUri to initialize params, auto-generate compile_commands.json
- Fix URI percent-decoding using eventide::ipc::lsp::URI helper
- Fix worker pool CLI argument format (space-separated instead of =)
- Add pool.start() call in initialized handler to spawn workers
- Split monolithic worker_tests.cpp into stateful_worker_tests.cpp, stateless_worker_tests.cpp, and shared worker_test_helpers.h
- Replace silent test failures with CO_ASSERT_TRUE for strict assertions
- All 206 unit tests and 30 integration tests pass

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Describe the multi-process master/worker architecture, including request routing, build drain, compile graph, configuration, and the IPC protocol between master and worker processes. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
📝 Walkthrough

This PR introduces a multi-process LSP server architecture, replacing the stub implementation with a master server orchestrating stateful and stateless worker pools via IPC. It adds CLI argument parsing, workspace configuration loading, document lifecycle management, debounced compilation, and a complete protocol/namespace migration from language-based types to their eventide IPC/LSP equivalents.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    actor Client as LSP Client
    participant EvtLoop as Event Loop
    participant Master as Master Server
    participant StatefulPool as Stateful<br/>Worker Pool
    participant StatelessPool as Stateless<br/>Worker Pool
    participant StatefulW as Stateful<br/>Worker Process
    participant StatelessW as Stateless<br/>Worker Process
    Client->>Master: textDocument/didOpen
    Master->>Master: Update doc state<br/>(version, text, gen++)
    Master->>EvtLoop: Schedule debounce timer
    Master->>StatefulPool: notify_stateful(DocumentUpdate)
    StatefulPool->>StatefulW: DocumentUpdate notification
    EvtLoop->>Master: Debounce timer fires
    Master->>Master: schedule_build()
    Master->>StatefulPool: send_stateful(Compile)
    StatefulPool->>StatefulW: Compile request
    StatefulW->>StatefulW: Lock doc strand<br/>Compile unit<br/>Mark AST ready
    StatefulW->>Master: CompileResult (diagnostics)
    Master->>Master: publish_diagnostics
    Master->>Client: textDocument/publishDiagnostics
    Client->>Master: textDocument/hover
    Master->>StatefulPool: send_stateful(HoverParams)
    StatefulPool->>StatefulW: HoverParams request
    StatefulW->>StatefulW: with_ast() check<br/>Compute feature
    StatefulW->>Master: RawValue result
    Master->>Client: hover response
    Client->>Master: textDocument/completion
    Master->>StatelessPool: send_stateless(CompletionParams)
    StatelessPool->>StatelessW: CompletionParams request
    StatelessW->>StatelessW: Thread pool:<br/>code_complete()
    StatelessW->>Master: RawValue result
    Master->>Client: completion response
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks: ✅ 2 passed, ❌ 1 failed (1 warning)
Replace all spdlog::info/warn/error/debug calls with LOG_INFO/LOG_WARN/ LOG_ERROR/LOG_DEBUG macros from support/logging.h for lazy evaluation and consistent source location tracking. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Actionable comments posted: 20
Note
Due to the large number of review comments, Critical, Major severity comments were prioritized as inline comments.
🟡 Minor comments (1)
src/server/protocol.h (1)

3-13: ⚠️ Potential issue | 🟡 Minor

Add a direct `<string_view>` include to avoid transitive-include breakage.

`std::string_view` is used extensively from line 165 onward (17 occurrences), but `<string_view>` is not included directly in this header. This can break on stricter toolchains when transitive includes change.

Proposed fix:

```diff
 #include <cstdint>
 #include <optional>
 #include <string>
+#include <string_view>
 #include <unordered_map>
 #include <utility>
 #include <vector>
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/server/protocol.h` around lines 3 - 13, This header uses std::string_view in many places but doesn't include <string_view>, causing fragile transitive-include dependencies; add a direct include for <string_view> to the top include block of the protocol.h header so the type is guaranteed available (ensure the new include sits with the other standard library includes and rebuilds cleanly where std::string_view is used).
🧹 Nitpick comments (7)
.gitignore (2)
70-72: Consider organizing these entries under a category.

The new ignore patterns (`.codex/`, `.claude/`, `openspec/`) are added at the end without a category comment. For better maintainability, consider grouping them under an existing section (e.g., "IDEs & Editors") or adding a new category comment.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.gitignore around lines 70 - 72, Group the new gitignore entries into a clear category comment: move `.codex/`, `.claude/`, and `openspec/` under an appropriate existing section (e.g., "IDEs & Editors" or "Tooling") or add a new category header like `# AI tools / generated` above them; update the ordering so they sit with related patterns and keep the three exact patterns (`.codex/`, `.claude/`, `openspec/`) intact.
70-72: Redundant ignore pattern for `.claude`.

Line 71 adds `.claude/`, but line 62 already contains `.claude` (without trailing slash). In gitignore syntax, `.claude` matches both files and directories named `.claude`, making the `.claude/` entry redundant.

♻️ Suggested cleanup

Option 1: Remove the redundant `.claude/` entry:

```diff
 .codex/
-.claude/
 openspec/
```

Option 2: If you want to keep the trailing-slash convention for clarity, remove `.claude` from line 62 and keep only `.claude/` here:

```diff
 # At line 62:
-.claude
 .clangd
 # ... keep line 71 as-is
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.gitignore around lines 70 - 72, Redundant .gitignore entry: remove the duplicate gitignore pattern by keeping only one form of the `.claude` rule; either delete the `.claude/` entry added on line with `.claude/` or remove the earlier `.claude` entry so you consistently use the trailing-slash convention; update the `.gitignore` to contain only a single `.claude` pattern (either `.claude` or `.claude/`) to avoid duplication.

src/feature/document_links.cpp (1)

10-10: Remove the empty anonymous namespace.

This empty `namespace {}` is dead code left over from removing the local `protocol` alias and `to_range` helper (now provided by `feature/feature.h`).

🧹 Proposed fix

```diff
 #include "feature/feature.h"

 namespace clice::feature {

-namespace {} // namespace
-
 auto document_links(CompilationUnitRef unit, PositionEncoding encoding)
```
Verify each finding against the current code and only fix it if needed. In `@src/feature/document_links.cpp` at line 10, Remove the dead empty anonymous namespace by deleting the lone "namespace {}" declaration in document_links.cpp; this orphaned namespace was leftover after removing the local protocol alias and to_range helper (now in feature/feature.h), so simply remove that line to eliminate the no-op anonymous namespace and ensure there are no other references to it in the file.

tests/conftest.py (1)

66-82: Consider generating the compile_commands.json in a temp directory or checking it into source control.

The fixture writes `compile_commands.json` directly into the source tree (tests/data/hello_world/). This could cause issues with:

- Version control showing untracked/modified files after test runs
- Parallel test execution if multiple processes write simultaneously

Consider either:

- Checking in a static `compile_commands.json` with the test data
- Generating it in a temporary copy of the test directory
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/conftest.py` around lines 66 - 82, The fixture currently writes compile_commands.json directly into the source test data (variables data_dir, hw_dir, main_cpp, cdb_path); change it to avoid mutating the repo by copying the hello_world test folder into a temporary directory (use pytest's tmp_path or tmp_path_factory), update hw_dir to point at the temp copy, then generate cdb_path inside that temp copy; alternatively, replace the runtime generation with a checked-in static compile_commands.json in the test data and remove the runtime write logic.

tests/integration/test_server.py (1)

111-113: Several “feature coverage” tests still pass on stubbed handlers.

These paths await the response but never validate it, so the suite stays green if completion, definition, code action, or inlay hints return `null`/`[]` unconditionally. Use deterministic positions/fixtures and assert at least the expected container or object shape.

Also applies to: 186-188, 197-199, 208-215, 266-276, 285-296, 412-478
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/integration/test_server.py` around lines 111 - 113, Currently tests like the client.hover call accept null responses which lets stubbed handlers pass; update these assertions to use deterministic fixture positions and validate that the returned shape is present (not null) and minimally populated: for client.hover(...) assert result is not None and has a 'contents' field (or equivalent non-empty payload); for client.completion(...) assert result is not None, has an 'items'/container and at least one item with a 'label'; for client.definition(...) assert the response is a non-empty list or location object with a 'uri' and 'range'; for client.code_action(...) assert a non-empty list with items containing a 'title'; for client.inlay_hints(...) assert a list with elements having 'label' and 'position'. Apply these shape checks to the other occurrences noted (lines ~186-188, 197-199, 208-215, 266-276, 285-296, 412-478) by replacing permissive null-acceptance with deterministic position fixtures and the above minimal structural assertions.

src/server/protocol.h (2)

30-36: Prefer fixed-width integer for IPC wire fields.

On line 35, `std::size_t memory_usage` is ABI-width dependent. For serialized IPC payloads, use a fixed-width type (e.g., `std::uint64_t`) to keep schema behavior explicit across targets.

Proposed refactor:

```diff
 struct CompileResult {
     std::string uri;
     int version;
     /// Diagnostics serialized as JSON (RawValue) to avoid bincode/serde annotation conflicts.
     eventide::serde::RawValue diagnostics;
-    std::size_t memory_usage;
+    std::uint64_t memory_usage;
 };
```
Verify each finding against the current code and only fix it if needed. In `@src/server/protocol.h` around lines 30 - 36, The CompileResult struct uses an ABI-dependent std::size_t for the memory_usage IPC field; change the type to a fixed-width unsigned integer (std::uint64_t) and include the <cstdint> header, i.e., update the memory_usage member in struct CompileResult to std::uint64_t and add the appropriate header include, then update any serialization/deserialization, tests, and call sites that read/write CompileResult to use the new fixed-width type (refer to symbol CompileResult and field memory_usage).
20-28: Deduplicate shared compile-context fields across request payloads.

`CompileParams`, `CompletionParams`, and `SignatureHelpParams` repeat the same compile-context members. Extracting a shared struct reduces drift when protocol fields evolve.

Possible direction:

```diff
+struct CompileContext {
+    std::string uri;
+    int version;
+    std::string text;
+    std::string directory;
+    std::vector<std::string> arguments;
+    std::pair<std::string, uint32_t> pch;
+    std::unordered_map<std::string, std::string> pcms;
+};
+
 struct CompileParams {
-    std::string uri;
-    int version;
-    std::string text;
-    std::string directory;
-    std::vector<std::string> arguments;
-    std::pair<std::string, uint32_t> pch;
-    std::unordered_map<std::string, std::string> pcms;
+    CompileContext context;
 };

 struct CompletionParams {
-    std::string uri;
-    int version;
-    std::string text;
-    std::string directory;
-    std::vector<std::string> arguments;
-    std::pair<std::string, uint32_t> pch;
-    std::unordered_map<std::string, std::string> pcms;
+    CompileContext context;
     int line;
     int character;
 };

 struct SignatureHelpParams {
-    std::string uri;
-    int version;
-    std::string text;
-    std::string directory;
-    std::vector<std::string> arguments;
-    std::pair<std::string, uint32_t> pch;
-    std::unordered_map<std::string, std::string> pcms;
+    CompileContext context;
     int line;
     int character;
 };
```

Also applies to: 78-100
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/server/protocol.h` around lines 20 - 28, Several request structs (CompileParams, CompletionParams, SignatureHelpParams) duplicate the same compile-context fields; extract those fields into a new shared struct (e.g., CompileContext) containing uri, version, text, directory, arguments, pch, and pcms, then replace the repeated fields in CompileParams, CompletionParams, and SignatureHelpParams with either a CompileContext member or have those structs inherit from CompileContext; update any code that constructs or accesses the old fields to use the new CompileContext member names (or base class) to keep existing call sites consistent.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@cmake/package.cmake`:
- Around line 41-56: The FetchContent_Declare call for eventide currently uses
the branch name "main" (GIT_TAG main), which makes builds non-reproducible;
update the FetchContent_Declare block for the symbol eventide to pin a stable
release tag or commit SHA instead of "main" (replace GIT_TAG main with the
chosen release tag or commit hash) so dependency resolution is deterministic;
keep the rest of the FetchContent_Declare settings (GIT_REPOSITORY, GIT_SHALLOW)
and leave FetchContent_MakeAvailable(eventide spdlog croaring flatbuffers)
unchanged.
In `@src/clice.cc`:
- Around line 20-41: The CLI flags mode, host, port, stateful_worker_count,
stateless_worker_count, and worker_memory_limit are parsed but not acted on;
update the dispatch logic that reads mode (the branch below the parsed fields)
to handle "socket" by using host and port, and to handle "stateless-worker" and
"stateful-worker" modes by applying stateless_worker_count,
stateful_worker_count and worker_memory_limit when spawning/initializing workers
(ensure defaults are respected when counts/limits are unset). Also remove or
replace the current path that emits "unknown mode" for documented modes and
propagate these parsed values into whichever functions create the pipe/socket
server or worker pool (referencing the code that constructs/starts the pipe
server and worker initialization routines so they accept and use host, port,
stateful_worker_count, stateless_worker_count, and worker_memory_limit).
In `@src/server/compile_graph.cpp`:
- Around line 7-13: In CompileGraph::register_unit you overwrite
unit.dependencies but never remove this unit from the old dependencies'
dependents lists, leaving stale reverse edges and allowing duplicates; before
assigning the new deps on units[path_id] remove path_id from each old
dependency's dependents (e.g. iterate over unit.dependencies and erase/remove
path_id from units[old_dep].dependents), then assign the new dependencies and
when updating units[dep_id].dependents only append path_id if it's not already
present (or deduplicate after insertion) to avoid duplicate entries.
In `@src/server/master_server.cpp`:
- Around line 223-227: Dependency rescans are calling scan_fuzzy/scan_file with
an empty content payload so they read from disk instead of the in-memory buffer;
update the calls (e.g., the scan_fuzzy invocation and the scan_file
implementation) to pass the current in-memory file buffer (the open/edited
document content from the LSP cache or ctx/document object) instead of
{}/hard-coded empty content, and ensure the refreshed dependency set is rebuilt
from that provided content before invalidation so include/compile graph updates
reflect unsaved edits.
- Around line 444-448: WorkerPoolOptions.pool_opts never receives the configured
memory cap so workers use the default; set pool_opts.worker_memory_limit =
config.worker_memory_limit before calling pool.start() so the configured limit
is propagated (update the block that constructs WorkerPoolOptions and calls
pool.start() — symbols: WorkerPoolOptions, pool_opts, worker_memory_limit,
config.worker_memory_limit, pool.start()).
- Around line 530-578: After applying the content changes and incrementing
doc.generation, send a worker::DocumentUpdateParams notification to the owning
stateful worker so it knows the open buffer changed; build the params using
params.text_document.uri, doc.version, doc.text and doc.generation and dispatch
it for path_id (i.e., notify the worker that owns path_id). Place this send
immediately after doc.generation++ (before
reset_idle_timer()/scan_file()/compile_graph.update()/schedule_build()) and
ensure the send is non-blocking or posted to the existing worker notification
mechanism so hover/definition logic will mark the AST dirty.
- Around line 431-450: The handler currently sets lifecycle =
ServerLifecycle::Ready before attempting to start the worker pool, so if
pool.start(pool_opts) fails the server remains marked Ready; change the flow in
the peer.on_notification lambda (the closure that handles
protocol::InitializedParams) to defer promoting lifecycle to Ready until after a
successful pool.start(pool_opts), and if pool.start(...) returns false revert
lifecycle (or set it to an error/NotReady state) and return/handle the failure
path (log the error as already done). Ensure references are to lifecycle,
ServerLifecycle::Ready, the peer.on_notification lambda, and
pool.start(pool_opts) so you adjust the order/logic around promotion to Ready
and failure handling.
- Around line 95-126: The code currently holds a reference auto& doc =
doc_it->second across the suspension point co_await
compile_graph.compile_deps(...), which can be invalidated if documents is
modified; instead, do not retain that reference across the await: either copy
out the primitive fields you need (uri, version, text, generation,
directory/arguments via path_pool.resolve(path_id)) into local variables before
calling compile_graph.compile_deps, or drop the early reference and re-find
documents.find(path_id) immediately after the co_await (as the existing code
already does for the deps-failure branch) and then operate on that fresh
reference; ensure worker::CompileParams is populated from those safe locals or
the re-fetched doc before calling fill_compile_args and sending the compile
request.
- Around line 545-562: The pos_to_offset lambda incorrectly treats
Position.character as a byte offset by returning i + pos.character (mixing
UTF-16 units with UTF-8 bytes); replace this ad-hoc converter with the codepath
used elsewhere that maps UTF-16 code units to UTF-8 byte offsets (e.g., use
PositionMapper or PositionEncoding from feature.h / PositionMapper methods) to
compute start and end offsets from range.start and range.end before calling
doc.text.replace; ensure the mapping accounts for lines and UTF-16 code unit
indices (use the same helper used in other parts of the codebase rather than
manual byte iteration).
In `@src/server/stateful_worker.cpp`:
- Around line 99-145: The code updates document state (doc.version, doc.text,
doc.arguments, doc.pch, doc.pcms) without holding the per-document strand,
allowing races; fix by taking the doc.strand lock before modifying doc or
instead snapshot all needed fields under the strand and use that immutable
snapshot for the background compile. Specifically, in the block around
get_or_create/touch_lru and the compile lambda (and in DocumentUpdate), acquire
co_await doc.strand.lock() first, then either (A) perform mutations to doc there
and enqueue the compile using a snapshot of the new state, or (B) copy params
into local variables (version, text, arguments, pch, pcms) while holding the
strand, release the lock, run compile on the snapshot, and after compile
re-acquire the strand to store doc.unit/has_ast/dirty only if doc.version still
matches the snapshot version; ensure result.version is taken from the snapshot
to avoid reporting a newer revision than the diagnostics produced.
- Around line 389-402: The CodeAction and GoToDefinition request handlers
(peer.on_request lambdas handling CodeActionParams and GoToDefinitionParams)
currently always return et::serde::RawValue{"[]"}, which falsely advertises
codeActionProvider/definitionProvider support; either implement them to consult
the editor/document state and produce real results or stop advertising the
capabilities until implemented. Fix by locating the two handlers (the lambda for
CodeActionParams and the lambda for GoToDefinitionParams) and replace the
hard-coded RawValue return with real logic that looks up the document/position
from your existing document manager/state APIs and builds the proper response
objects (serialized via et::serde) OR remove/disable the corresponding
capability flags in the server capability registration so clients won’t expect
providers that return empty arrays. Ensure the change references the exact
handlers (CodeActionParams, GoToDefinitionParams, peer.on_request) so the
advertised capabilities and handlers stay consistent.
- Around line 66-76: Evicting by immediately erasing DocumentEntry from
documents causes use-after-free for in-flight work; change the eviction to: mark
the entry as evicted (add an evicted flag or state) and remove it from lru and
lru_index and call peer.send_notification(EvictedParams) but do NOT erase the
DocumentEntry object until no in-flight users remain; convert documents to hold
shared_ptr<DocumentEntry> (or add an in-flight counter on DocumentEntry) so
background coroutines/lambdas can keep a safe reference, then defer actual
removal by moving the shared_ptr into a pending-deletes list and only erase when
use_count==1 (or counter==0); apply the same pattern to the other eviction path
referenced around the lines with the second erase call so both
shrink_if_over_limit() and that path avoid immediate erase.
- Around line 54-64: The LRU currently stores non-owning llvm::StringRef values
(lru, lru_index) via touch_lru(uri), which can dangle when the request string
(params.uri) goes out of scope; change the LRU to own its keys so entries remain
valid: replace the list of llvm::StringRef with a list of owned strings (e.g.,
std::string) and update lru_index to map keys to iterators accordingly (or
ensure touch_lru inserts a reference into the stable documents storage instead
of the transient params.uri); update touch_lru, lru, and lru_index usage to
construct/compare using the owned string type so evictions no longer reference
freed memory.
In `@src/server/stateless_worker.cpp`:
- Around line 45-60: Add an output_path string field to the BuildPCHResult and
BuildPCMResult structs and populate it from the local pch_info.path /
pcm_info.path before returning; specifically, after compile(cp, pch_info) (and
compile(cp, pcm_info) in the other block) and before constructing/returning the
result, set result.output_path = pch_info.path (or pcm_info.path) so callers
receive the generated .pch/.pcm file location instead of losing the
temporary-file path when the local info objects go out of scope.
In `@src/server/worker_pool.cpp`:
- Around line 71-107: The current start()/spawn_worker() flow schedules
drain_stderr(...) and w.peer->run() immediately after each successful spawn, so
if a later spawn fails earlier child processes and coroutines are left running;
to fix, change spawn logic to defer scheduling side-effects until all spawns
succeed (e.g., collect Spawn results, transports, peers and only call
loop.schedule(drain_stderr(...)) and loop.schedule(peer->run()) after all
workers are created), or on any spawn error unwind by invoking stop() to cleanly
terminate/close already-created WorkerProcess entries (proc, transport/peer) and
cancel scheduled coroutines; adjust spawn_worker()/start() and WorkerProcess
construction accordingly to ensure no live subprocesses or scheduled coroutines
remain on early-return.
In `@tests/integration/test_file_operation.py`:
- Around line 39-60: Move the diagnostics_received.clear() to before you call
client.did_change/client.did_save so the event is reset before the rebuild is
triggered (avoid lost-wakeup when publishDiagnostics arrives quickly); also
update on_diagnostics to verify the notification belongs to "main.cpp" (inspect
params.uri or diagnostic entries) before calling diagnostics_received.set() so
diagnostics for other files are ignored; adjust references:
diagnostics_received, on_diagnostics, client.did_change, client.did_save,
client.register_notification_handler, and client.get_file.
In `@tests/integration/test_server.py`:
- Around line 85-97: The publishDiagnostics handler currently unblocks on the
first diagnostics for any file; update the on_diagnostics callback to check the
diagnostics URI and only call diagnostics_received.set() and append to
received_diagnostics when params.get("uri") (or params["uri"]) matches the test
file's URI (e.g. the workspace/main.cpp URI used in client.did_open). Apply the
same URI-filtering change to the other two publishDiagnostics handlers
referenced (and to test_diagnostics_received so it only appends params after the
URI check) so the waits only trigger for diagnostics from the file under test.
In `@tests/unit/server/stateless_worker_tests.cpp`:
- Around line 103-105: The test currently only asserts that the RPC round-trip
returned (using EXPECT_TRUE(result.has_value())) but does not verify the worker
operation succeeded; update the assertion after the co_await
w.peer->send_request(params) call to inspect the returned payload (e.g., check
result->success or equivalent success flag, or assert a minimal expected payload
shape such as non-empty features or non-null response) instead of only
has_value(); update the checks in the same way for the other occurrences (around
lines with BuildPCH/BuildPCM/Index tests) so failures in worker logic cause test
failures rather than silent RPC success.
In `@tests/unit/server/worker_test_helpers.h`:
- Around line 47-50: The TempFile constructor currently writes to a
deterministic path (path = "/tmp/clice_test_"+name) which causes collisions when
tests run in parallel; update TempFile (the constructor shown) to create unique
temp file paths (e.g., use std::filesystem::temp_directory_path() plus a unique
suffix composed of process id and a monotonic counter or a secure random string,
or use mkstemp/tmpnam safely) and apply the same change to the other helper that
creates deterministic stderr log files (the other TempFile/stderr-log helper
referenced in the comment) so every test process writes to a unique /tmp
filename.
In `@tests/unit/unit_tests.cc`:
- Around line 23-31: The CLI parse result from
deco::cli::parse<TestOptions>(args) is currently ignored; change the logic to
check parsed.has_value() and if it is false, print the parse error and exit with
code 1 instead of proceeding with an empty filter. Locate the variable parsed
(the result of deco::cli::parse) and mirror the pattern in src/clice.cc: when
parsed.has_value() is false retrieve/format the error, write an explanatory
message to stderr (or via the same logging used elsewhere), and return/exit(1);
otherwise continue to extract parsed->options.test_filter and call
eventide::zest::Runner::instance().run_tests(filter).
---
Minor comments:
In `@src/server/protocol.h`:
- Around line 3-13: This header uses std::string_view in many places but doesn't
include <string_view>, causing fragile transitive-include dependencies; add a
direct include for <string_view> to the top include block of the protocol.h
header so the type is guaranteed available (ensure the new include sits with the
other standard library includes and rebuilds cleanly where std::string_view is
used).
---
Nitpick comments:
In @.gitignore:
- Around line 70-72: Group the new gitignore entries into a clear category
comment: move `.codex/`, `.claude/`, and `openspec/` under an appropriate
existing section (e.g., "IDEs & Editors" or "Tooling") or add a new category
header like `# AI tools / generated` above them; update the ordering so they sit
with related patterns and keep the three exact patterns (`.codex/`, `.claude/`,
`openspec/`) intact.
- Around line 70-72: Redundant .gitignore entry: remove the duplicate gitignore
pattern by keeping only one form of the `.claude` rule; either delete the
`.claude/` entry added on line with `.claude/` or remove the earlier `.claude`
entry so you consistently use the trailing-slash convention—update the
`.gitignore` to contain only a single `.claude` pattern (either `.claude` or
`.claude/`) to avoid duplication.
In `@src/feature/document_links.cpp`:
- Line 10: Remove the dead empty anonymous namespace by deleting the lone
"namespace {}" declaration in document_links.cpp; this orphaned namespace was
leftover after removing the local protocol alias and to_range helper (now in
feature/feature.h), so simply remove that line to eliminate the no-op anonymous
namespace and ensure there are no other references to it in the file.
In `@src/server/protocol.h`:
- Around line 30-36: The CompileResult struct uses an ABI-dependent std::size_t
for the memory_usage IPC field; change the type to a fixed-width unsigned
integer (std::uint64_t) and include the <cstdint> header, i.e., update the
memory_usage member in struct CompileResult to std::uint64_t and add the
appropriate header include, then update any serialization/deserialization,
tests, and call sites that read/write CompileResult to use the new fixed-width
type (refer to symbol CompileResult and field memory_usage).
- Around line 20-28: Several request structs (CompileParams, CompletionParams,
SignatureHelpParams) duplicate the same compile-context fields; extract those
fields into a new shared struct (e.g., CompileContext) containing uri, version,
text, directory, arguments, pch, and pcms, then replace the repeated fields in
CompileParams, CompletionParams, and SignatureHelpParams with either a
CompileContext member or have those structs inherit from CompileContext; update
any code that constructs or accesses the old fields to use the new
CompileContext member names (or base class) to keep existing call sites
consistent.
In `@tests/conftest.py`:
- Around line 66-82: The fixture currently writes compile_commands.json directly
into the source test data (variables data_dir, hw_dir, main_cpp, cdb_path);
change it to avoid mutating the repo by copying the hello_world test folder into
a temporary directory (use pytest's tmp_path or tmp_path_factory), update hw_dir
to point at the temp copy, then generate cdb_path inside that temp copy;
alternatively, replace the runtime generation with a checked-in static
compile_commands.json in the test data and remove the runtime write logic.
In `@tests/integration/test_server.py`:
- Around line 111-113: Currently tests like the client.hover call accept null
responses which lets stubbed handlers pass; update these assertions to use
deterministic fixture positions and validate that the returned shape is present
(not null) and minimally populated: for client.hover(...) assert result is not
None and has a 'contents' field (or equivalent non-empty payload); for
client.completion(...) assert result is not None, has an 'items'/container and
at least one item with a 'label'; for client.definition(...) assert the response
is a non-empty list or location object with a 'uri' and 'range'; for
client.code_action(...) assert a non-empty list with items containing a 'title';
for client.inlay_hints(...) assert a list with elements having 'label' and
'position'. Apply these shape checks to the other occurrences noted (lines
~186-188, 197-199, 208-215, 266-276, 285-296, 412-478) by replacing permissive
null-acceptance with deterministic position fixtures and the above minimal
structural assertions.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: e3749705-13bd-43b2-83ea-f433917d1759
📒 Files selected for processing (50)
- .github/workflows/test-cmake.yml
- .gitignore
- CMakeLists.txt
- cmake/package.cmake
- docs/en/architecture.md
- src/clice.cc
- src/compile/diagnostic.cpp
- src/compile/toolchain.cpp
- src/feature/code_completion.cpp
- src/feature/diagnostics.cpp
- src/feature/document_links.cpp
- src/feature/document_symbols.cpp
- src/feature/feature.h
- src/feature/folding_ranges.cpp
- src/feature/formatting.cpp
- src/feature/hover.cpp
- src/feature/inlay_hints.cpp
- src/feature/semantic_tokens.cpp
- src/feature/signature_help.cpp
- src/semantic/resolver.cpp
- src/server/compile_graph.cpp
- src/server/compile_graph.h
- src/server/config.cpp
- src/server/config.h
- src/server/master_server.cpp
- src/server/master_server.h
- src/server/protocol.h
- src/server/stateful_worker.cpp
- src/server/stateful_worker.h
- src/server/stateless_worker.cpp
- src/server/stateless_worker.h
- src/server/worker_pool.cpp
- src/server/worker_pool.h
- tests/conftest.py
- tests/fixtures/client.py
- tests/integration/test_file_operation.py
- tests/integration/test_server.py
- tests/unit/feature/code_completion_tests.cpp
- tests/unit/feature/document_link_tests.cpp
- tests/unit/feature/document_symbol_tests.cpp
- tests/unit/feature/folding_range_tests.cpp
- tests/unit/feature/hover_tests.cpp
- tests/unit/feature/inlay_hint_tests.cpp
- tests/unit/feature/semantic_tokens_tests.cpp
- tests/unit/feature/signature_help_tests.cpp
- tests/unit/server/stateful_worker_tests.cpp
- tests/unit/server/stateless_worker_tests.cpp
- tests/unit/server/worker_test_helpers.h
- tests/unit/test/tester.h
- tests/unit/unit_tests.cc
💤 Files with no reviewable changes (8)
- src/feature/signature_help.cpp
- src/feature/hover.cpp
- src/feature/code_completion.cpp
- src/feature/formatting.cpp
- src/feature/document_symbols.cpp
- src/feature/folding_ranges.cpp
- src/feature/semantic_tokens.cpp
- src/feature/inlay_hints.cpp
Listen on TCP host:port and accept one client connection. Usage: clice --mode socket --host 127.0.0.1 --port 50051 Also: - Remove #include <iostream>, use std::ostringstream for help output - Replace llvm/Support/FileSystem.h with support/filesystem.h - Update test fixture to spawn server process in socket mode Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace all granular eventide/async/* includes with the single umbrella header eventide/async/async.h. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Guard SIGPIPE, ::open(), ::close() behind #ifndef _WIN32 - Include <fcntl.h> and <unistd.h> only on non-Windows - Use llvm::sys::path::system_temp_directory for cross-platform temp paths Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Actionable comments posted: 6
♻️ Duplicate comments (8)
src/server/stateless_worker.cpp (1)
43-57: ⚠️ Potential issue | 🟠 Major

Return the generated `.pch`/`.pcm` path from these handlers.

After `compile(cp, pch_info)` / `compile(cp, pcm_info)`, the only copy of the temp artifact path lives in those local info objects. Returning just success/error still drops the output location, so downstream dependency builds cannot consume the generated file. Populate the result from `pch_info.path` / `pcm_info.path` before returning.

Also applies to: 77-91
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/server/stateless_worker.cpp` around lines 43 - 57, The handler currently creates a temporary file (tmp), sets cp.output_file, constructs PCHInfo pch_info and calls compile(cp, pch_info) but only returns success/failure; update the return to include the produced artifact path by reading pch_info.path (or pcm_info.path in the analogous block) before returning: after compile(cp, pch_info) and before the successful return when unit.completed() is true, populate the result with the pch_info.path value (and do the same for pcm_info in the other handler) so downstream consumers receive the generated .pch/.pcm path.

src/clice.cc (1)
30-40: ⚠️ Potential issue | 🟠 Major

The worker override flags are still no-ops in server modes.

`--stateful-worker-count`, `--stateless-worker-count`, and `--worker-memory-limit` are parsed, but the `pipe` and `socket` branches construct `MasterServer` without threading any of them through. Users can pass these flags and get silently ignored settings.

Also applies to: 91-148
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/clice.cc` around lines 30 - 40, The three parsed flags (--stateful-worker-count, --stateless-worker-count, --worker-memory-limit) are read into stateful_worker_count, stateless_worker_count, and worker_memory_limit but never applied to the server; update the code paths that construct/configure MasterServer (the "pipe" and "socket" branches and the other similar server-mode branch around the 91-148 region) to pass these values into the MasterServer constructor or call its setters (e.g., setStatefulWorkerCount, setStatelessWorkerCount, setWorkerMemoryLimit) immediately after construction so the parsed overrides are actually honored.

tests/unit/server/worker_test_helpers.h (1)
52-56: ⚠️ Potential issue | 🟠 Major

Use unique temp names in these helpers.
Both the source fixture path and the worker stderr log path are deterministic. Parallel test processes can overwrite each other's files and make failures flaky.
Also applies to: 97-98
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/unit/server/worker_test_helpers.h` around lines 52 - 56, TempFile's constructor currently builds deterministic temp paths (TempFile(const std::string& name, const std::string& content)) which causes collisions in parallel tests; update it to append a unique suffix (e.g., PID, timestamp, or random/UUID/llvm-generated unique string) to "clice_test_" + name (and similarly change the other helper that creates the worker stderr log at the other deterministic location referenced around lines 97-98) so each invocation creates a distinct directory/file name; ensure the uniqueness logic is applied when setting the member 'path' so tests won't overwrite each other's files.

src/server/stateful_worker.cpp (4)
97-110: ⚠️ Potential issue | 🔴 Critical

Serialize document mutations under `doc.strand`.

`Compile` and `DocumentUpdate` overwrite `version`, `text`, and compile inputs before or without taking the per-document lock. That allows one revision to be compiled while another revision is already stored and reported back.

Also applies to: 152-160
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/server/stateful_worker.cpp` around lines 97 - 110, The document fields (doc.version, doc.text, doc.directory, doc.arguments, doc.pch, doc.pcms) are being mutated before acquiring the per-document strand lock; acquire the lock returned by doc.strand.lock() before calling get_or_create/ mutating those fields and before calling touch_lru to serialize updates (i.e., move co_await doc.strand.lock() to just after getting the doc and before any assignments). Apply the same change to the analogous block referenced (the other DocumentUpdate/Compile mutation at the 152-160 area) so all per-document mutations happen while holding the doc.strand lock.
387-399: ⚠️ Potential issue | 🟠 Major

Don't leave these feature handlers hard-wired to `[]`.

These endpoints still never consult document state. If they remain on the advertised capability surface, clients will silently lose code actions and definitions instead of getting real results.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/server/stateful_worker.cpp` around lines 387 - 399, The handlers registered with peer.on_request for worker::CodeActionParams and worker::GoToDefinitionParams are returning a hard-coded empty JSON "[]", which hides real results and ignores document state; replace those stubs so each lambda fetches the target document from the server's document/store (using the existing document manager or APIs available on this or RequestContext), compute and return the real CodeAction or Location/LocationLink results derived from params.textDocument and params.position, and marshal them into the appropriate LSP response type instead of et::serde::RawValue{"[]"}; if you cannot implement them now, remove the advertised capability instead of returning empty arrays so clients don’t silently lose functionality.
64-74: ⚠️ Potential issue | 🔴 Critical

Do not erase `DocumentEntry` while coroutines can still reference it.

Both eviction paths still call `documents.erase(...)` immediately. Compile/feature coroutines capture `doc` by reference and can resume after the erase, which is a use-after-free.

Also applies to: 164-170
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/server/stateful_worker.cpp` around lines 64 - 74, The code currently calls documents.erase(uri) inside shrink_if_over_limit (and the other eviction path) while coroutines may still hold references to the DocumentEntry, causing use-after-free; instead, stop erasing the entry immediately: remove the URI from lru and lru_index, set an "evicted" or "tombstone" flag on the DocumentEntry (or convert the map to hold shared_ptr<DocumentEntry> and rely on shared_ptr refcounts), send the EvictedParams notification via peer.send_notification, and defer the actual documents.erase(...) until you can guarantee no coroutines hold references (e.g., when a DocumentEntry->refcount drops to zero or at a safe cleanup point). Update shrink_if_over_limit and the other eviction path to follow this pattern and reference DocumentEntry, documents.erase, lru, lru_index, and peer.send_notification when applying the fix.
52-62: ⚠️ Potential issue | 🔴 Critical

Make the LRU own its URIs.

`touch_lru(params.uri)` still pushes a non-owning `StringRef` backed by request storage. After the handler returns, those entries can dangle and the next eviction walks freed memory.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/server/stateful_worker.cpp` around lines 52 - 62, The LRU currently stores non-owning llvm::StringRef entries which can dangle; change the LRU to own URIs by making lru a std::list<std::string> and lru_index an llvm::StringMap<std::list<std::string>::iterator>, then update touch_lru to copy the URI (e.g. lru.push_front(uri.str())) before storing the iterator in lru_index and continue erasing via the stored iterator; adjust the find/erase/update logic in touch_lru to use the new iterator/value types so map keys remain owned by the StringMap and the list nodes own the string data.

src/server/worker_pool.cpp (1)
71-107: ⚠️ Potential issue | 🟠 Major

Unwind earlier workers if a later spawn fails.

`spawn_worker()` schedules `drain_stderr()` and `peer->run()` immediately. When a later spawn fails, `start()` returns `false` but the already-started workers keep running for the rest of the process lifetime.

Also applies to: 110-120
🧹 Nitpick comments (7)
docs/en/architecture.md (1)
7-17: Add a language hint to this fenced block.

Using `` ```text `` (or `plain`) keeps the ASCII diagram readable and clears MD040.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/en/architecture.md` around lines 7 - 17, The fenced ASCII diagram block in the docs currently has no language hint; update the code fence that wraps the diagram (the triple-backtick that begins the block containing the LSP Client / Master Server / Workers ASCII art) to use a plain/text language tag (e.g., change ``` to ```text or ```plain) so the diagram remains readable and suppresses MD040 warnings.

src/server/master_server.cpp (5)
809-827: Reference to `doc` may be invalidated across `co_await`.

On line 813, `doc` is bound to `doc_it->second`, but then `wp.text = doc.text` on line 818 accesses this reference after the sync section ends. While there's no `co_await` between lines 813-822, the subsequent `co_await pool.send_stateless` on line 823 means this particular code is safe. However, consider copying required fields into the `wp` struct immediately to avoid future refactoring risks.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/server/master_server.cpp` around lines 809 - 827, The code binds doc = doc_it->second then uses doc when populating wp and later does a co_await; to avoid future dangling-reference bugs, eagerly copy any needed data out of doc into local variables or directly into worker::CompletionParams wp (e.g., set wp.version, wp.text, wp.uri, wp.line, wp.character) before the co_await and do not keep references to doc across the await; ensure fill_compile_args(path, wp.directory, wp.arguments) writes into wp members (or use local copies for directory/arguments) so pool.send_stateless<worker::CompletionParams>(wp) does not rely on doc living past the await.
286-361: Background indexer lacks cancellation awareness during idle wait.

If `shutdown` is requested while `co_await idle_timer->wait()` is suspended on line 300, the coroutine won't exit until the timer fires. Consider using cancellation tokens or checking lifecycle after the wait.

♻️ Suggested improvement

```diff
     co_await idle_timer->wait();
+    if(lifecycle == ServerLifecycle::ShuttingDown || lifecycle == ServerLifecycle::Exited)
+        co_return;
     if(index_queue.empty())
         continue;
```
597-604: `didSave` handler is a no-op.

The handler only logs and doesn't trigger dependent file rebuilds as indicated by the TODO. For files that have dependents in the include graph, saving might warrant invalidation.
Would you like me to help implement the dependent rebuild logic, or should this remain a deferred TODO?
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/server/master_server.cpp` around lines 597 - 604, The didSave handler currently only logs and must invalidate dependents and schedule rebuilds: inside the peer.on_notification lambda that handles protocol::DidSaveTextDocumentParams (and after the existing lifecycle check and log), call the code that looks up the saved file's node in the project's include graph (e.g. whatever module manages the include graph/file index — functions named like invalidateDependents, invalidateFile, or scheduleRebuild in your codebase), mark the saved file as up-to-date, recursively invalidate all dependent translation units/header consumers, and enqueue those dependents for rebuild or reanalysis (ensuring thread-safety via the project's build queue or mutex). Also clear or update diagnostics for the saved file if you have an API (e.g. publishDiagnostics/clearDiagnostics) so language server state stays consistent. Ensure you reuse existing helpers (invalidateDependents, scheduleRebuild, fileIndex/includeGraph) rather than duplicating logic.
469-480: Exit handler schedules async cleanup without waiting for completion.

The `exit` notification handler schedules a coroutine for cleanup but returns immediately. If the event loop is stopped prematurely, `pool.stop()` may not complete cleanly.

♻️ Consider synchronous cleanup or explicit ordering

Since `exit` is a notification (no response expected), the current approach should work if the event loop drains pending tasks before exiting. However, verify that `loop.stop()` waits for scheduled coroutines or consider making the cleanup synchronous.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/server/master_server.cpp` around lines 469 - 480, The exit notification currently sets lifecycle and schedules an async cleanup via loop.schedule(...) that calls compile_graph.cancel_all(), co_await pool.stop(), then loop.stop(), but returns immediately so loop may stop before pool.stop() completes; change the handler to ensure cleanup completes before stopping the loop: either perform synchronous/locking cleanup directly in the handler (call compile_graph.cancel_all(); pool.stop() synchronously or via wait on its future) and then call loop.stop(), or capture the scheduled task and wait for its completion before calling loop.stop(); modify the code around peer.on_notification(...) and the scheduled lambda that calls compile_graph.cancel_all() and pool.stop() so that pool.stop() is guaranteed to finish before loop.stop() is invoked (use explicit await/wait on the et::task returned by loop.schedule or perform pool.stop() on the current thread).
363-372: `ensure_compiled` is a stub that always returns true.

The function is documented as needing to schedule a build and wait, but the implementation always returns true immediately. This means feature requests may be served with stale ASTs.
Would you like me to help implement the proper wait logic, or should a TODO comment be added to track this?
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/server/master_server.cpp` around lines 363 - 372, The current MasterServer::ensure_compiled stub always returns true and must instead schedule a compile and wait for completion: locate MasterServer::ensure_compiled (params path_id, uri) and, after verifying documents contains path_id, enqueue or send a compile request for that document (e.g., via your existing build/worker queue or a new schedule_compile/send_compile_request helper), then wait asynchronously for the document's compile-complete signal (use an existing per-document compiled flag/version or add a promise/condition_variable or et::task that is fulfilled when compilation finishes) with a sensible timeout; on successful compile return true, on timeout or failure return false and log the error. Ensure you update or set the document's compilation state (e.g., documents[path_id].compiled_version or a documents[path_id].compile_promise) when workers report completion so the awaiting ensure_compiled can resume.

src/server/compile_graph.cpp (1)
113-134: Redundant map lookups in synchronous code paths.

Lines 113, 119, 122, and 129 perform repeated `units.find(path_id)` and `units.find(dep_id)` lookups within the same synchronous section before any `co_await`. These could be consolidated.

♻️ Suggested consolidation

```diff
-    for(auto dep_id: units.find(path_id)->second.dependencies) {
-        auto dep_it = units.find(dep_id);
+    auto& my_deps = units.find(path_id)->second.dependencies;
+    for(auto dep_id: my_deps) {
+        auto dep_it = units.find(dep_id);
         if(dep_it == units.end())
             continue;
         if(dep_it->second.compiling && dep_it->second.completion) {
-            auto wait_result = co_await et::with_token(dep_it->second.completion->wait(),
-                                                       units.find(path_id)->second.source->token())
+            auto& my_unit = units.find(path_id)->second;
+            auto wait_result = co_await et::with_token(dep_it->second.completion->wait(),
+                                                       my_unit.source->token())
                                    .catch_cancel();
```
Verify each finding against the current code and only fix it if needed. In `@src/server/compile_graph.cpp` around lines 113 - 134, Store and reuse map iterators instead of calling units.find(path_id) and units.find(dep_id) repeatedly: at loop start capture auto cur_it = units.find(path_id) and auto dep_it = units.find(dep_id); use cur_it->second (or a reference cur = cur_it->second) wherever path unit is needed, and use dep_it while handling the synchronous branch before co_await; before awaiting you may still need to re-check presence after the co_await, so re-query into a clearly named dep_it_after (or reuse dep_it but reassign it) to test existence and dirty state. Replace the repeated units.find(...) calls on lines referenced (for path_id and dep_id) with these saved iterators/refs and only re-find dep_id after the co_await to handle possible concurrent removals.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/server/master_server.cpp`:
- Around line 580-595: When handling textDocument/didClose in the
peer.on_notification lambda you must also remove the unit from the compile graph
to avoid stale entries; add a call to CompileGraph::unregister_unit(path_id) (or
compile_graph.unregister_unit(path_id)) alongside erasing documents and
debounce_timers and before/after clear_diagnostics. If CompileGraph lacks
unregister_unit, implement a method that undoes what register_unit did (removes
the unit node and any associated edges/state) and expose a safe API (e.g.,
unregister_unit(UnitId or PathId)) so the master_server.cpp closure can call it
when a document is closed.
In `@src/server/stateful_worker.cpp`:
- Around line 275-276: Replace the hard-coded full-file range when calling
feature::inlay_hints; convert the client's requested range (the inlay-hint
request parameter—e.g., the incoming request/params range variable) into a
LocalSourceRange and pass that to feature::inlay_hints(doc.unit, range) instead
of LocalSourceRange{0, static_cast<uint32_t>(doc.text.size())}; ensure the
conversion clamps to document bounds and still uses doc.unit and doc.text for
sizing.
In `@src/server/stateless_worker.cpp`:
- Around line 115-126: The current conversion of params.line/params.character to
byte offset can leave offset at 0 for trailing/after-last-line requests and
doesn't clamp character to the actual line length; update the logic that
computes offset (used when assigning cp.completion) to track the start index of
each line (or last line start if loop finishes without a match), compute the
line length up to the next '\n' or end-of-text, clamp params.character to that
line length, then set offset = lineStart + clampedCharacter (or text.size() if
the request is past EOF) so cp.completion = {params.uri, offset} always gets a
valid byte offset; apply the same fix at the other occurrence noted (the block
around the later lines referenced).
In `@src/server/worker_pool.cpp`:
- Around line 168-197: The code calls pick_least_loaded() which reads
stateful_workers[0] and can OOB when there are zero stateful workers; update
assign_worker to check whether stateful_workers.empty() (or a stateful_count ==
0 guard) before calling pick_least_loaded and reject the request (e.g., return
an error code, throw, or propagate a failure) when no stateful workers exist;
additionally harden pick_least_loaded to assert or throw if
stateful_workers.empty() to prevent silent OOB accesses. Ensure you adjust any
callers of assign_worker to handle the new error/exception path.
In `@tests/fixtures/transport.py`:
- Around line 47-75: The start() method can exit early from the connection wait
loop and leave self.process running, so update start() to ensure the spawned
subprocess is always torn down on failure: around the connection loop in
start(), catch failures (the ConnectionRefusedError/OSError branch and the final
timeout branch) and in those code paths call a teardown that kills and awaits
the subprocess (e.g., await self.stop() or explicitly call self.process.kill();
await self.process.wait() if self.process is not None) before re-raising the
RuntimeError; alternatively schedule _monitor_process immediately after creating
self.process or add a finally block that ensures stop() runs if the loop fails.
Ensure references to self.process, start(), stop(), and _monitor_process are
used so the correct methods are invoked.
In `@tests/unit/server/worker_test_helpers.h`:
- Around line 95-116: If et::process::spawn(opts, loop) fails and a debug stderr
file was opened, close the file descriptor to avoid leaking descriptors: after
calling auto result = et::process::spawn(opts, loop); check if (!result) and if
stderr_fd >= 0 then ::close(stderr_fd) (and optionally set stderr_fd = -1)
before returning false; this change touches the stderr_fd variable and the spawn
failure path where opts.streams is constructed.
---
Duplicate comments:
In `@src/clice.cc`:
- Around line 30-40: The three parsed flags (--stateful-worker-count,
--stateless-worker-count, --worker-memory-limit) are read into
stateful_worker_count, stateless_worker_count, and worker_memory_limit but never
applied to the server; update the code paths that construct/configure
MasterServer (the "pipe" and "socket" branches and the other similar server-mode
branch around the 91-148 region) to pass these values into the MasterServer
constructor or call its setters (e.g., setStatefulWorkerCount,
setStatelessWorkerCount, setWorkerMemoryLimit) immediately after construction so
the parsed overrides are actually honored.
In `@src/server/stateful_worker.cpp`:
- Around line 97-110: The document fields (doc.version, doc.text, doc.directory,
doc.arguments, doc.pch, doc.pcms) are being mutated before acquiring the
per-document strand lock; acquire the lock returned by doc.strand.lock() before
calling get_or_create/ mutating those fields and before calling touch_lru to
serialize updates (i.e., move co_await doc.strand.lock() to just after getting
the doc and before any assignments). Apply the same change to the analogous
block referenced (the other DocumentUpdate/Compile mutation at the 152-160 area)
so all per-document mutations happen while holding the doc.strand lock.
- Around line 387-399: The handlers registered with peer.on_request for
worker::CodeActionParams and worker::GoToDefinitionParams are returning a
hard-coded empty JSON "[]", which hides real results and ignores document state;
replace those stubs so each lambda fetches the target document from the server's
document/store (using the existing document manager or APIs available on this or
RequestContext), compute and return the real CodeAction or Location/LocationLink
results derived from params.textDocument and params.position, and marshal them
into the appropriate LSP response type instead of et::serde::RawValue{"[]"}; if
you cannot implement them now, remove the advertised capability instead of
returning empty arrays so clients don’t silently lose functionality.
- Around line 64-74: The code currently calls documents.erase(uri) inside
shrink_if_over_limit (and the other eviction path) while coroutines may still
hold references to the DocumentEntry, causing use-after-free; instead, stop
erasing the entry immediately: remove the URI from lru and lru_index, set an
"evicted" or "tombstone" flag on the DocumentEntry (or convert the map to hold
shared_ptr<DocumentEntry> and rely on shared_ptr refcounts), send the
EvictedParams notification via peer.send_notification, and defer the actual
documents.erase(...) until you can guarantee no coroutines hold references
(e.g., when a DocumentEntry->refcount drops to zero or at a safe cleanup point).
Update shrink_if_over_limit and the other eviction path to follow this pattern
and reference DocumentEntry, documents.erase, lru, lru_index, and
peer.send_notification when applying the fix.
- Around line 52-62: The LRU currently stores non-owning llvm::StringRef entries
which can dangle; change the LRU to own URIs by making lru a
std::list<std::string> and lru_index an
llvm::StringMap<std::list<std::string>::iterator>, then update touch_lru to copy
the URI (e.g. lru.push_front(uri.str())) before storing the iterator in
lru_index and continue erasing via the stored iterator; adjust the
find/erase/update logic in touch_lru to use the new iterator/value types so map
keys remain owned by the StringMap and the list nodes own the string data.
In `@src/server/stateless_worker.cpp`:
- Around line 43-57: The handler currently creates a temporary file (tmp), sets
cp.output_file, constructs PCHInfo pch_info and calls compile(cp, pch_info) but
only returns success/failure; update the return to include the produced artifact
path by reading pch_info.path (or pcm_info.path in the analogous block) before
returning: after compile(cp, pch_info) and before the successful return when
unit.completed() is true, populate the result with the pch_info.path value (and
do the same for pcm_info in the other handler) so downstream consumers receive
the generated .pch/.pcm path.
In `@tests/unit/server/worker_test_helpers.h`:
- Around line 52-56: TempFile's constructor currently builds deterministic temp
paths (TempFile(const std::string& name, const std::string& content)) which
causes collisions in parallel tests; update it to append a unique suffix (e.g.,
PID, timestamp, or random/UUID/llvm-generated unique string) to "clice_test_" +
name (and similarly change the other helper that creates the worker stderr log
at the other deterministic location referenced around lines 97-98) so each
invocation creates a distinct directory/file name; ensure the uniqueness logic
is applied when setting the member 'path' so tests won't overwrite each other's
files.
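A standard-library-only sketch of the unique-suffix approach (in the real helper, LLVM's `createUniqueFile`/`createUniqueDirectory` would be the cleaner choice, and mixing in the PID would also guard against collisions across processes; `make_unique_tmp_name` is a hypothetical name):

```cpp
#include <atomic>
#include <cstdint>
#include <filesystem>
#include <string>

// A process-wide counter makes every generated path distinct within a
// run, so parallel tests in one process never reuse the same file.
inline std::string make_unique_tmp_name(const std::string& prefix) {
    static std::atomic<std::uint64_t> seq{0};
    auto dir = std::filesystem::temp_directory_path();
    auto name = prefix + "_" + std::to_string(seq.fetch_add(1));
    return (dir / name).string();
}
```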
---
Nitpick comments:
In `@docs/en/architecture.md`:
- Around line 7-17: The fenced ASCII diagram block in the docs currently has no
language hint; update the code fence that wraps the diagram (the triple-backtick
that begins the block containing the LSP Client / Master Server / Workers ASCII
art) to use a plain/text language tag (e.g., change ``` to ```text or ```plain)
so the diagram remains readable and suppresses MD040 warnings.
In `@src/server/compile_graph.cpp`:
- Around line 113-134: Store and reuse map iterators instead of calling
units.find(path_id) and units.find(dep_id) repeatedly: at loop start capture
auto cur_it = units.find(path_id) and auto dep_it = units.find(dep_id); use
cur_it->second (or a reference cur = cur_it->second) wherever path unit is
needed, and use dep_it while handling the synchronous branch before co_await;
before awaiting you may still need to re-check presence after the co_await, so
re-query into a clearly named dep_it_after (or reuse dep_it but reassign it) to
test existence and dirty state. Replace the repeated units.find(...) calls on
lines referenced (for path_id and dep_id) with these saved iterators/refs and
only re-find dep_id after the co_await to handle possible concurrent removals.
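The iterator-caching pattern can be shown in isolation (`Unit` and `build_if_dirty` are made-up stand-ins; the caveat about re-finding after a `co_await` still applies in the real coroutine, since the map may be mutated while suspended):

```cpp
#include <string>
#include <unordered_map>

struct Unit {
    bool dirty = true;
    int builds = 0;
};

// One find(), iterator reused for every subsequent access; returns the
// build count, or -1 when the unit does not exist.
int build_if_dirty(std::unordered_map<int, Unit>& units, int path_id) {
    auto it = units.find(path_id);  // single lookup, reused below
    if (it == units.end())
        return -1;
    Unit& unit = it->second;        // reference avoids repeated it->second
    if (unit.dirty) {
        ++unit.builds;
        unit.dirty = false;
    }
    return unit.builds;
}
```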
In `@src/server/master_server.cpp`:
- Around line 809-827: The code binds doc = doc_it->second then uses doc when
populating wp and later does a co_await; to avoid future dangling-reference
bugs, eagerly copy any needed data out of doc into local variables or directly
into worker::CompletionParams wp (e.g., set wp.version, wp.text, wp.uri,
wp.line, wp.character) before the co_await and do not keep references to doc
across the await; ensure fill_compile_args(path, wp.directory, wp.arguments)
writes into wp members (or use local copies for directory/arguments) so
pool.send_stateless<worker::CompletionParams>(wp) does not rely on doc living
past the await.
- Around line 597-604: The didSave handler currently only logs and must
invalidate dependents and schedule rebuilds: inside the peer.on_notification
lambda that handles protocol::DidSaveTextDocumentParams (and after the existing
lifecycle check and log), call the code that looks up the saved file's node in
the project's include graph (e.g. whatever module manages the include graph/file
index — functions named like invalidateDependents, invalidateFile, or
scheduleRebuild in your codebase), mark the saved file as up-to-date,
recursively invalidate all dependent translation units/header consumers, and
enqueue those dependents for rebuild or reanalysis (ensuring thread-safety via
the project's build queue or mutex). Also clear or update diagnostics for the
saved file if you have an API (e.g. publishDiagnostics/clearDiagnostics) so
language server state stays consistent. Ensure you reuse existing helpers
(invalidateDependents, scheduleRebuild, fileIndex/includeGraph) rather than
duplicating logic.
- Around line 469-480: The exit notification currently sets lifecycle and
schedules an async cleanup via loop.schedule(...) that calls
compile_graph.cancel_all(), co_await pool.stop(), then loop.stop(), but returns
immediately so loop may stop before pool.stop() completes; change the handler to
ensure cleanup completes before stopping the loop: either perform
synchronous/locking cleanup directly in the handler (call
compile_graph.cancel_all(); pool.stop() synchronously or via wait on its future)
and then call loop.stop(), or capture the scheduled task and wait for its
completion before calling loop.stop(); modify the code around
peer.on_notification(...) and the scheduled lambda that calls
compile_graph.cancel_all() and pool.stop() so that pool.stop() is guaranteed to
finish before loop.stop() is invoked (use explicit await/wait on the et::task
returned by loop.schedule or perform pool.stop() on the current thread).
- Around line 363-372: The current MasterServer::ensure_compiled stub always
returns true and must instead schedule a compile and wait for completion: locate
MasterServer::ensure_compiled (params path_id, uri) and, after verifying
documents contains path_id, enqueue or send a compile request for that document
(e.g., via your existing build/worker queue or a new
schedule_compile/send_compile_request helper), then wait asynchronously for the
document's compile-complete signal (use an existing per-document compiled
flag/version or add a promise/condition_variable or et::task that is fulfilled
when compilation finishes) with a sensible timeout; on successful compile return
true, on timeout or failure return false and log the error. Ensure you update or
set the document's compilation state (e.g., documents[path_id].compiled_version
or a documents[path_id].compile_promise) when workers report completion so the
awaiting ensure_compiled can resume.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: e0b5eaff-5334-4152-a84b-09caa6790b75
📒 Files selected for processing (14)
- docs/en/architecture.md
- src/clice.cc
- src/server/compile_graph.cpp
- src/server/compile_graph.h
- src/server/config.cpp
- src/server/master_server.cpp
- src/server/master_server.h
- src/server/stateful_worker.cpp
- src/server/stateless_worker.cpp
- src/server/worker_pool.cpp
- src/server/worker_pool.h
- tests/conftest.py
- tests/fixtures/transport.py
- tests/unit/server/worker_test_helpers.h
✅ Files skipped from review due to trivial changes (1)
- src/server/worker_pool.h
🚧 Files skipped from review as they are similar to previous changes (3)
- tests/conftest.py
- src/server/config.cpp
- src/server/compile_graph.h
- compile_graph: clean stale reverse edges on re-registration
- master_server: re-lookup document after co_await to avoid dangling ref
- master_server: don't set Ready until worker pool starts successfully
- master_server: pass worker_memory_limit to pool options
- master_server: send DocumentUpdateParams on didChange
- stateful_worker: use std::string in LRU list to avoid dangling StringRef
- tests: filter publishDiagnostics by target URI to prevent false matches
- test_file_operation: clear diagnostics latch before triggering rebuild

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Actionable comments posted: 3
♻️ Duplicate comments (2)
src/server/master_server.cpp (2)
218-227: ⚠️ Potential issue | 🟠 Major — Dependency rescans still don't update the live compile graph.

`scan_file()` always passes empty `content`, so `didOpen`/`didChange` rescans read includes from disk instead of the in-memory buffer. On top of that, `didChange()` refreshes `include_forward[path_id]` but never rebuilds `deps` and re-runs `compile_graph.register_unit(...)`, so the compile graph keeps the old adjacency. Unsaved `#include` edits therefore won't invalidate or prebuild the right dependents.

Minimal direction for the fix:

```diff
-void MasterServer::scan_file(std::uint32_t path_id, llvm::StringRef path) {
+void MasterServer::scan_file(std::uint32_t path_id,
+                             llvm::StringRef path,
+                             llvm::StringRef content) {
     auto ctx = cdb.lookup(path);
     if(ctx.arguments.empty())
         return;
     auto results = scan_fuzzy(ctx.arguments,
                               ctx.directory,
                               /*arguments_from_database=*/true,
-                              /*content=*/{},
+                              /*content=*/content,
                               &scan_cache);
```

Then call it with `doc.text` from `didOpen`/`didChange`, rebuild `deps` from `include_forward[path_id]`, and run `compile_graph.register_unit(path_id, deps)` before `compile_graph.update(path_id)`.

Also applies to: 504-517, 583-587
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/server/master_server.cpp` around lines 218 - 227, scan_file currently always passes empty content which causes didOpen/didChange rescans to read from disk; update scan_file to accept and use an in-memory content buffer (pass doc.text from didOpen/didChange into scan_file) so includes are parsed from the open buffer, then in didChange/didOpen after updating include_forward[path_id] rebuild the deps vector from include_forward[path_id] and call compile_graph.register_unit(path_id, deps) before calling compile_graph.update(path_id) to ensure the live compile graph adjacency is refreshed for unsaved `#include` edits.
547-565: ⚠️ Potential issue | 🟠 Major — Incremental edits still treat LSP characters as byte offsets.

`Position.character` is not a UTF-8 byte index, so `i + pos.character` misplaces replacements as soon as the buffer contains multibyte code points and can walk past the end of the target line. Use the same UTF-16→UTF-8 mapper used elsewhere before calling `doc.text.replace()`.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/server/master_server.cpp` around lines 547 - 565, Position.character is in UTF-16 code units, so replace the current pos_to_offset lambda with logic that maps a (line, UTF-16 character) to a UTF-8 byte offset using the same UTF-16→UTF-8 mapper used elsewhere in the codebase (instead of returning i + pos.character); call that mapper for both range.start and range.end before calling doc.text.replace(), ensure the mapper clamps to line length and that start/end are validated (start <= end and within doc.text.size()) before performing the replace.
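For reference, a self-contained sketch of the UTF-16-to-UTF-8 mapping the comment asks for (this is not the project's PositionMapper; it assumes the input line is valid UTF-8 and clamps past-the-end positions to the line length):

```cpp
#include <cstddef>
#include <string_view>

// Walk a UTF-8 encoded line, counting UTF-16 code units until
// `character` is reached; return the byte offset within the line.
// Code points >= U+10000 take 4 UTF-8 bytes and 2 UTF-16 code units
// (a surrogate pair); every other code point counts as 1 UTF-16 unit.
std::size_t utf16_to_utf8_offset(std::string_view line, std::size_t character) {
    std::size_t bytes = 0, units = 0;
    while (bytes < line.size() && units < character) {
        unsigned char c = line[bytes];
        std::size_t len = c < 0x80 ? 1 : (c < 0xE0 ? 2 : (c < 0xF0 ? 3 : 4));
        units += (len == 4) ? 2 : 1;  // 4-byte UTF-8 → surrogate pair
        bytes += len;
    }
    return bytes;  // clamped to the end of the line
}
```

Both `range.start` and `range.end` would go through this mapper before computing the byte span handed to `doc.text.replace()`.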
🧹 Nitpick comments (1)
tests/integration/test_server.py (1)
128-143: Wait for compilation before closing in this regression test.

This closes `main.cpp` immediately after `did_open()`, so the worker usually never has a compiled AST cached yet. The test can still pass even if `didClose` fails to block requests against previously compiled state. Reuse `_wait_for_compilation()` (or wait for `main.cpp` diagnostics) before `did_close()` so this actually covers the post-compile close path.

Suggested test hardening:

```diff
 async def test_feature_requests_after_close(client: LSPClient, test_data_dir):
     """Feature requests on closed file should return null."""
     workspace = test_data_dir / "hello_world"
-    await client.initialize(workspace)
-    await client.did_open("main.cpp")
+    await _wait_for_compilation(client, workspace)
     await client.did_close("main.cpp")

     # Hover on a closed file
     result = await client.send_request(
```
Verify each finding against the current code and only fix it if needed. In `@tests/integration/test_server.py` around lines 128 - 143, The test test_feature_requests_after_close closes main.cpp immediately after client.did_open, so the worker may not have compiled and cached the AST; modify the test to wait for compilation before calling client.did_close by invoking the existing helper _wait_for_compilation(workspace, "main.cpp") (or awaiting main.cpp diagnostics via the test harness) after client.did_open and before client.did_close so the test exercises the post-compile close path correctly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/server/compile_graph.cpp`:
- Around line 153-162: compile_impl() (used by compile_deps()) is marking units
clean without actually dispatching PCH/PCM builds to the WorkerPool; to fix,
change compile_impl() to enqueue BuildPCH/BuildPCM tasks on pool (use the same
request types the MasterServer integration will expect), set
final_unit.compiling = true before dispatch, attach a completion/future callback
that on success clears dirty and sets completion->set(), and on failure records
the error and leaves dirty=true; ensure you use units.find(path_id)->second to
update the same final_unit and propagate pool task results rather than
immediately logging success and clearing flags.
In `@src/server/master_server.cpp`:
- Around line 130-152: run_build_drain() can match a newly reopened document
because didClose()/didOpen() recreate documents with a reused generation; fix by
introducing a monotonic session/epoch that survives close/reopen or by retiring
in-flight builds when path_id is reused. Specifically: add a persistent
session_id (or epoch) to the document metadata and have run_build_drain()
capture and compare that session_id (not just generation) after re-lookup before
publishing diagnostics or flipping doc2.build_running/doc2.drain_scheduled;
ensure didOpen() increments/assigns a new session_id and didClose()/document
recreation prevents old tasks from acting on the new session. Apply the same
session_id check in the other affected locations mentioned (around lines
referenced 494-497 and 598-599).
- Around line 638-639: The handlers call co_await ensure_compiled(...) but
ignore its bool result, allowing send_stateful(...) to run for closed/unopened
docs and leak cached worker state; capture the returned bool from
ensure_compiled (e.g., auto ok = co_await ensure_compiled(...)) and if it is
false short-circuit the handler (return early / send a JSON-RPC null result)
instead of calling send_stateful(...); apply this change to every handler that
currently does co_await ensure_compiled(...) followed by send_stateful(...)
(ensure_compiled, send_stateful are the identifying symbols).
---
Duplicate comments:
In `@src/server/master_server.cpp`:
- Around line 218-227: scan_file currently always passes empty content which
causes didOpen/didChange rescans to read from disk; update scan_file to accept
and use an in-memory content buffer (pass doc.text from didOpen/didChange into
scan_file) so includes are parsed from the open buffer, then in
didChange/didOpen after updating include_forward[path_id] rebuild the deps
vector from include_forward[path_id] and call
compile_graph.register_unit(path_id, deps) before calling
compile_graph.update(path_id) to ensure the live compile graph adjacency is
refreshed for unsaved `#include` edits.
- Around line 547-565: Position.character is in UTF-16 code units, so replace
the current pos_to_offset lambda with logic that maps a (line, UTF-16 character)
to a UTF-8 byte offset using the same UTF-16→UTF-8 mapper used elsewhere in the
codebase (instead of returning i + pos.character); call that mapper for both
range.start and range.end before calling doc.text.replace(), ensure the mapper
clamps to line length and that start/end are validated (start <= end and within
doc.text.size()) before performing the replace.
---
Nitpick comments:
In `@tests/integration/test_server.py`:
- Around line 128-143: The test test_feature_requests_after_close closes
main.cpp immediately after client.did_open, so the worker may not have compiled
and cached the AST; modify the test to wait for compilation before calling
client.did_close by invoking the existing helper
_wait_for_compilation(workspace, "main.cpp") (or awaiting main.cpp diagnostics
via the test harness) after client.did_open and before client.did_close so the
test exercises the post-compile close path correctly.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: d4e9c1af-8292-4f7a-a827-e51cf12cec79
📒 Files selected for processing (5)
- src/server/compile_graph.cpp
- src/server/master_server.cpp
- src/server/stateful_worker.cpp
- tests/integration/test_file_operation.py
- tests/integration/test_server.py
🚧 Files skipped from review as they are similar to previous changes (2)
- tests/integration/test_file_operation.py
- src/server/stateful_worker.cpp
…rker protocol Replace URI strings with file paths and line/character pairs with byte offsets in the worker IPC protocol. Use PositionMapper for LSP position conversion in master_server, and adopt LOG_* macros in clice.cc. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Close stderr_fd on spawn failure in worker test helpers to prevent fd leak
- Guard send_stateful/send_stateless against empty worker pools to prevent out-of-bounds access when stateful_count or stateless_count is zero
- Short-circuit all stateful feature requests when ensure_compiled() returns false (document not found) instead of forwarding to worker

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
♻️ Duplicate comments (4)
src/server/master_server.cpp (3)
442-448: ⚠️ Potential issue | 🟠 Major — Wire `pool.on_evicted` before `pool.start()` to keep owner state accurate.

Without this callback, worker-side evictions do not update master ownership bookkeeping, so affinity/load tracking drifts over time.

🔧 Minimal wiring:

```diff
 pool_opts.self_path = self_path;
 pool_opts.stateful_count = config.stateful_worker_count;
 pool_opts.stateless_count = config.stateless_worker_count;
 pool_opts.worker_memory_limit = config.worker_memory_limit;
+pool.on_evicted = [this](const std::string& path) {
+    auto path_id = path_pool.intern(path);
+    pool.remove_owner(path_id);
+};
 if(!pool.start(pool_opts)) {
     LOG_ERROR("Failed to start worker pool");
     return;
 }
```
Verify each finding against the current code and only fix it if needed. In `@src/server/master_server.cpp` around lines 442 - 448, The WorkerPool eviction callback is not registered before starting the pool, so master ownership bookkeeping (affinity/load tracking) isn't updated when workers are evicted; wire pool.on_evicted(...) prior to calling pool.start(pool_opts) using the pool instance and ensure the callback updates master state (owner tables/affinity bookkeeping) when an eviction event occurs (refer to pool.on_evicted, pool.start, and the WorkerPool/WorkerPoolOptions usage) so evictions immediately propagate to the master's ownership records.
87-102: ⚠️ Potential issue | 🔴 Critical — Prevent stale build results from crossing close→reopen document sessions.

`run_build_drain()` validates only `generation`. After close/reopen, a new `documents[path_id]` entry can be mistaken for the old session, allowing stale compile results/flags to apply to the reopened document.

🛡️ Minimal direction:

```diff
-    auto gen = doc_it->second.generation;
+    auto gen = doc_it->second.generation;
+    auto session = doc_it->second.session_id;  // add to DocumentState
     ...
-    auto& doc2 = doc_it->second;
+    auto& doc2 = doc_it->second;
+    if(doc2.session_id != session) {
+        co_return;  // stale coroutine for prior document session
+    }
     if(result.has_value()) {
         if(doc2.generation == gen) {
             publish_diagnostics(uri, doc2.version, result.value().diagnostics);
         }
     }
```

Also increment/assign `session_id` on each `didOpen` creation path so sessions are monotonic across close/reopen.

Also applies to: 131-153, 495-499, 588-590
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/server/master_server.cpp` around lines 87 - 102, The drain routine run_build_drain currently only checks generation and can apply stale results to a reopened document; add a per-session monotonic identifier (e.g., session_id) to the Document state, increment/assign session_id whenever you create a document on didOpen, and then validate session_id (in addition to generation) inside run_build_drain before applying build results or flags; also update the other callsites that only check generation to include session_id verification so closed→reopen sessions cannot reuse prior build state.
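The session-plus-generation check can be modeled with a small synchronous sketch (all names here are assumptions mirroring the comment, not the real DocumentState or MasterServer):

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>
#include <utility>

// A monotonic session_id is assigned on every didOpen, so a drain
// coroutine started against a closed document can detect that path_id
// now refers to a brand-new session even if generation matches.
struct DocumentState {
    std::uint64_t session_id = 0;
    std::uint64_t generation = 0;
    std::string text;
};

struct Server {
    std::uint64_t next_session = 1;
    std::unordered_map<std::uint32_t, DocumentState> documents;

    void did_open(std::uint32_t path_id, std::string text) {
        documents[path_id] = DocumentState{next_session++, 0, std::move(text)};
    }

    void did_close(std::uint32_t path_id) { documents.erase(path_id); }

    // True only if the document is still the same session and
    // generation the build was started against.
    bool may_publish(std::uint32_t path_id, std::uint64_t session,
                     std::uint64_t gen) const {
        auto it = documents.find(path_id);
        return it != documents.end() && it->second.session_id == session &&
               it->second.generation == gen;
    }
};
```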
224-228: ⚠️ Potential issue | 🟠 Major — Use in-memory document content for dependency rescans.

`scan_file()` always passes empty content, so `didChange` rescans still read from disk and can miss unsaved `#include` edits.

♻️ Suggested fix:

```diff
-void MasterServer::scan_file(std::uint32_t path_id, llvm::StringRef path) {
+void MasterServer::scan_file(std::uint32_t path_id,
+                             llvm::StringRef path,
+                             std::string_view content) {
     ...
     auto results = scan_fuzzy(ctx.arguments,
                               ctx.directory,
                               /*arguments_from_database=*/true,
-                              /*content=*/{},
+                              /*content=*/content,
                               &scan_cache);
```

```diff
-    scan_file(path_id, path);
+    scan_file(path_id, path, doc.text);
```

Also applies to: 573-575
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/server/master_server.cpp` around lines 224 - 228, scan_fuzzy is being called with an empty /*content=*/{} so rescans (e.g., from didChange) read from disk and miss unsaved edits; change the call sites (the scan_fuzzy invocation around the presented diff and the similar one at lines 573-575) to pass the in-memory document text when available (e.g., use the buffer/content held on ctx/ctx.arguments or the in-memory document cache used by scan_file), falling back to empty only if no in-memory content exists; ensure you propagate the same content to scan_fuzzy and any scan_file calls that currently pass /*content=*/{} (use ctx.arguments, ctx.directory, and scan_cache to locate the in-memory content).

tests/unit/server/worker_test_helpers.h (1)
52-56: ⚠️ Potential issue | 🟠 Major — Use unique temp/log paths to avoid cross-test collisions.

Line 55 and Line 97 still build deterministic `/tmp/...` paths. Parallel test runs can overwrite each other's source/log files and make failures nondeterministic.

♻️ Suggested fix:

```diff
+#include <atomic>
 ...
 namespace {
+inline std::string make_unique_tmp_name(const std::string& prefix, const std::string& suffix = "") {
+#ifndef _WIN32
+    static std::atomic<std::uint64_t> seq{0};
+    llvm::SmallString<256> tmp_dir;
+    llvm::sys::path::system_temp_directory(true, tmp_dir);
+    auto name = prefix + "_" + std::to_string(::getpid()) + "_" + std::to_string(seq++) + suffix;
+    llvm::sys::path::append(tmp_dir, name);
+    return std::string(tmp_dir);
+#else
+    llvm::SmallString<256> tmp_dir;
+    llvm::sys::path::system_temp_directory(true, tmp_dir);
+    llvm::sys::path::append(tmp_dir, prefix + suffix);
+    return std::string(tmp_dir);
+#endif
+}
 ...
 TempFile(const std::string& name, const std::string& content) {
-    llvm::SmallString<256> tmp_dir;
-    llvm::sys::path::system_temp_directory(true, tmp_dir);
-    llvm::sys::path::append(tmp_dir, "clice_test_" + name);
-    path = std::string(tmp_dir);
+    path = make_unique_tmp_name("clice_test_" + name);
     std::ofstream ofs(path);
     ofs << content;
 }
 ...
-std::string stderr_path = "/tmp/clice_worker_stderr_" + mode + ".log";
+std::string stderr_path = make_unique_tmp_name("clice_worker_stderr_" + mode, ".log");
 stderr_fd = ::open(stderr_path.c_str(), O_WRONLY | O_CREAT | O_TRUNC, 0644);
```

Also applies to: 97-98
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/unit/server/worker_test_helpers.h` around lines 52 - 56, The TempFile constructor currently builds a deterministic temp path ("clice_test_"+name) which can collide in parallel tests; update TempFile (and the other similar path code around lines 97-98) to create unique directories/files by appending a per-run unique suffix (e.g., process id, timestamp, or better: use llvm::sys::fs::createUniqueDirectory or createUniqueFile API) to the "clice_test_"+name component so each test gets its own temp/log path and avoid cross-test overwrites.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@src/server/master_server.cpp`:
- Around line 442-448: The WorkerPool eviction callback is not registered before
starting the pool, so master ownership bookkeeping (affinity/load tracking)
isn't updated when workers are evicted; wire pool.on_evicted(...) prior to
calling pool.start(pool_opts) using the pool instance and ensure the callback
updates master state (owner tables/affinity bookkeeping) when an eviction event
occurs (refer to pool.on_evicted, pool.start, and the
WorkerPool/WorkerPoolOptions usage) so evictions immediately propagate to the
master's ownership records.
- Around line 87-102: The drain routine run_build_drain currently only checks
generation and can apply stale results to a reopened document; add a per-session
monotonic identifier (e.g., session_id) to the Document state, increment/assign
session_id whenever you create a document on didOpen, and then validate
session_id (in addition to generation) inside run_build_drain before applying
build results or flags; also update the other callsites that only check
generation to include session_id verification so closed→reopen sessions cannot
reuse prior build state.
- Around line 224-228: scan_fuzzy is being called with an empty /*content=*/{}
so rescans (e.g., from didChange) read from disk and miss unsaved edits; change
the call sites (the scan_fuzzy invocation around the presented diff and the
similar one at lines 573-575) to pass the in-memory document text when available
(e.g., use the buffer/content held on ctx/ctx.arguments or the in-memory
document cache used by scan_file), falling back to empty only if no in-memory
content exists; ensure you propagate the same content to scan_fuzzy and any
scan_file calls that currently pass /*content=*/{} (use ctx.arguments,
ctx.directory, and scan_cache to locate the in-memory content).
In `@tests/unit/server/worker_test_helpers.h`:
- Around line 52-56: The TempFile constructor currently builds a deterministic
temp path ("clice_test_"+name) which can collide in parallel tests; update
TempFile (and the other similar path code around lines 97-98) to create unique
directories/files by appending a per-run unique suffix (e.g., process id,
timestamp, or better: use llvm::sys::fs::createUniqueDirectory or
createUniqueFile API) to the "clice_test_"+name component so each test gets its
own temp/log path and avoid cross-test overwrites.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 01f17746-cb5c-490d-9827-086ba0b3a590
📒 Files selected for processing (3)
- src/server/master_server.cpp
- src/server/worker_pool.h
- tests/unit/server/worker_test_helpers.h
…workers

- Add stderr_logger for master server in pipe/socket modes so logs are visible
- Remove scan_file, compile_graph deps check, and background indexer from build flow (not needed until PCH/PCM support is added)
- Add ast_ready event to stateful worker so feature handlers wait for compilation instead of returning null immediately
- Remove dirty document early-return in feature handlers (stale results > null)
- Rewrite integration tests with pygls and stronger assertions for features
- Format all changed files

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Force-pushed from 47a4d06 to 9407f52
- Remove redundant lifecycle checks from feature request handlers
- Remove explicit template args from send_stateful/send_stateless calls
- Extract with_ast() helper in stateful worker for AST lock pattern
- Extract ScopedTimer, fill_args, to_raw helpers in both workers
- Simplify lifecycle test to match new server behavior

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Leverage eventide#72 which allows on_request to accept RawValue return type with the typed overload, inferring method from params. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The hello_world test file included <iostream> which compiles very slowly under ASAN on Windows, causing timeout failures. Replace with self-contained arithmetic code and update test positions accordingly. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add forward_stateful, forward_stateful (with position), and forward_stateless template helpers to eliminate repeated boilerplate across 10 feature request handlers in MasterServer. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace manual loop + push_back with ranges::views::transform and ranges::to<vector>. Remove unused <cstddef> include. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@coderabbitai review

✅ Actions performed: Review triggered.
Add lsp, refl namespace aliases to reduce eventide::ipc::lsp:: and eventide::refl:: verbosity. Fix ranges::to pipe style for clang-21. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Actionable comments posted: 3
♻️ Duplicate comments (3)
src/server/stateless_worker.cpp (1)
62-95: ⚠️ Potential issue | 🟠 Major — BuildPCH/BuildPCM results don't include the generated output path.

The handlers create temporary files (`clice-pch.*.pch`, `clice-pcm.*.pcm`) but only return `{success, error}`. The caller (master server) cannot know where the PCH/PCM file was written. If the caller needs to pass this path to subsequent compilations (e.g., for code completion with `cp.pch`), the current design won't support that workflow.

If this is intentional (e.g., stateless workers are ephemeral and paths aren't reused), consider adding a comment clarifying the design. Otherwise, extend `BuildPCHResult`/`BuildPCMResult` in `protocol.h` to include `output_path`.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/server/stateless_worker.cpp` around lines 62 - 95, The BuildPCH handler currently writes the PCH to a temporary file but returns only success/error; update the protocol and handlers to return the generated path so the master can reuse it: add an output_path (string) field to BuildPCHResult (and BuildPCMResult) in protocol.h, have the BuildPCH lambda capture the created temporary path (cp.output_file / tmp) and set that output_path on success when returning {true, ""} (and similarly update the BuildPCM handler to populate output_path on success); ensure the handlers return the new struct shape and update any callsites/serializers that construct or consume BuildPCHResult/BuildPCMResult accordingly or add a comment if the stateless design intentionally disallows reuse.

src/server/stateful_worker.cpp (1)
217-231: ⚠️ Potential issue | 🟠 Major — Data race: `DocumentUpdate` modifies `doc.text` without acquiring the strand.

While a compile is in progress (holding the strand and running on the thread pool with `doc.text`), a `DocumentUpdate` notification can arrive and modify `doc.text` without synchronization:

```cpp
// DocumentUpdate handler (no strand lock)
doc.version = params.version;
doc.text = params.text;  // Race with compile lambda using doc.text
doc.dirty.store(true, std::memory_order_release);
```

This can corrupt data if the compile's thread-pool lambda is reading `doc.text` while the IPC thread writes to it.

🛡️ Suggested fix: acquire strand in DocumentUpdate

```diff
 peer.on_notification([this](const worker::DocumentUpdateParams& params) {
     LOG_TRACE("DocumentUpdate: path={}, version={}", params.path, params.version);
     auto it = documents.find(params.path);
     if(it == documents.end()) {
         LOG_TRACE("DocumentUpdate ignored (not tracked): path={}", params.path);
         return;
     }
     auto& doc = *it->second;
+    // Must lock strand to safely update doc.text while compile may be running
+    loop.schedule([&doc, params]() -> et::task<> {
+        co_await doc.strand.lock();
+        doc.version = params.version;
+        doc.text = params.text;
+        doc.dirty.store(true, std::memory_order_release);
+        doc.strand.unlock();
+    }());
-    doc.version = params.version;
-    doc.text = params.text;
-    doc.dirty.store(true, std::memory_order_release);
 });
```
Verify each finding against the current code and only fix it if needed. In `@src/server/stateful_worker.cpp` around lines 217 - 231, the DocumentUpdate notification handler mutates shared DocumentState (doc.version, doc.text, doc.dirty) without using the worker strand, causing a data race with compile lambdas that access doc.text; fix by running the update on the same strand as other document operations (i.e., dispatch the DocumentUpdate handler body to the worker strand or acquire the same strand guard used elsewhere) so modifications to the DocumentState (the object referenced by documents.find(...), its version, text, and dirty atomic) occur serialized with the compile lambda; update the peer.on_notification(DocumentUpdate) lambda to capture params and forward the mutation work onto the strand before assigning doc.version/doc.text/doc.dirty.

src/server/master_server.cpp (1)
377-392: ⚠️ Potential issue | 🟡 Minor

Missing: `WorkerPool::on_evicted` callback is not wired up.

The worker pool's `on_evicted` callback should be assigned before `pool.start()` so that when a stateful worker evicts a document, the master can update ownership tracking. Without this, eviction notifications from workers won't update the master's affinity routing.

🔧 Suggested fix: wire up on_evicted

```diff
 WorkerPoolOptions pool_opts;
 pool_opts.self_path = self_path;
 pool_opts.stateful_count = config.stateful_worker_count;
 pool_opts.stateless_count = config.stateless_worker_count;
 pool_opts.worker_memory_limit = config.worker_memory_limit;

+pool.on_evicted = [this](const std::string& path) {
+    auto path_id = path_pool.intern(path);
+    pool.remove_owner(path_id);
+};

 if(!pool.start(pool_opts)) {
```

🤖 Prompt for AI Agents
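The wiring pattern being suggested (register the eviction callback before the pool starts delivering events, then clear the affinity entry when a worker reports an eviction) can be sketched as a small Python model. All names here (`WorkerPool`, `Master`, `intern`, `remove_owner`, `_worker_evicted`) are illustrative, not the repo's actual API:

```python
class WorkerPool:
    """Minimal model: fires on_evicted when a worker drops a document."""

    def __init__(self):
        self.on_evicted = None  # callback(path) -> None; must be set before start()
        self.owner = {}         # path_id -> worker index (affinity routing table)

    def remove_owner(self, path_id):
        self.owner.pop(path_id, None)

    def _worker_evicted(self, path):
        # Simulates an eviction notification arriving from a worker process.
        if self.on_evicted is not None:
            self.on_evicted(path)


class Master:
    def __init__(self):
        self.path_ids = {}
        self.pool = WorkerPool()
        # Wire the callback before the pool starts delivering eviction
        # events, so ownership tracking can never miss one.
        self.pool.on_evicted = self._handle_evicted

    def intern(self, path):
        return self.path_ids.setdefault(path, len(self.path_ids))

    def _handle_evicted(self, path):
        self.pool.remove_owner(self.intern(path))


master = Master()
pid = master.intern("/src/a.cpp")
master.pool.owner[pid] = 0                 # worker 0 owns the document
master.pool._worker_evicted("/src/a.cpp")  # worker reports eviction
print(pid in master.pool.owner)            # ownership entry is cleared
```

If the callback were assigned after the first eviction fired, the owner map would silently go stale, which is exactly the routing drift the finding describes.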
Verify each finding against the current code and only fix it if needed. In `@src/server/master_server.cpp` around lines 377 - 392, The WorkerPool's on_evicted callback is never set before calling pool.start(), so stateful worker evictions aren't propagated to the master; assign pool.on_evicted to a handler that updates master ownership/affinity tracking (the existing master method that handles evictions) before calling pool.start() and use pool_opts as configured; ensure the handler captures or references the master instance/state needed to update routing so eviction events from workers trigger the same ownership-update logic the master uses elsewhere.
🧹 Nitpick comments (8)
src/syntax/scan.cpp (1)
234-266: Minor inefficiency: empty filtered results will be recomputed on each call.

The condition `if(slot && !slot->empty())` returns cached results only when the slot is non-null AND non-empty. If all directives are filtered out (producing an empty vector), subsequent calls for the same entry will re-filter every time. Consider using a different sentinel to distinguish "not yet computed" from "computed but empty":
♻️ Suggested fix
```diff
 llvm::ArrayRef<clang::dependency_directives_scan::Directive>
 get_directives(SharedScanCache::CachedEntry& entry) {
     if(mode == ScanMode::Precise) {
         return entry.directives;
     }

     // Fuzzy mode: strip `#define`/`#undef` and ALL conditional directives,
     // so every `#include` is processed unconditionally by the preprocessor.
     auto& slot = filtered_directives[&entry];
-    if(slot && !slot->empty()) {
+    if(slot) {
         return *slot;
     }

     slot = std::make_unique<DirectiveVec>();
```

🤖 Prompt for AI Agents
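The sentinel distinction is the crux of the fix: a null slot means "not computed yet", while an allocated-but-empty vector is a valid cached result. A small Python analogue of the same pattern, using `None` as the "not computed" sentinel and counting recomputations (the names and filtering rule are illustrative):

```python
recomputes = 0
cache = {}  # key -> list, absent/None means "not computed yet"


def get_filtered(key, raw):
    # A slot of None means "not computed yet"; an empty list is a valid,
    # cached result. Testing truthiness instead (the `slot && !slot->empty()`
    # bug) would recompute empty results on every call.
    global recomputes
    slot = cache.get(key)
    if slot is not None:
        return slot
    recomputes += 1
    result = [d for d in raw if d.startswith("#include")]
    cache[key] = result  # an empty list is cached too
    return result


raw = ["#define X 1", "#ifdef X", "#endif"]  # everything gets filtered out
get_filtered("a.h", raw)
get_filtered("a.h", raw)
print(recomputes)  # second call hits the cached empty list
```

With the buggy truthiness check, `recomputes` would be 2 here; with the `is not None` sentinel it stays at 1.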
Verify each finding against the current code and only fix it if needed. In `@src/syntax/scan.cpp` around lines 234 - 266, the cached result check currently only returns when the slot is non-null AND non-empty, causing empty filtered DirectiveVec results to be recomputed; update the check to treat a non-null slot as "computed" (i.e., `if(slot) return *slot;`) so that an allocated empty DirectiveVec is cached and reused; adjust the condition around filtered_directives[&entry] (the slot variable) and keep the existing allocation via std::make_unique<DirectiveVec>() so empty results are stored and subsequent calls return the cached empty vector.

tests/conftest.py (1)
149-171: Consider logging teardown exceptions instead of silently swallowing them.

The static analysis flags multiple `try-except-pass` blocks catching broad `Exception`. While swallowing errors in test teardown is common to avoid masking test failures, logging these exceptions would help debug flaky tests.

♻️ Proposed fix: add debug logging
```diff
+import logging
+logger = logging.getLogger(__name__)
+
 # Graceful shutdown
 try:
     await asyncio.wait_for(c.shutdown_async(None), timeout=3.0)
-except Exception:
-    pass
+except Exception as e:
+    logger.debug("shutdown_async failed: %s", e)

 try:
     c.exit(None)
-except Exception:
-    pass
+except Exception as e:
+    logger.debug("exit failed: %s", e)

 # Wait briefly, then force-kill if still running
 await asyncio.sleep(0.3)
 if hasattr(c, "_server") and c._server is not None and c._server.returncode is None:
     c._server.kill()

 # Stop pygls client (with timeout to avoid hanging)
 try:
     c._stop_event.set()
     for task in c._async_tasks:
         task.cancel()
     await asyncio.sleep(0.1)
-except Exception:
-    pass
+except Exception as e:
+    logger.debug("client cleanup failed: %s", e)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/conftest.py` around lines 149 - 171, the teardown code currently swallows all exceptions (try/except: pass) making flakes hard to debug; update each except block that wraps c.shutdown_async, c.exit, the c._server.kill branch, and the c._stop_event/task cancellation block to catch Exception as e and log the exception (e.g., logger.debug or logger.exception with exc_info=True) including contextual text (like "shutdown_async failed", "exit failed", "server kill failed", "async tasks stop failed") while still allowing teardown to continue; reference the symbols c.shutdown_async, c.exit, c._server.kill, c._stop_event.set, and c._async_tasks/task.cancel so the logging is added next to the corresponding try/except blocks.

src/clice.cc (1)
117-156: Socket mode accepts only one client, then exits.

The socket mode accepts a single connection, serves it, then stops the event loop and exits. This is fine for debugging/testing but may not match user expectations for a socket server. Consider adding a comment clarifying this is intentional for debugger attachment, or implement a loop to accept multiple connections.
💡 Optional: Document single-client behavior or add comment
```diff
 if(mode == "socket") {
     clice::logging::stderr_logger("master", clice::logging::options);

+    // Note: Socket mode accepts only one client for debugger attachment.
+    // For multi-client support, run multiple instances or use pipe mode.
     namespace et = eventide;
     et::event_loop loop;
```

🤖 Prompt for AI Agents
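For reference, the persistent accept loop described in the alternative is easy to demonstrate with `asyncio.start_server`, which keeps accepting connections and spawns a handler per client. This is a standalone sketch, not the clice code: a line-echo handler stands in for a per-client JSON-RPC peer, and two sequential clients are both served by one server instance.

```python
import asyncio


async def handle_client(reader, writer):
    # Per-connection handler: echo one line back, analogous to
    # serving one LSP peer per accepted connection.
    data = await reader.readline()
    writer.write(data)
    await writer.drain()
    writer.close()
    await writer.wait_closed()


async def main():
    # start_server runs a persistent accept loop internally and calls
    # handle_client for every connection, so the server outlives any
    # one client instead of stopping after the first.
    server = await asyncio.start_server(handle_client, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]

    async def client(msg):
        r, w = await asyncio.open_connection("127.0.0.1", port)
        w.write(msg)
        await w.drain()
        reply = await r.readline()
        w.close()
        await w.wait_closed()
        return reply

    # Two sequential clients: both succeed because accepting never stops.
    first = await client(b"hello\n")
    second = await client(b"world\n")
    server.close()
    await server.wait_closed()
    return first, second


print(asyncio.run(main()))
```

A single-shot accept (the current behavior) corresponds to calling `accept` once and tearing the loop down afterwards; the comment-only fix documents that choice instead of changing it.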
Verify each finding against the current code and only fix it if needed. In `@src/clice.cc` around lines 117 - 156, the socket branch currently accepts one client in the task lambda (acceptor->accept()), serves it, then calls loop.stop() and exits; either add a clear comment above the mode == "socket" block stating this single-client, debugger-oriented behavior, or change the implementation to accept multiple clients by turning the task lambda into a persistent accept loop that repeatedly calls acceptor->accept(), spawns a per-client handler (creating et::ipc::StreamTransport, et::ipc::JsonPeer, and clice::MasterServer for each connection) without calling loop.stop() after each client, and only stops the loop on fatal error or external shutdown. Ensure references to acceptor->accept(), the task lambda, loop.stop(), et::ipc::JsonPeer, and clice::MasterServer are updated accordingly.

tests/integration/test_lifecycle.py (1)
27-38: Consider adding an assertion after shutdown.

The `test_shutdown` test calls `shutdown_async` but doesn't assert anything about the result. Adding a simple assertion (e.g., that no exception was raised or that the result is `None` per LSP spec) would make the test more explicit.

💡 Optional: Add shutdown result assertion
```diff
 client.initialized(InitializedParams())

-await client.shutdown_async(None)
+result = await client.shutdown_async(None)
+assert result is None  # LSP spec: shutdown returns null
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/integration/test_lifecycle.py` around lines 27 - 38, the test_shutdown test calls client.shutdown_async(None) but does not assert its result; update the test_shutdown function to capture the return value of client.shutdown_async (e.g., result = await client.shutdown_async(None)) and add an assertion that the result is None (or another expected value per the LSP spec) to make the test explicit that shutdown returned the expected value; reference the test name test_shutdown and the shutdown_async method when making the change.

tests/integration/test_file_operation.py (1)
53-58: Consider reducing the hard-coded sleep in `test_did_open`.

The test uses a 5-second sleep after `did_open`. If the goal is to verify the server handles the notification without crashing, a shorter sleep (or waiting for diagnostics) would make CI faster.

💡 Optional: Use diagnostics wait instead of fixed sleep
```diff
 @pytest.mark.asyncio
 async def test_did_open(client, test_data_dir):
     workspace = test_data_dir / "hello_world"
     await _init(client, workspace)
-    _open(client, workspace / "main.cpp")
-    await asyncio.sleep(5)
+    uri, _ = _open(client, workspace / "main.cpp")
+    event = client.wait_for_diagnostics(uri)
+    await asyncio.wait_for(event.wait(), timeout=30.0)
```

🤖 Prompt for AI Agents
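The event-based wait the suggestion relies on can be shown in isolation. A minimal sketch, with a timer standing in for the server publishing diagnostics shortly after `didOpen` (the 50ms delay is arbitrary): the waiter resumes as soon as the event fires, and the timeout only bounds the worst case instead of being paid on every run.

```python
import asyncio


async def main():
    loop = asyncio.get_running_loop()
    diagnostics_ready = asyncio.Event()

    # Simulated server: "publishes diagnostics" 50ms after didOpen.
    loop.call_later(0.05, diagnostics_ready.set)

    start = loop.time()
    # Resume as soon as the event fires; the 30s timeout is only a
    # worst-case bound, unlike a fixed 5-second sleep paid every run.
    await asyncio.wait_for(diagnostics_ready.wait(), timeout=30.0)
    return loop.time() - start


elapsed = asyncio.run(main())
print(elapsed < 5)
```

With a fixed `asyncio.sleep(5)` the test always costs 5 seconds; with the event wait it costs roughly as long as the server actually takes.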
Verify each finding against the current code and only fix it if needed. In `@tests/integration/test_file_operation.py` around lines 53 - 58, the test_did_open uses a hard-coded 5s sleep after calling _open; replace this with a deterministic wait for the server reaction: either shorten the sleep to a small value (e.g., 0.1–0.5s) if only crash/no-crash is being validated, or better, await a diagnostics/notification helper instead of sleeping (e.g., wait for "textDocument/publishDiagnostics" or call an existing wait_for_diagnostics helper) after invoking _open so the test proceeds as soon as the server responds; update test_did_open to use the notification/diagnostics wait and remove the fixed 5-second asyncio.sleep.

tests/integration/test_server.py (1)
139-148: Use a more specific exception type in `pytest.raises`.

The test catches the broad `Exception` type, which could mask unrelated errors. The LSP server should return a specific error when receiving a second `initialize` request. Consider catching a more specific exception type if pygls provides one (e.g., `JsonRpcException` or similar).

♻️ Optional: Use specific exception type
```diff
 @pytest.mark.asyncio
 async def test_double_initialize_rejected(client, test_data_dir):
     await _initialize(client, test_data_dir)

-    with pytest.raises(Exception):
+    # LSP spec: server should return InvalidRequest error for duplicate initialize
+    with pytest.raises(Exception) as exc_info:
         await client.initialize_async(
             InitializeParams(
                 capabilities=ClientCapabilities(),
                 workspace_folders=[],
             )
         )
+    # Optionally verify it's the expected error type/code
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/integration/test_server.py` around lines 139 - 148, the test test_double_initialize_rejected currently catches the broad Exception; change it to assert the specific RPC error raised when sending a second initialize request by replacing pytest.raises(Exception) with pytest.raises(JsonRpcException) (or the concrete exception type exported by pygls in this repo, e.g., JsonRpcError/RpcError), add the corresponding import, and keep the rest of the call to client.initialize_async(InitializeParams(...)) unchanged so the test verifies the LSP server returns the expected specific error for double-initialization.

src/server/stateless_worker.cpp (1)
204-213: Index handler returns placeholder data.

The `TODO` comment indicates that TUIndex generation from the compilation unit is not yet implemented. The handler currently returns an empty `tu_index_data` string on success, which may cause issues for callers expecting valid index data.

Would you like me to help implement the TUIndex generation, or should I open an issue to track this work?
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/server/stateless_worker.cpp` around lines 204 - 213, the handler currently returns an empty tu_index_data; replace the TODO by generating a TUIndex from the compilation unit `unit` and returning its serialized form as the third element of the result tuple. Locate the block using `auto unit = compile(cp);` and, after the `unit.completed()` check and before the success return, call the codepath that converts the compilation unit into a `TUIndex` (e.g., via the project's API such as a method on the compilation unit like `unit.toTUIndex()` or a helper `buildTUIndex(unit)`), serialize it (e.g., `serializeTUIndex(tuIndex)` or `tuIndex.SerializeAsString()`), handle any serialization errors by logging and returning failure, and then return `{true, "", serialized_tu_index}` instead of the empty string. Ensure you reference `TUIndex`, `compile(cp)`, `unit.completed()`, and the result tuple when making the change.

tests/unit/server/stateful_worker_tests.cpp (1)
410-436: Test validates no-crash behavior but doesn't test actual eviction.

This test sends an evict notification for a document that was never compiled, so the hover returning `"null"` doesn't actually verify eviction behavior; it was already going to return `"null"` since the document never existed in the worker.

Consider adding a test that:
- Compiles a document (hover returns non-null)
- Sends evict notification
- Verifies hover now returns `"null"`

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/unit/server/stateful_worker_tests.cpp` around lines 410 - 436, The test EvictNotification currently only sends an EvictParams for a document that was never present, so change the test to first ensure the worker has the document compiled (call whatever request the suite uses to compile/parse a document so hover returns non-null for "/tmp/evict_test.cpp" — e.g., send the compile/open request and await a successful response), then perform worker::EvictParams ep as before, and finally re-run worker::HoverParams hp and assert the hover result is now "null"; keep using WorkerHandle methods spawn, run, peer->send_request/send_notification and peer->close_output to locate where to make the changes.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/clice.cc`:
- Around line 81-84: The init_resource_dir failure is only logged but execution
continues; update main so that after calling
clice::fs::init_resource_dir(self_path) it returns a non-zero exit (e.g., return
1) on failure instead of continuing, or explicitly document why continuing is
safe; locate the call using symbols self_path, llvm::sys::fs::getMainExecutable,
clice::fs::init_resource_dir and LOG_ERROR in main and make the function exit
early with an error code when init_resource_dir returns false.
In `@src/server/stateless_worker.cpp`:
- Around line 32-39: fill_args currently pushes arg.c_str() into cp.arguments
causing dangling pointers because those c_str() values refer to the
temporary/outer arguments vector (see fill_args, CompilationParams,
cp.arguments, and params captured into et::queue). Fix by making
CompilationParams own the strings (e.g., add a std::vector<std::string> member
like arguments_owned or change cp.arguments to be vector<string>) and update
fill_args to copy each argument into that owned vector before storing any char*
references (or remove char* usage and pass strings directly) so the data
outlives the queued work.
In `@tests/conftest.py`:
- Around line 115-128: The current compile_commands.json generator always embeds
"clang++" which may not exist; change the logic around hw_dir/main_cpp/cdb_path
to detect an available C++ compiler (e.g. check shutil.which for "clang++",
"g++", "c++" or "cc") and use the first found executable in the "arguments"
array, or fall back to emitting a compiler-agnostic entry (use "command" with
the detected binary or include only the source file and flags) so the LSP can
run on machines without clang++; update the code that writes cdb (the cdb list
creation) to substitute the detected compiler executable name accordingly.
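The compiler-detection fallback described in that prompt can be sketched with `shutil.which`. The candidate list and the entry shape follow the compile_commands.json convention; `detect_cxx_compiler` and `make_cdb_entry` are hypothetical helper names, not the conftest's actual code:

```python
import shutil


def detect_cxx_compiler(candidates=("clang++", "g++", "c++", "cc")):
    """Return the first C++ compiler found on PATH, or None."""
    for name in candidates:
        path = shutil.which(name)
        if path is not None:
            return path
    return None


def make_cdb_entry(directory, source):
    # Fall back to a bare "c++" spelling if nothing is found on PATH,
    # so the generated entry is still well-formed on minimal machines.
    compiler = detect_cxx_compiler() or "c++"
    return {
        "directory": directory,
        "file": source,
        "arguments": [compiler, "-std=c++20", "-c", source],
    }


entry = make_cdb_entry("/tmp/hw", "main.cpp")
print(bool(entry["arguments"][0]))
```

This keeps the generated compile_commands.json usable on machines where `clang++` specifically is absent but another C++ compiler exists.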
---
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 32c90f32-97dd-49ca-93fe-aec370c99937
⛔ Files ignored due to path filters (1)
`pixi.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (26)
- CMakeLists.txt
- pixi.toml
- src/clice.cc
- src/compile/compilation.cpp
- src/compile/compilation.h
- src/compile/toolchain.cpp
- src/server/master_server.cpp
- src/server/master_server.h
- src/server/stateful_worker.cpp
- src/server/stateless_worker.cpp
- src/server/worker_pool.h
- src/syntax/scan.cpp
- src/syntax/scan.h
- tests/conftest.py
- tests/data/hello_world/main.cpp
- tests/fixtures/__init__.py
- tests/fixtures/client.py
- tests/fixtures/transport.py
- tests/integration/test_file_operation.py
- tests/integration/test_lifecycle.py
- tests/integration/test_server.py
- tests/unit/compile/toolchain_tests.cpp
- tests/unit/server/stateful_worker_tests.cpp
- tests/unit/server/worker_test_helpers.h
- tests/unit/syntax/scan_tests.cpp
- tests/unit/test/tester.cpp
💤 Files with no reviewable changes (5)
- tests/unit/test/tester.cpp
- src/compile/compilation.h
- src/syntax/scan.h
- tests/fixtures/client.py
- tests/fixtures/transport.py
✅ Files skipped from review due to trivial changes (2)
- pixi.toml
- tests/unit/server/worker_test_helpers.h
🚧 Files skipped from review as they are similar to previous changes (3)
- src/compile/toolchain.cpp
- src/server/worker_pool.h
- src/server/master_server.h
Summary
Implements the core multi-process LSP server architecture for clice:
Key design decisions
`--socket <port>`) for debugger attachment during development

Also includes
`#include <iostream>` for faster ASAN CI

Test plan
Summary by CodeRabbit
Release Notes
New Features
`clice.toml` with workspace variable substitution

Tests
Bug Fixes