Skip to content

feat(server): improve configuration file handling#423

Open
16bit-ykiko wants to merge 1 commit intomainfrom
improve/config-handling
Open

feat(server): improve configuration file handling#423
16bit-ykiko wants to merge 1 commit intomainfrom
improve/config-handling

Conversation

@16bit-ykiko
Copy link
Copy Markdown
Member

@16bit-ykiko 16bit-ykiko commented Apr 12, 2026

Summary

  • [[rules]]: TOML array-of-tables config for per-file compilation flag rules with glob pattern matching (append/remove fields). Patterns are pre-compiled at config load time.
  • CDB auto-scan: Default search scans workspace root + all immediate subdirectories for compile_commands.json, replacing the hardcoded directory list.
  • LSP initializationOptions: Clients can pass config as a JSON object via the LSP initialize request, with priority over clice.toml.
  • XDG cache paths: Default cache/index/logging paths prefer $XDG_CACHE_HOME/clice/<workspace-hash>/, falling back to <workspace>/.clice/ on failure.
  • Partial config support: All TOML/JSON fields are now optional via eventide::serde::defaulted<T>, so minimal config files work correctly.

Test plan

  • 11 new unit tests covering config parsing, match_rules, apply_defaults, and edge cases
  • 529 unit tests pass
  • 119 integration tests pass (persistent cache tests pinned to workspace cache dir)
  • 2 smoke tests pass

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features
    • Added support for XDG cache directory standards with automatic fallback
    • Configuration can now be provided via JSON in initialization options
    • Pattern-based rules enable fine-grained compiler flag control per file
    • Enhanced compile_commands.json discovery with automatic fallback scanning

Restructure CliceConfig into nested ProjectConfig with per-file
compilation rules, XDG cache paths, initializationOptions support,
and auto-scanning CDB discovery.

- Add [[rules]] support with pre-compiled glob pattern matching
- Default cache/index/log paths to $XDG_CACHE_HOME/clice/<hash>/
- Accept config via LSP initializationOptions JSON
- Auto-scan workspace root + immediate subdirs for compile_commands.json
- Use eventide defaulted<T> for optional TOML/JSON fields
- Make GlobPattern::match() const

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Apr 12, 2026

📝 Walkthrough

Walkthrough

The PR restructures configuration from a flat schema to a nested ProjectConfig model, introduces glob pattern–based per-file rule matching, refactors cache/index paths throughout the codebase, adds JSON-based config loading from LSP initialization options, and enhances compile command discovery with auto-scanning fallback.

Changes

Cohort / File(s) Summary
Configuration Restructuring
src/server/config.h, src/server/config.cpp
Refactored CliceConfig to nest ProjectConfig with compile_commands_paths (vector), cache_dir, index_dir, logging_dir, and optional indexing settings; added CompiledRule and ConfigRule for glob pattern rule matching; implemented match_rules() method; added load_from_json() for LSP initialization options; introduced resolve_xdg_cache_dir() for deterministic cache paths with workspace-root hashing; expanded apply_defaults to handle nested project fields and glob rule compilation.
Compiler and CDB Integration
src/server/compiler.cpp
Updated PCM/PCH/preamble cache paths from workspace.config.cache_dir to workspace.config.project.cache_dir; modified CDB lookup in fill_compile_args to apply match_rules() for the file being compiled and pass computed rule_append/rule_remove via CommandOptions to workspace.cdb.lookup(...).
Workspace and Server Integration
src/server/workspace.cpp, src/server/indexer.cpp, src/server/master_server.h, src/server/master_server.cpp
Updated all cache path accesses from config.cache_dir to config.project.cache_dir; changed indexing configuration references to use config.project fields; refactored compile commands discovery to iterate cfg.compile_commands_paths (supporting both file and directory entries) with auto-scan fallback; added init_options_json member to capture LSP initialization options; implemented config loading priority in Initialized notification (JSON from initializationOptions first, then workspace defaults); updated indexer.load/save and logging to use nested project paths.
Glob Pattern Const-Correctness
src/support/glob_pattern.h, src/support/glob_pattern.cpp
Qualified GlobPattern::match() as const to permit invocation on const instances.
Test Coverage
tests/unit/server/config_tests.cpp, tests/integration/compilation/test_persistent_cache.py
Added comprehensive unit tests for TOML/JSON configuration parsing, optional field handling, rule compilation, and match_rules() behavior; added integration test helper to pin cache directory to workspace via clice.toml.

Sequence Diagram

sequenceDiagram
    participant Client
    participant MasterServer
    participant ConfigLoader as Config System
    participant Workspace
    participant Compiler as Compiler (CDB)

    Client->>MasterServer: LSP initialize with<br/>initializationOptions
    MasterServer->>MasterServer: Store JSON in<br/>init_options_json
    Client->>MasterServer: LSP initialized notification
    
    MasterServer->>ConfigLoader: Attempt load_from_json<br/>(init_options_json)
    alt JSON config loaded
        ConfigLoader->>ConfigLoader: Parse JSON, apply defaults,<br/>compile glob rules
        ConfigLoader-->>MasterServer: CliceConfig with rules
    else JSON fails or absent
        MasterServer->>ConfigLoader: load_from_workspace<br/>(workspace_root)
        ConfigLoader-->>MasterServer: CliceConfig from TOML
    end
    
    MasterServer->>Workspace: Initialize with config
    MasterServer->>MasterServer: Clear init_options_json
    
    Client->>MasterServer: File opened/edited
    MasterServer->>Compiler: fill_compile_args(file_path)
    
    Compiler->>ConfigLoader: match_rules(file_path)
    ConfigLoader->>ConfigLoader: Check compiled glob<br/>patterns against path
    ConfigLoader-->>Compiler: append/remove flags
    
    Compiler->>Workspace: cdb.lookup(path,<br/>CommandOptions)
    Workspace-->>Compiler: Filtered CDB entries<br/>+ matched rules
    Compiler-->>MasterServer: Resolved compile args
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

Possibly related PRs

Poem

🐰 A tidy warren of rules, now nested and neat,
With glob patterns dancing to make configs complete,
From flat to the forest, each file finds its way,
Match rules bloom like carrots—hip-hop, hooray! 🥕✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 32.61% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main changes: configuration file handling improvements including rules support, CDB auto-scan, initializationOptions, XDG defaults, and partial config support.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch improve/config-handling

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/server/compiler.cpp (1)

48-53: ⚠️ Potential issue | 🟠 Major

Apply match_rules() in the lazy module dependency resolver too.

fill_compile_args now applies rule-based flags, but at Line 49-53 the compile-graph resolver still does workspace.cdb.lookup without append/remove. That can produce inconsistent module dependency scanning vs actual compilation.

Suggested fix
     auto resolve = [this](std::uint32_t path_id) -> llvm::SmallVector<std::uint32_t> {
         auto file_path = workspace.path_pool.resolve(path_id);
-        auto results =
-            workspace.cdb.lookup(file_path, {.query_toolchain = true, .suppress_logging = true});
+        std::vector<std::string> rule_append, rule_remove;
+        workspace.config.match_rules(file_path, rule_append, rule_remove);
+        auto results = workspace.cdb.lookup(file_path,
+                                            {.query_toolchain = true,
+                                             .suppress_logging = true,
+                                             .remove = rule_remove,
+                                             .append = rule_append});
         if(results.empty())
             return {};
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/server/compiler.cpp` around lines 48 - 53, The lazy module dependency
resolver (the resolve lambda) calls workspace.cdb.lookup without applying
rule-based flag adjustments, causing inconsistent scanning versus
fill_compile_args; modify resolve to call match_rules() (or otherwise apply
append/remove from the rule matcher) to the compile args returned by
workspace.cdb.lookup so the same rule-driven flags are appended/removed as in
fill_compile_args; locate the resolve lambda and ensure it uses the same
match_rules()/append/remove logic (or a helper used by fill_compile_args) before
returning the resolved compile args/results.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/server/config.cpp`:
- Around line 77-82: apply_defaults currently only performs variable
substitution via substitute_workspace for entries in p.compile_commands_paths
but doesn't anchor relative paths to the workspace root, so relative entries
like "build" resolve against the process CWD; after calling substitute_workspace
for each path in p.compile_commands_paths, check if the path is not absolute
(use std::filesystem::path::is_absolute) and if so replace it with
workspace_root / path (and normalize, e.g., lexically_normal) so
compile_commands_paths entries are anchored to workspace_root; update the loop
in apply_defaults that iterates p.compile_commands_paths to perform this
relative->absolute anchoring after substitution.

In `@src/server/config.h`:
- Around line 27-75: The new CliceConfig nests project settings but
deserialization still expects old flat keys, so add a migration shim in the
loading paths: update CliceConfig::load and CliceConfig::load_from_json to
detect legacy top-level keys (e.g., cache_dir, index_dir, logging_dir,
enable_indexing, idle_timeout_ms, clang_tidy, max_active_file,
stateful_worker_count, stateless_worker_count, worker_memory_limit,
compile_commands_paths) and move them into a ProjectConfig instance (or
translate them into a temporary JSON/TOML map) before populating
CliceConfig::project; ensure apply_defaults and match_rules continue to operate
on the migrated project and preserve backwards compatibility for
initializationOptions and clice.toml users.

In `@src/server/master_server.cpp`:
- Around line 101-121: The current auto-scan uses directory_iterator to pick the
first compile_commands.json and is non-deterministic; modify the scan inside the
directory iteration (where try_candidate, cdb_path, and workspace_root are used)
to collect all matching candidate paths into a vector, sort that vector (e.g.,
lexicographically), then choose a stable winner (e.g., the first element) or
log/warn if multiple matches exist before assigning cdb_path; ensure you still
short-circuit if a match was already found by the initial
try_candidate(workspace_root) check.

---

Outside diff comments:
In `@src/server/compiler.cpp`:
- Around line 48-53: The lazy module dependency resolver (the resolve lambda)
calls workspace.cdb.lookup without applying rule-based flag adjustments, causing
inconsistent scanning versus fill_compile_args; modify resolve to call
match_rules() (or otherwise apply append/remove from the rule matcher) to the
compile args returned by workspace.cdb.lookup so the same rule-driven flags are
appended/removed as in fill_compile_args; locate the resolve lambda and ensure
it uses the same match_rules()/append/remove logic (or a helper used by
fill_compile_args) before returning the resolved compile args/results.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: d276d414-56a8-44bc-814e-826a6b37ab73

📥 Commits

Reviewing files that changed from the base of the PR and between d42d9d5 and 5f8b358.

📒 Files selected for processing (11)
  • src/server/compiler.cpp
  • src/server/config.cpp
  • src/server/config.h
  • src/server/indexer.cpp
  • src/server/master_server.cpp
  • src/server/master_server.h
  • src/server/workspace.cpp
  • src/support/glob_pattern.cpp
  • src/support/glob_pattern.h
  • tests/integration/compilation/test_persistent_cache.py
  • tests/unit/server/config_tests.cpp

Comment thread src/server/config.cpp
Comment on lines +77 to +82
// Variable substitution on string fields.
substitute_workspace(p.cache_dir, workspace_root);
substitute_workspace(p.index_dir, workspace_root);
substitute_workspace(p.logging_dir, workspace_root);
for(auto& path: p.compile_commands_paths)
substitute_workspace(path, workspace_root);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Anchor relative compile_commands_paths to workspace_root.

apply_defaults() only substitutes ${workspace} here. Later, src/server/master_server.cpp Lines 85-97 probes these entries as-is, so a config value like "build" is resolved against the server process CWD instead of the workspace root. That makes configured CDB discovery fail in common setups.

[suggested fix]

Patch
-    for(auto& path: p.compile_commands_paths)
-        substitute_workspace(path, workspace_root);
+    for(auto& cdb_path: p.compile_commands_paths) {
+        substitute_workspace(cdb_path, workspace_root);
+        if(!workspace_root.empty() && !llvm::sys::path::is_absolute(cdb_path))
+            cdb_path = path::join(workspace_root, cdb_path);
+    }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/server/config.cpp` around lines 77 - 82, apply_defaults currently only
performs variable substitution via substitute_workspace for entries in
p.compile_commands_paths but doesn't anchor relative paths to the workspace
root, so relative entries like "build" resolve against the process CWD; after
calling substitute_workspace for each path in p.compile_commands_paths, check if
the path is not absolute (use std::filesystem::path::is_absolute) and if so
replace it with workspace_root / path (and normalize, e.g., lexically_normal) so
compile_commands_paths entries are anchored to workspace_root; update the loop
in apply_defaults that iterates p.compile_commands_paths to perform this
relative->absolute anchoring after substitution.

Comment thread src/server/config.h
Comment on lines +27 to +75
/// Corresponds to the `[project]` section in clice.toml.
struct ProjectConfig {
defaulted<bool> clang_tidy = {};
defaulted<int> max_active_file = {};

defaulted<std::string> cache_dir;
defaulted<std::string> index_dir;
defaulted<std::string> logging_dir;

// Compilation database path (empty = auto-detect)
std::string compile_commands_path;
defaulted<std::vector<std::string>> compile_commands_paths;

// Cache directory (empty = default: <workspace>/.clice/)
std::string cache_dir;
std::optional<bool> enable_indexing;
std::optional<int> idle_timeout_ms;

// Index storage directory (default: <cache_dir>/index/)
std::string index_dir;
defaulted<std::uint32_t> stateful_worker_count = {};
defaulted<std::uint32_t> stateless_worker_count = {};
defaulted<std::uint64_t> worker_memory_limit = {};
};

struct CompiledRule {
std::vector<GlobPattern> patterns;
std::vector<std::string> append;
std::vector<std::string> remove;
};

// Logging directory (default: <cache_dir>/logs/)
std::string logging_dir;
/// Configuration for the clice LSP server, loadable from clice.toml
/// or passed via LSP initializationOptions.
struct CliceConfig {
defaulted<ProjectConfig> project;

// Background indexing
bool enable_indexing = true;
int idle_timeout_ms = 3000;
defaulted<std::vector<ConfigRule>> rules;

annotation<std::vector<CompiledRule>, serde_schema::skip> compiled_rules;

/// Compute default values for any field left at its zero/empty sentinel.
void apply_defaults(const std::string& workspace_root);

/// Collect append/remove flags from all rules whose patterns match `path`.
void match_rules(llvm::StringRef path,
std::vector<std::string>& append,
std::vector<std::string>& remove) const;

/// Try to load configuration from a TOML file.
/// Performs ${workspace} variable substitution in string fields.
/// Returns std::nullopt if the file does not exist or cannot be parsed.
static std::optional<CliceConfig> load(const std::string& path,
const std::string& workspace_root);

/// Try to load configuration from a JSON string (e.g. initializationOptions).
static std::optional<CliceConfig> load_from_json(llvm::StringRef json,
const std::string& workspace_root);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Preserve the old flat config keys or migrate them explicitly.

CliceConfig now only exposes project and rules, while loading still deserializes straight into this type. That means existing clice.toml / initializationOptions payloads that still set cache_dir, enable_indexing, stateful_worker_count, etc. at the top level will stop taking effect after this upgrade. Please add compatibility aliases or a migration shim before landing.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/server/config.h` around lines 27 - 75, The new CliceConfig nests project
settings but deserialization still expects old flat keys, so add a migration
shim in the loading paths: update CliceConfig::load and
CliceConfig::load_from_json to detect legacy top-level keys (e.g., cache_dir,
index_dir, logging_dir, enable_indexing, idle_timeout_ms, clang_tidy,
max_active_file, stateful_worker_count, stateless_worker_count,
worker_memory_limit, compile_commands_paths) and move them into a ProjectConfig
instance (or translate them into a temporary JSON/TOML map) before populating
CliceConfig::project; ensure apply_defaults and match_rules continue to operate
on the migrated project and preserve backwards compatibility for
initializationOptions and clice.toml users.

Comment on lines +101 to 121
// Auto-scan: workspace root + all immediate subdirectories.
if(cdb_path.empty()) {
for(auto* subdir: {"build", "cmake-build-debug", "cmake-build-release", "out", "."}) {
auto candidate = path::join(workspace_root, subdir, "compile_commands.json");
auto try_candidate = [&](llvm::StringRef dir) -> bool {
auto candidate = path::join(dir, "compile_commands.json");
if(llvm::sys::fs::exists(candidate)) {
cdb_path = std::move(candidate);
break;
return true;
}
return false;
};

if(!try_candidate(workspace_root)) {
std::error_code ec;
for(llvm::sys::fs::directory_iterator it(workspace_root, ec), end; it != end && !ec;
it.increment(ec)) {
if(it->type() == llvm::sys::fs::file_type::directory_file) {
if(try_candidate(it->path()))
break;
}
}
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Make CDB auto-scan deterministic when multiple build dirs match.

This loop picks the first compile_commands.json returned by directory_iterator, and that order is filesystem-dependent. In a workspace with both build-debug/compile_commands.json and build-release/compile_commands.json, the loaded database can flip across runs or machines. Please collect the matches, sort them, and either pick a stable winner or warn on ambiguity.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/server/master_server.cpp` around lines 101 - 121, The current auto-scan
uses directory_iterator to pick the first compile_commands.json and is
non-deterministic; modify the scan inside the directory iteration (where
try_candidate, cdb_path, and workspace_root are used) to collect all matching
candidate paths into a vector, sort that vector (e.g., lexicographically), then
choose a stable winner (e.g., the first element) or log/warn if multiple matches
exist before assigning cdb_path; ensure you still short-circuit if a match was
already found by the initial try_candidate(workspace_root) check.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant