Skip to content

[rqd] Add windows support#2165

Merged
DiegoTavares merged 27 commits intoAcademySoftwareFoundation:masterfrom
DiegoTavares:rust_rqd_windows
Apr 2, 2026
Merged

[rqd] Add windows support#2165
DiegoTavares merged 27 commits intoAcademySoftwareFoundation:masterfrom
DiegoTavares:rust_rqd_windows

Conversation

@DiegoTavares
Copy link
Copy Markdown
Collaborator

@DiegoTavares DiegoTavares commented Jan 28, 2026

First take at implementing windows support for rqd. The goal was to have feature parity, but I
wasn't able to find a Windows tool to emulate taskset and allocate processes to threads. So this
stands as the only feature not available for windows.

Summary by CodeRabbit

  • New Features

    • Official native Windows runtime and system support added (Windows-specific system manager, process handling, and command wrappers).
  • CI/CD

    • CI now builds, tests, and publishes Windows executables alongside Linux/macOS; Windows-specific install and copy steps included.
  • Tests

    • Integration tests expanded and hardened for cross-platform behavior with improved readiness checks and Windows-aware helpers.
  • Documentation

    • Platform docs updated to list Windows as supported.
  • Refactor

    • Cross-platform defaults and dependency declarations adjusted for proper Unix/Windows behavior.

First take at implementing windows support for rqd. The goal was to have feature parity, but I
wasn't able to find a Windows tool to emulate taskset and allocate processes to threads. So this
stands as the only feature not available for windows.
@DiegoTavares
Copy link
Copy Markdown
Collaborator Author

Caveat: I currently don't have a windows environment to test this on. I'm relying on the Rust unit tests on github as my only quality assurance.

I need a volunteer to test this on a real environment and report bugs.

…for interface binding

Provide Windows-specific get_ip_for_interface implementation that
returns an error if binding by interface name is requested, since
this feature is unsupported on Windows. Add corresponding tests.
…ion tests

- Adjust test code to handle Windows binary naming and commands
- Load machine resource files conditionally for Unix-based tests
- Add Windows PowerShell memory_fork.cmd test script
- Mark frame execution and related tests as Unix-only where appropriate
- Add comment on run_as_user config option availability on Unix systems only
Use a function to wait for openrqd to open its gRPC port before
proceeding, ensuring tests do not start too early and improving
reliability and speed of the integration tests.
…indows

- Refactor integration tests to normalize paths for Windows compatibility
- Correct Rust RQD docs to reflect Windows support status as available
Use ephemeral free ports for gRPC servers instead of fixed ports to avoid conflicts.
Serialize integration tests with a global mutex to prevent parallel execution issues.
Replace fixed hostnames with "127.0.0.1" for consistency.
@DiegoTavares
Copy link
Copy Markdown
Collaborator Author

@lithorus or @KernAttila do you guys have a Windows machine on hand to give the rust/rqd a spin and report back if you find any errors?

@lithorus
Copy link
Copy Markdown
Collaborator

@lithorus or @KernAttila do you guys have a Windows machine on hand to give the rust/rqd a spin and report back if you find any errors?

I'm affraid I don't have one.

@KernAttila
Copy link
Copy Markdown
Contributor

@DiegoTavares I'm testing on Windows right now.

@DiegoTavares
Copy link
Copy Markdown
Collaborator Author

@KernAttila Any updates on the Windows tests?

Also filter out signal dependend functions not really available on Windows.

@author: Kern Attila GERMAIN <5556461+KernAttila@users.noreply.github.com>
Signed-off-by: Diego Tavares <dtavares@imageworks.com>
@DiegoTavares DiegoTavares marked this pull request as ready for review March 24, 2026 22:07
@DiegoTavares
Copy link
Copy Markdown
Collaborator Author

@KernAttila Your commits have been added. Thanks for the contribution.

Comment thread rust/crates/rqd/src/frame/logging.rs
Comment thread rust/crates/scheduler/resources/cleanup_test_data.sql Outdated
Adds x86_64-pc-windows-msvc as a build target to both the packaging and release CI pipelines, with appropriate Windows-specific dependency installation, binary copying, and release asset upload steps
@ramonfigueiredo
Copy link
Copy Markdown
Collaborator

HI @DiegoTavares , thanks for your contributions!

I am reviewing this!

We need a Windows machine to test these new changes!

@DiegoTavares
Copy link
Copy Markdown
Collaborator Author

HI @DiegoTavares , thanks for your contributions!

I am reviewing this!

We need a Windows machine to test these new changes!

We won't get a windows machine to test this on time for the next release. Kern worked on testing this and suggested some changes with his findings that have been applied.

This PR contains a windows version of the unit tests for rqd and they have been passing successfully.

With both the unit tests and Kerns tests I think we're good to go for an initial release, adding a caveat to the release notes that the windows variant hasn't been tested on production.

Copy link
Copy Markdown
Collaborator

@ramonfigueiredo ramonfigueiredo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @DiegoTavares

Thanks for your excellent work!

Approved with some recommended changes.

Comment thread rust/crates/rqd/src/system/windows.rs Outdated
Comment thread rust/crates/rqd/src/frame/frame_cmd.rs
Comment thread rust/crates/rqd/src/system/windows.rs
Comment thread rust/crates/rqd/src/system/windows.rs Outdated
Comment thread rust/crates/rqd/src/frame/running_frame.rs
Comment thread rust/crates/rqd/tests/rqd_integration_tests.rs Outdated
Comment thread .github/workflows/rust-pipeline.yml
Comment thread rust/crates/rqd/tests/rqd_integration_tests.rs Outdated
Comment thread rust/crates/rqd/src/system/windows.rs
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Apr 2, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds comprehensive Windows support across CI/release workflows, build/test jobs, a Windows System backend, Windows-aware frame execution and logging, OS-conditional dependencies, and cross-platform test and integration adjustments.

Changes

Cohort / File(s) Summary
CI / Releases
.github/workflows/packaging-pipeline.yml, .github/workflows/release-pipeline.yml, .github/workflows/rust-pipeline.yml
Add Windows build matrix/job target (x86_64-pc-windows-msvc on Windows runners), Windows-specific protoc install via Chocolatey, shell overrides, Windows binary copy/upload steps, and new windows-clippy/windows-tests jobs.
System backend & platform modules
rust/crates/rqd/src/system/mod.rs, rust/crates/rqd/src/system/windows.rs, rust/crates/rqd/src/system/machine.rs
Add windows module and WindowsSystem implementation; integrate Windows init into MachineMonitor; adjust linux/macos cfg gating and imports.
Frame execution & logging
rust/crates/rqd/src/frame/frame_cmd.rs, rust/crates/rqd/src/frame/running_frame.rs, rust/crates/rqd/src/frame/logging.rs
Implement Windows FrameCmdBuilder producing .bat entrypoints; OS-specific run/wait/recovery/exit-code interpretation; add Windows raw-file stdout/stderr handling; adjust ownership/hostname logic.
Cargo / build helpers
rust/crates/opencue-proto/build.rs, rust/crates/dummy-cuebot/Cargo.toml, rust/crates/rqd/Cargo.toml
Move users/nix into Unix-only deps, move pnet off Windows, and make protobuf protos_dir resolution robust on Windows symlink placeholders.
Runtime & config tweaks
rust/crates/rqd/src/config/mod.rs, rust/config/rqd.yaml, docs/_docs/reference/rust-rqd.md
Make default shell and snapshot path selection OS-aware; annotate run_as_user as Unix-only in YAML comments; update docs to list Windows as supported.
Servant / interface helpers
rust/crates/rqd/src/servant/mod.rs, rust/crates/dummy-cuebot/src/main.rs
OS-conditional network interface binding: non-Windows retains pnet lookup, Windows returns 0.0.0.0 or errors for named-interface binding; add current_uid_if_requested with Unix/no-op variants.
Tests & test scripts
rust/crates/rqd/tests/rqd_integration_tests.rs, rust/crates/rqd/resources/test_scripts/memory_fork.cmd
Refactor integration tests for cross-platform execution (dynamic ports, path/arg normalization, readiness checks, test serialization); add Windows memory_fork.cmd test script.
Scheduler & platform guards
rust/crates/scheduler/src/main.rs, rust/crates/rqd/src/system/macos.rs
Gate Unix signal handling behind #[cfg(unix)]; remove monitored_sessions field from macOS system implementation.
Miscellaneous
docs/_docs/..., .../Cargo.toml, rust/crates/opencue-proto/...
Small comment/newline edits and cross-platform compatibility tweaks across build scripts and docs.

Sequence Diagram

sequenceDiagram
    participant User
    participant RunningFrame as RunningFrame
    participant Builder as FrameCmdBuilder
    participant Entrypoint as Entrypoint\n(.bat/.sh)
    participant Process as Child\nProcess
    participant Logger as Logger

    User->>RunningFrame: run_inner(logger)
    RunningFrame->>Builder: new(shell, entrypoint_path)
    Builder->>Builder: prepare command string / entrypoint
    Builder->>Entrypoint: write generated script (.bat / .sh)
    Builder-->>RunningFrame: return Command + script_path

    RunningFrame->>RunningFrame: setup_raw_file/stdout_stderr
    RunningFrame->>Process: spawn(command)
    RunningFrame->>Logger: start log piping
    Process-->>RunningFrame: exit_status
    RunningFrame->>RunningFrame: interprete_output(exit_status)
    RunningFrame->>Logger: log completion
    RunningFrame-->>User: return (exit_code, exit_signal)
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

"🐰
I hopped through CI and patched the seams,
Wrote .bat and .sh and Windows dreams,
Sessions, processes, tests aligned with care,
Cross-platform hops now run everywhere.
🥕"

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title '[rqd] Add windows support' directly describes the main objective of the pull request—implementing Windows support for the rqd component. It is specific, concise, and clearly summarizes the primary change across the extensive changeset.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 10

🧹 Nitpick comments (3)
.github/workflows/release-pipeline.yml (1)

133-135: Consider setting PROTOC environment variable for consistency.

Same observation as in packaging-pipeline.yml - the protoc installation doesn't explicitly set the PROTOC environment variable, unlike rust-pipeline.yml.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/release-pipeline.yml around lines 133 - 135, The Windows
installation step "Install dependencies (Windows)" installs protoc via choco but
doesn't set the PROTOC environment variable; update that step to export/set
PROTOC to the installed protoc binary path (e.g., the choco install location) so
downstream steps can reference $PROTOC (or %PROTOC%) consistently; change the
"Install dependencies (Windows)" step that currently runs "choco install protoc
-y" to also determine the installed protoc path and set the PROTOC environment
variable (mirror how PROTOC is set in rust-pipeline.yml) so other steps use the
canonical PROTOC symbol.
.github/workflows/packaging-pipeline.yml (1)

96-98: Consider setting PROTOC environment variable for consistency.

The rust-pipeline.yml explicitly sets PROTOC=C:\ProgramData\chocolatey\bin\protoc.exe after installing protoc. This step only installs protoc but doesn't set the environment variable. While it may work if protoc is in PATH, explicitly setting PROTOC would ensure consistency across pipelines.

       - name: Install dependencies (Windows)
         if: matrix.os == 'windows-2022'
         run: choco install protoc -y
+
+      - name: Export PROTOC path (Windows)
+        if: matrix.os == 'windows-2022'
+        run: echo "PROTOC=C:\ProgramData\chocolatey\bin\protoc.exe" >> $env:GITHUB_ENV
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/packaging-pipeline.yml around lines 96 - 98, The Windows
install step "Install dependencies (Windows)" only runs choco install protoc;
update that step to also set the PROTOC environment variable explicitly to the
installed binary (e.g., C:\ProgramData\chocolatey\bin\protoc.exe) so downstream
steps see a consistent value. In the "Install dependencies (Windows)" job/step
(the step name string), either add an env entry PROTOC:
'C:\ProgramData\chocolatey\bin\protoc.exe' or append the path to GITHUB_ENV
(echoing PROTOC into the environment) immediately after installing protoc.
rust/crates/rqd/tests/rqd_integration_tests.rs (1)

250-265: Free-port probing still leaves a bind race.

These helpers release the socket before dummy-cuebot or openrqd bind to it, so another process can reclaim the port and make the suite flaky on busy runners. If binding to 0 is not possible for the test binaries, consider retrying the spawn/allocation path when startup fails with an address-in-use error.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rust/crates/rqd/tests/rqd_integration_tests.rs` around lines 250 - 265, The
current get_free_port/get_two_free_ports helpers create a race by releasing
ephemeral ports before test binaries bind; instead implement a robust
retry-on-bind-failure strategy around the process spawn/allocation path (where
dummy-cuebot or openrqd are started) rather than pre-probing ports: remove or
stop relying on get_free_port/get_two_free_ports and change the test startup
code that launches these processes to retry spawning (with backoff and a small
max attempts) when the child fails to start with an address-in-use error
(EADDRINUSE), e.g., wrap the spawn logic used for dummy-cuebot/openrqd in a loop
that catches the bind error and retries. Ensure the retry loop logs attempts and
eventually fails the test if the maximum retries are exhausted.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/_docs/reference/rust-rqd.md`:
- Around line 324-328: Update the "Multi-platform support: Linux, MacOS and
Windows" bullet in the "Core Functionality" section to indicate Windows is
initial/experimental (e.g., "Multi-platform support: Linux, macOS and
initial/experimental Windows support") and add a brief caveats sub-bullet or
sentence referencing the current limitations: missing CPU-affinity binding and
lack of production validation on real Windows hosts, and that the remaining docs
(artifacts and service setup) currently only cover Linux/macOS; ensure the
change targets the "Core Functionality" heading and the existing bullet text so
readers see the softened claim and explicit caveats.

In `@rust/crates/opencue-proto/build.rs`:
- Around line 20-32: The build script resolves the actual protos directory into
the variable protos_dir (the PathBuf computed in the block assigning protos_dir)
but only emits cargo:rerun-if-changed for the original literal path; add an
additional cargo:rerun-if-changed directive for the resolved protos_dir
(converted to a string) after it is computed so Cargo watches the real directory
for additions/deletions (ensure you emit this before returning/using protos_dir
and use the same string form you would pass to
println!("cargo:rerun-if-changed={}", ...)).

In `@rust/crates/rqd/resources/test_scripts/memory_fork.cmd`:
- Around line 1-15: The batch script memory_fork.cmd uses caret continuations
and must be committed with CRLF line endings to avoid brittle behavior on
Windows; update the file to use CRLF endings (ensure your editor/git config
converts this file to CRLF) or add a .gitattributes entry (e.g., matching *.cmd)
that enforces text eol=crlf so memory_fork.cmd is always normalized on commit;
verify the file now contains CRLF line endings and re-run CI to confirm the
caret continuations work on Windows.

In `@rust/crates/rqd/src/frame/frame_cmd.rs`:
- Around line 163-188: The Windows build() implementation in frame_cmd.rs
ignores the configured shell and always replaces the process with
Command::new(&self.entrypoint_file_path); update build() (or related caller) to
honor self.shell: if self.shell is Some(shell_path) then create the Command
using that interpreter (e.g., Command::new(shell_path) with appropriate args to
run the generated batch/script file) and preserve end_cmd/script generation,
otherwise if the configured shell is present but not cmd.exe reject with an Err
explaining unsupported shell on Windows; ensure you set self.cmd to the new
Command and return (&mut self.cmd, script.clone()) so
runner.shell_path/self.shell is actually used.
- Around line 259-263: The Windows stub of FrameCmd::with_exit_file drops the
provided path which prevents recoveries from persisting exit codes; change the
Windows implementation in frame_cmd.rs so with_exit_file(&mut self,
exit_file_path: String) sets self.exit_file_path = Some(exit_file_path) and
returns self (matching the non-Windows implementation) so calls from
running_frame.rs (with_exit_file(...) and subsequent read_exit_file()) can find
the exit file during recovery; keep the existing #[cfg(target_os = "windows")]
but ensure the field name and types match the struct's exit_file_path.

In `@rust/crates/rqd/src/frame/running_frame.rs`:
- Around line 699-723: The Windows branch of run_inner currently ignores
self.config.run_as_user and will silently run frames as the rqd account; add a
guard in run_inner that checks self.config.run_as_user on Windows (cfg!(windows)
or target-specific branch) and returns an error (fail fast) if run_as_user is
enabled, so we don't proceed to build/spawn the command; update the error
message to reference the frame ID or entrypoint (use self.request or
self.entrypoint_file_path) and ensure the check is performed before building or
spawning the process (place it near the start of run_inner, before
FrameCmdBuilder::build() / cmd.spawn()).

In `@rust/crates/rqd/src/system/windows.rs`:
- Around line 456-458: kill_session_internal currently walks session_processes
(via collect_lineage) without refreshing the cached process list, so newly
spawned children can be missed; fix this by calling refresh_procs() (or the
method that repopulates session_processes) before collecting the
lineage/iterating processes in kill_session_internal (ensure any error from
refresh_procs() is handled or propagated), referencing kill_session_internal,
collect_lineage, and session_processes to locate where to insert the refresh.
- Around line 533-541: The function create_user_if_unexisting currently logs a
warning but returns Ok(uid) when the requested username differs from the current
Windows user; change it to return an Err instead so callers (e.g., the frame
launch path in frame/manager.rs around the code that checks this result) fail
fast. Specifically, in create_user_if_unexisting(&self, username: &str, uid:
u32, _gid: u32) -> Result<u32>, detect when current_user != username and return
a suitable error (with a clear message like "user switching not supported on
Windows: requested '<username>'") instead of Ok(uid); update any error
construction to use the crate's existing error type so the caller's existing
Result handling works. Ensure no other code treats the previous Ok(uid) case as
success without handling the new Err.
- Around line 629-635: The reboot function currently discards the ExitStatus and
treats any completed shutdown invocation as success; update the
Command::new("shutdown").args(["/r", "/t", "0"]).status() handling in reboot to
check the returned ExitStatus via status.success() and return an Err (using
into_diagnostic()/wrap_err or similar) when it is false so non-zero exit codes
(e.g., access denied) are propagated as failures instead of being mapped to
Ok(()). Ensure you still convert command errors to diagnostics and keep the
existing wrap_err message context ("Failed to reboot").
- Around line 221-228: The returned value currently uses SocketAddr::to_string()
which includes the port; update the branch guarded by use_ip_as_hostname to
return only the IP by replacing the final Ok(addr.to_string()) with
Ok(addr.ip().to_string()); keep the existing resolution logic (format!(...):443,
to_socket_addrs, addrs_iter, addr) and error handling unchanged so downstream
callers (e.g., jobhost/Docker hostname) receive a plain IP string.

---

Nitpick comments:
In @.github/workflows/packaging-pipeline.yml:
- Around line 96-98: The Windows install step "Install dependencies (Windows)"
only runs choco install protoc; update that step to also set the PROTOC
environment variable explicitly to the installed binary (e.g.,
C:\ProgramData\chocolatey\bin\protoc.exe) so downstream steps see a consistent
value. In the "Install dependencies (Windows)" job/step (the step name string),
either add an env entry PROTOC: 'C:\ProgramData\chocolatey\bin\protoc.exe' or
append the path to GITHUB_ENV (echoing PROTOC into the environment) immediately
after installing protoc.

In @.github/workflows/release-pipeline.yml:
- Around line 133-135: The Windows installation step "Install dependencies
(Windows)" installs protoc via choco but doesn't set the PROTOC environment
variable; update that step to export/set PROTOC to the installed protoc binary
path (e.g., the choco install location) so downstream steps can reference
$PROTOC (or %PROTOC%) consistently; change the "Install dependencies (Windows)"
step that currently runs "choco install protoc -y" to also determine the
installed protoc path and set the PROTOC environment variable (mirror how PROTOC
is set in rust-pipeline.yml) so other steps use the canonical PROTOC symbol.

In `@rust/crates/rqd/tests/rqd_integration_tests.rs`:
- Around line 250-265: The current get_free_port/get_two_free_ports helpers
create a race by releasing ephemeral ports before test binaries bind; instead
implement a robust retry-on-bind-failure strategy around the process
spawn/allocation path (where dummy-cuebot or openrqd are started) rather than
pre-probing ports: remove or stop relying on get_free_port/get_two_free_ports
and change the test startup code that launches these processes to retry spawning
(with backoff and a small max attempts) when the child fails to start with an
address-in-use error (EADDRINUSE), e.g., wrap the spawn logic used for
dummy-cuebot/openrqd in a loop that catches the bind error and retries. Ensure
the retry loop logs attempts and eventually fails the test if the maximum
retries are exhausted.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: e56f0811-4147-4f9f-a33a-3d81221712fc

📥 Commits

Reviewing files that changed from the base of the PR and between 0d9b7cb and 03c1bc5.

📒 Files selected for processing (20)
  • .github/workflows/packaging-pipeline.yml
  • .github/workflows/release-pipeline.yml
  • .github/workflows/rust-pipeline.yml
  • docs/_docs/reference/rust-rqd.md
  • rust/config/rqd.yaml
  • rust/crates/dummy-cuebot/Cargo.toml
  • rust/crates/dummy-cuebot/src/main.rs
  • rust/crates/opencue-proto/build.rs
  • rust/crates/rqd/Cargo.toml
  • rust/crates/rqd/resources/test_scripts/memory_fork.cmd
  • rust/crates/rqd/src/config/mod.rs
  • rust/crates/rqd/src/frame/frame_cmd.rs
  • rust/crates/rqd/src/frame/logging.rs
  • rust/crates/rqd/src/frame/running_frame.rs
  • rust/crates/rqd/src/servant/mod.rs
  • rust/crates/rqd/src/system/machine.rs
  • rust/crates/rqd/src/system/mod.rs
  • rust/crates/rqd/src/system/windows.rs
  • rust/crates/rqd/tests/rqd_integration_tests.rs
  • rust/crates/scheduler/src/main.rs

Comment thread docs/_docs/reference/rust-rqd.md
Comment thread rust/crates/opencue-proto/build.rs
Comment thread rust/crates/rqd/resources/test_scripts/memory_fork.cmd
Comment thread rust/crates/rqd/src/frame/frame_cmd.rs
Comment thread rust/crates/rqd/src/frame/frame_cmd.rs
Comment thread rust/crates/rqd/src/frame/running_frame.rs Outdated
Comment thread rust/crates/rqd/src/system/windows.rs Outdated
Comment thread rust/crates/rqd/src/system/windows.rs
Comment thread rust/crates/rqd/src/system/windows.rs
Comment thread rust/crates/rqd/src/system/windows.rs
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (4)
rust/crates/rqd/src/system/windows.rs (4)

227-234: ⚠️ Potential issue | 🟠 Major

Use addr.ip().to_string() to return only the IP address.

When use_ip_as_hostname is true, SocketAddr::to_string() returns the socket address including the port (e.g., 192.168.1.1:443). Downstream consumers expect a plain hostname/IP without a port.

🐛 Proposed fix
         let addr = addrs_iter
             .next()
             .ok_or_else(|| miette::miette!("Failed to find IP for {}", hostname))?;
-        Ok(addr.to_string())
+        Ok(addr.ip().to_string())
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rust/crates/rqd/src/system/windows.rs` around lines 227 - 234, When
use_ip_as_hostname is true in the Windows host-resolution block, the code
currently returns addr.to_string() (which includes the port); change it to
return only the IP by calling addr.ip().to_string() instead. Locate the
variables addrs_iter and addr in the block that constructs format!("{}:443",
hostname) and replace the final Ok(addr.to_string()) with
Ok(addr.ip().to_string()) so downstream consumers receive a plain IP/hostname
without the port.

460-465: ⚠️ Potential issue | 🟠 Major

Refresh the process cache before collecting lineage.

kill_session_internal walks session_processes via collect_lineage without refreshing the cache first. Children spawned since the last refresh_procs() call will be invisible and survive the kill.

🐛 Proposed fix
 fn kill_session_internal(&self, session_pid: u32) -> Result<()> {
+    self.refresh_procs_cache();
     let lineage = self.collect_lineage(session_pid);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rust/crates/rqd/src/system/windows.rs` around lines 460 - 465,
kill_session_internal calls collect_lineage on session_pid using the cached
session_processes but never refreshes the cache first, so newly spawned children
can be missed; before calling collect_lineage in kill_session_internal, call
refresh_procs() (or the appropriate cache-refresh method) and handle any errors
it returns, then proceed to collect_lineage(session_pid) and the rest of the
kill logic so the lineage reflects the latest process tree.

537-546: ⚠️ Potential issue | 🟠 Major

Fail fast when user switching is requested but unsupported.

When username differs from the current Windows user, this logs a warning but still returns Ok(uid). The frame manager treats this as success and launches the frame under the wrong account. Additionally, Windows usernames are case-insensitive, so the comparison should use eq_ignore_ascii_case.

🐛 Proposed fix
 fn create_user_if_unexisting(&self, username: &str, uid: u32, _gid: u32) -> Result<u32> {
     let current_user = std::env::var("USERNAME").unwrap_or_default();
-    if !current_user.is_empty() && current_user != username {
-        warn!(
-            "User creation is not supported on Windows. Requested user '{}' will be ignored.",
-            username
-        );
+    if !current_user.is_empty() && !current_user.eq_ignore_ascii_case(username) {
+        return Err(miette!(
+            "User switching is not supported on Windows. Requested user '{}' differs from current user '{}'.",
+            username,
+            current_user
+        ));
     }
     Ok(uid)
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rust/crates/rqd/src/system/windows.rs` around lines 537 - 546, The
create_user_if_unexisting function currently warns but returns Ok(uid) when
username differs, causing the process to run under the wrong account; change the
logic to compare usernames using case-insensitive comparison
(username.eq_ignore_ascii_case(&current_user)) and if they differ return an
Err(Result<u32> error) indicating user switching is unsupported on Windows
(instead of Ok(uid)); also replace or accompany the warn! with an error-level
message that includes both the requested username and the current user so
callers can fail fast.

633-640: ⚠️ Potential issue | 🟠 Major

Check the shutdown command exit status.

The current code discards the ExitStatus, treating all completed invocations as success—including access-denied or policy failures. Use status.success() to distinguish successful from failed shutdown attempts.

🐛 Proposed fix
 fn reboot(&self) -> Result<()> {
-    Command::new("shutdown")
+    let status = Command::new("shutdown")
         .args(["/r", "/t", "0"])
         .status()
-        .map(|_| ())
         .into_diagnostic()
-        .wrap_err("Failed to reboot")
+        .wrap_err("Failed to execute shutdown command")?;
+
+    if status.success() {
+        Ok(())
+    } else {
+        Err(miette!("shutdown command failed with exit code: {}", status))
+    }
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rust/crates/rqd/src/system/windows.rs` around lines 633 - 640, The reboot
function currently ignores the Command::status ExitStatus; update the reboot
method (function reboot in windows.rs) to call
Command::new("shutdown").args(["/r", "/t", "0"]).status(), check the returned
ExitStatus with .success(), and return Ok(()) only when success() is true;
otherwise convert the failure into an Err with a diagnostic (e.g., include the
ExitStatus or its code in the error) so access-denied or policy failures are
surfaced instead of treated as success.
🧹 Nitpick comments (1)
rust/crates/rqd/src/system/windows.rs (1)

23-23: Remove unused DashSet import.

DashSet is imported but never used in this file (it was likely left behind after monitored_sessions was removed). This will trigger a compiler warning.

🔧 Suggested fix
-use dashmap::{DashMap, DashSet};
+use dashmap::DashMap;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rust/crates/rqd/src/system/windows.rs` at line 23, The file imports DashSet
but never uses it; remove the unused symbol from the import by changing the use
statement that currently references DashMap and DashSet so it only imports
DashMap (i.e., update the line containing use dashmap::{DashMap, DashSet}; to
drop DashSet), ensuring no other references to DashSet (e.g.,
monitored_sessions) remain in functions or types like in windows.rs.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@rust/crates/rqd/src/system/windows.rs`:
- Around line 560-562: Fix the typo in the debug log string in windows.rs:
change "Collect frame stats fo {}. rss: {}kb virtual: {}kb gpu: {}kb" to use
"for" instead of "fo" in the debug! call that references pid and session_data
(rss, virtual_memory, gpu_memory) so the log reads "Collect frame stats for
...".
- Around line 438-458: The reduce fallback sets the min start to u64::MAX when
all lineage entries were filtered out, producing a bogus start_time; after the
.reduce(...).unwrap_or((...)) step inspect the resulting min-start value (the
fourth tuple element) and if it equals u64::MAX treat it as “no live processes”
and return None instead of constructing a SessionData; update the caller/return
type accordingly (where SessionData is built) so collect_lineage/this function
returns Option<SessionData> when no live children remain and ensure any code
using start_time handles the Option.

---

Duplicate comments:
In `@rust/crates/rqd/src/system/windows.rs`:
- Around line 227-234: When use_ip_as_hostname is true in the Windows
host-resolution block, the code currently returns addr.to_string() (which
includes the port); change it to return only the IP by calling
addr.ip().to_string() instead. Locate the variables addrs_iter and addr in the
block that constructs format!("{}:443", hostname) and replace the final
Ok(addr.to_string()) with Ok(addr.ip().to_string()) so downstream consumers
receive a plain IP/hostname without the port.
- Around line 460-465: kill_session_internal calls collect_lineage on
session_pid using the cached session_processes but never refreshes the cache
first, so newly spawned children can be missed; before calling collect_lineage
in kill_session_internal, call refresh_procs() (or the appropriate cache-refresh
method) and handle any errors it returns, then proceed to
collect_lineage(session_pid) and the rest of the kill logic so the lineage
reflects the latest process tree.
- Around line 537-546: The create_user_if_unexisting function currently warns
but returns Ok(uid) when username differs, causing the process to run under the
wrong account; change the logic to compare usernames using case-insensitive
comparison (username.eq_ignore_ascii_case(&current_user)) and if they differ
return an Err(Result<u32> error) indicating user switching is unsupported on
Windows (instead of Ok(uid)); also replace or accompany the warn! with an
error-level message that includes both the requested username and the current
user so callers can fail fast.
- Around line 633-640: The reboot function currently ignores the Command::status
ExitStatus; update the reboot method (function reboot in windows.rs) to call
Command::new("shutdown").args(["/r", "/t", "0"]).status(), check the returned
ExitStatus with .success(), and return Ok(()) only when success() is true;
otherwise convert the failure into an Err with a diagnostic (e.g., include the
ExitStatus or its code in the error) so access-denied or policy failures are
surfaced instead of treated as success.

---

Nitpick comments:
In `@rust/crates/rqd/src/system/windows.rs`:
- Line 23: The file imports DashSet but never uses it; remove the unused symbol
from the import by changing the use statement that currently references DashMap
and DashSet so it only imports DashMap (i.e., update the line containing use
dashmap::{DashMap, DashSet}; to drop DashSet), ensuring no other references to
DashSet (e.g., monitored_sessions) remain in functions or types like in
windows.rs.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 4bbe139e-84cb-4e73-bdfd-6ac62df89bae

📥 Commits

Reviewing files that changed from the base of the PR and between 03c1bc5 and 515cffa.

📒 Files selected for processing (4)
  • .github/workflows/rust-pipeline.yml
  • rust/crates/rqd/src/system/macos.rs
  • rust/crates/rqd/src/system/windows.rs
  • rust/crates/rqd/tests/rqd_integration_tests.rs
💤 Files with no reviewable changes (1)
  • rust/crates/rqd/src/system/macos.rs
🚧 Files skipped from review as they are similar to previous changes (2)
  • rust/crates/rqd/tests/rqd_integration_tests.rs
  • .github/workflows/rust-pipeline.yml

Comment thread rust/crates/rqd/src/system/windows.rs
Comment thread rust/crates/rqd/src/system/windows.rs
DiegoTavares and others added 2 commits April 2, 2026 15:16
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Signed-off-by: Diego Tavares <dtavares@imageworks.com>
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
rust/crates/rqd/src/frame/frame_cmd.rs (1)

64-75: Windows new() is identical to Unix version — consider consolidating.

The Windows implementation of new() is byte-for-byte identical to the #[cfg(any(target_os = "linux", target_os = "macos"))] version (lines 51-62). This duplication increases maintenance burden.

♻️ Suggested consolidation

Remove the platform-specific #[cfg] attributes and keep a single implementation:

-    #[cfg(any(target_os = "linux", target_os = "macos"))]
     pub fn new(shell: &String, entrypoint_file_path: String) -> Self {
         let cmd = Command::new(shell);
         Self {
             cmd,
             shell: shell.clone(),
             exit_file_path: None,
             become_user: None,
             entrypoint_file_path,
             end_cmd: None,
         }
     }
-
-    #[cfg(target_os = "windows")]
-    pub fn new(shell: &String, entrypoint_file_path: String) -> Self {
-        let cmd = Command::new(shell);
-        Self {
-            cmd,
-            shell: shell.clone(),
-            exit_file_path: None,
-            become_user: None,
-            entrypoint_file_path,
-            end_cmd: None,
-        }
-    }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rust/crates/rqd/src/frame/frame_cmd.rs` around lines 64 - 75, The
Windows-specific new() implementation is identical to the Unix one (function new
in frame_cmd.rs), so remove the duplicated cfg-guarded block and keep a single
platform-agnostic implementation: delete the #[cfg(target_os = "windows")]
new(...) and the surrounding cfg attribute, or alternatively remove both
#[cfg(...)] attributes and retain one canonical pub fn new(shell: &String,
entrypoint_file_path: String) -> Self implementation; ensure the retained new()
initializes cmd, shell, exit_file_path, become_user, entrypoint_file_path, and
end_cmd exactly as in the duplicated versions.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@rust/crates/rqd/src/frame/frame_cmd.rs`:
- Around line 187-191: The batch string currently interpolates
self.exit_file_path and cmd_str directly into script (the local variable script)
which lets Windows batch metacharacters (especially %) break execution; sanitize
before formatting: create sanitized_exit = exit_file_path.replace("%","%%") and
sanitized_cmd = cmd_str.replace("%","%%") (to preserve literal % in batch), then
wrap the exit path in double quotes when used in the redirection/echo (use
"sanitized_exit") to protect spaces/special chars; finally use these sanitized
variables in the format! that builds script (update the code that assigns script
where self.exit_file_path and cmd_str are referenced).

---

Nitpick comments:
In `@rust/crates/rqd/src/frame/frame_cmd.rs`:
- Around line 64-75: The Windows-specific new() implementation is identical to
the Unix one (function new in frame_cmd.rs), so remove the duplicated
cfg-guarded block and keep a single platform-agnostic implementation: delete the
#[cfg(target_os = "windows")] new(...) and the surrounding cfg attribute, or
alternatively remove both #[cfg(...)] attributes and retain one canonical pub fn
new(shell: &String, entrypoint_file_path: String) -> Self implementation; ensure
the retained new() initializes cmd, shell, exit_file_path, become_user,
entrypoint_file_path, and end_cmd exactly as in the duplicated versions.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 29954fd6-19b6-412f-a7e2-e7b2ca89964b

📥 Commits

Reviewing files that changed from the base of the PR and between 515cffa and 9e3bf21.

📒 Files selected for processing (2)
  • rust/crates/rqd/src/frame/frame_cmd.rs
  • rust/crates/rqd/src/frame/running_frame.rs

Comment thread rust/crates/rqd/src/frame/frame_cmd.rs
DiegoTavares and others added 3 commits April 2, 2026 15:34
Apply fix from AcademySoftwareFoundation@bfc8a96 on the windows counterpart

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Signed-off-by: Diego Tavares <dtavares@imageworks.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Signed-off-by: Diego Tavares <dtavares@imageworks.com>
Signed-off-by: Diego Tavares <dtavares@imageworks.com>
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

♻️ Duplicate comments (2)
rust/crates/rqd/src/system/windows.rs (2)

634-640: ⚠️ Potential issue | 🟠 Major

Check the shutdown exit status before returning Ok(()).

.status() succeeding here only means the command launched. .map(|_| ()) turns access-denied or policy failures into Ok(()), so callers can think a reboot was accepted when Windows rejected it.

Rust std::process::Command::status ExitStatus::success behavior for non-zero exit codes
Suggested fix
 fn reboot(&self) -> Result<()> {
-    Command::new("shutdown")
+    let status = Command::new("shutdown")
         .args(["/r", "/t", "0"])
         .status()
-        .map(|_| ())
         .into_diagnostic()
-        .wrap_err("Failed to reboot")
+        .wrap_err("Failed to reboot")?;
+
+    if status.success() {
+        Ok(())
+    } else {
+        Err(miette!("shutdown exited unsuccessfully: {}", status))
+    }
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rust/crates/rqd/src/system/windows.rs` around lines 634 - 640, The reboot
function currently treats any successful spawn of Command::new("shutdown") as
success; instead call .status(), inspect the returned ExitStatus with
ExitStatus::success(), and only return Ok(()) when success is true; if not,
convert the non-success status into an Err (using into_diagnostic() / wrap_err
with the status or code included) so callers see that the shutdown command
failed; update the logic in reboot that invokes
Command::new("shutdown").args(["/r","/t","0"]).status() to check
ExitStatus::success() and propagate a descriptive error when it is false.

538-546: ⚠️ Potential issue | 🟠 Major

Fail fast when run_as_user cannot be honored.

Returning Ok(uid) after only logging a warning makes run_as_user look successful even though the frame will still launch as the current Windows account.

Suggested fix
 fn create_user_if_unexisting(&self, username: &str, uid: u32, _gid: u32) -> Result<u32> {
     let current_user = std::env::var("USERNAME").unwrap_or_default();
-    if !current_user.is_empty() && current_user != username {
+    if !current_user.is_empty() && !current_user.eq_ignore_ascii_case(username) {
         warn!(
-            "User creation is not supported on Windows. Requested user '{}' will be ignored.",
-            username
+            "run_as_user is not supported on Windows: requested '{}', current '{}'",
+            username,
+            current_user
         );
+        return Err(miette!("run_as_user is not supported on Windows"));
     }
     Ok(uid)
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rust/crates/rqd/src/system/windows.rs` around lines 538 - 546, The function
create_user_if_unexisting currently logs a warning when the requested username
differs from the current Windows account but then returns Ok(uid), which falsely
signals success; change the control flow so that when current_user is non-empty
and != username the function returns an Err with a clear, descriptive error
(e.g., "cannot honor run_as_user on Windows: requested user 'X' differs from
current user 'Y'") instead of Ok(uid). Use the same error construction/helper
pattern used elsewhere in this module (the crate's Result/Error type) so callers
of create_user_if_unexisting and the run_as_user flow observe failure rather
than a silent warning.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@rust/crates/rqd/src/system/windows.rs`:
- Around line 145-158: The current logic assumes logical_cpus % physical_cores
== 0 and computes hyperthreading_multiplier via integer division, which drops
extra logical CPUs on non-uniform SMT machines; update the calculation and/or
build_processor_structure to ensure every logical CPU is represented: compute
hyperthreading_multiplier as (logical_cpus + physical_cores - 1) /
physical_cores (ceiling) or change build_processor_structure(logical_cpus,
physical_cores, hyperthreading_multiplier) to iterate over 0..logical_cpus and
assign each logical CPU to a core (cores_by_phys_id / threads_by_core_unique_id)
so thread_id_lookup_table includes any remainder logical CPUs, and ensure
CoreStateManager sees the full logical_cpus capacity rather than truncating to
physical_cores * floor multiplier.
- Around line 101-137: The init function should normalize the configured
temp_path for Windows so read_temp_storage()/collect_stats() can match mount
points when MachineConfig::default leaves temp_path as "/tmp"; update
windows::init to detect Windows (identified_os or cfg!(windows)) and compute a
normalized_temp_path (convert "/tmp" to a Windows temp like from
std::env::temp_dir() or replace forward slashes/ensure a drive letter) then
store that normalized value into the runtime config used by read_temp_storage()
(e.g., set it on the cloned config passed into Self or add a
normalized_temp_path field on the returned struct); adjust references in
read_temp_storage() to use the normalized value so collect_stats() succeeds on
default Windows setups.
- Around line 348-366: collect_lineage currently always seeds root_pid causing
bogus single-item lineages for inactive roots; before pushing root_pid into the
initial stack, verify the root is active (e.g., check
self.session_processes.contains_key(&root_pid) or call the existing proc-active
check used elsewhere) and return an empty Vec if the root is not present/active;
apply the same root-presence check in get_proc_lineage and in
kill_session_internal so they treat inactive roots as None/failure instead of
fabricating a lineage—update the functions collect_lineage, get_proc_lineage,
and kill_session_internal to perform this pre-check and only build the
stack/lineage when the root is confirmed active.

---

Duplicate comments:
In `@rust/crates/rqd/src/system/windows.rs`:
- Around line 634-640: The reboot function currently treats any successful spawn
of Command::new("shutdown") as success; instead call .status(), inspect the
returned ExitStatus with ExitStatus::success(), and only return Ok(()) when
success is true; if not, convert the non-success status into an Err (using
into_diagnostic() / wrap_err with the status or code included) so callers see
that the shutdown command failed; update the logic in reboot that invokes
Command::new("shutdown").args(["/r","/t","0"]).status() to check
ExitStatus::success() and propagate a descriptive error when it is false.
- Around line 538-546: The function create_user_if_unexisting currently logs a
warning when the requested username differs from the current Windows account but
then returns Ok(uid), which falsely signals success; change the control flow so
that when current_user is non-empty and != username the function returns an Err
with a clear, descriptive error (e.g., "cannot honor run_as_user on Windows:
requested user 'X' differs from current user 'Y'") instead of Ok(uid). Use the
same error construction/helper pattern used elsewhere in this module (the
crate's Result/Error type) so callers of create_user_if_unexisting and the
run_as_user flow observe failure rather than a silent warning.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: e2205572-0b6f-4fbb-8584-d7d8b216351d

📥 Commits

Reviewing files that changed from the base of the PR and between 3a85b56 and 8dcf79f.

📒 Files selected for processing (1)
  • rust/crates/rqd/src/system/windows.rs

Comment on lines +101 to +137
pub fn init(config: &MachineConfig, cpu_initial_info: ProcessorInfoData) -> Result<Self> {
let identified_os = config
.override_real_values
.clone()
.and_then(|c| c.os)
.unwrap_or_else(|| "windows".to_string());

// Initialize sysinfo collector
let sysinfo = sysinfo::System::new_with_specifics(
RefreshKind::nothing().with_memory(MemoryRefreshKind::everything()),
);
let total_memory = sysinfo.total_memory();
let total_swap = sysinfo.total_swap();

Ok(Self {
config: config.clone(),
static_info: MachineStaticInfo {
hostname: Self::get_hostname(config.use_ip_as_hostname)?,
total_memory,
total_swap,
num_sockets: cpu_initial_info.num_sockets,
cores_per_socket: cpu_initial_info.cores_per_socket,
hyperthreading_multiplier: cpu_initial_info.hyperthreading_multiplier,
boot_time_secs: sysinfo::System::boot_time(),
tags: Self::setup_tags(config),
},
hardware_state: HardwareState::Up,
attributes: HashMap::from([
("SP_OS".to_string(), identified_os),
(
"hyperthreadingMultiplier".to_string(),
cpu_initial_info.hyperthreading_multiplier.to_string(),
),
]),
sysinfo_system: Mutex::new(sysinfo::System::new()),
session_processes: DashMap::new(),
})
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Normalize temp_path for Windows during init.

read_temp_storage() uses self.config.temp_path verbatim, but MachineConfig::default still sets that field to /tmp in rust/crates/rqd/src/config/mod.rs. On a default Windows setup this never matches any mount point, so collect_stats() fails until the path is overridden manually.

Suggested fix
 pub fn init(config: &MachineConfig, cpu_initial_info: ProcessorInfoData) -> Result<Self> {
+    let mut config = config.clone();
+    if config.temp_path.is_empty() || config.temp_path == "/tmp" {
+        config.temp_path = std::env::temp_dir().display().to_string();
+    }
+    let tags = Self::setup_tags(&config);
+
     let identified_os = config
         .override_real_values
         .clone()
         .and_then(|c| c.os)
         .unwrap_or_else(|| "windows".to_string());
@@
-                tags: Self::setup_tags(config),
+                tags,

Also applies to: 286-303

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rust/crates/rqd/src/system/windows.rs` around lines 101 - 137, The init
function should normalize the configured temp_path for Windows so
read_temp_storage()/collect_stats() can match mount points when
MachineConfig::default leaves temp_path as "/tmp"; update windows::init to
detect Windows (identified_os or cfg!(windows)) and compute a
normalized_temp_path (convert "/tmp" to a Windows temp like from
std::env::temp_dir() or replace forward slashes/ensure a drive letter) then
store that normalized value into the runtime config used by read_temp_storage()
(e.g., set it on the cloned config passed into Self or add a
normalized_temp_path field on the returned struct); adjust references in
read_temp_storage() to use the normalized value so collect_stats() succeeds on
default Windows setups.

Comment on lines +145 to +158
let logical_cpus = sysinfo.cpus().len().max(1) as u32;
let physical_cores = sysinfo
.physical_core_count()
.unwrap_or(logical_cpus as usize)
.max(1) as u32;

let hyperthreading_multiplier = (logical_cpus / physical_cores).max(1);

let (threads_by_core_unique_id, cores_by_phys_id, thread_id_lookup_table) =
Self::build_processor_structure(
logical_cpus,
physical_cores,
hyperthreading_multiplier,
);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Do not drop logical CPUs on non-uniform SMT hosts.

This assumes logical_cpus % physical_cores == 0. On a 20-logical / 14-physical machine the floored multiplier becomes 1, so build_processor_structure() only emits thread IDs 0..13; the remaining logical CPUs never enter thread_id_lookup_table, and CoreStateManager sees less capacity than the host actually has.

Minimum fallback until real Win32 topology is wired in
 fn build_processor_structure(
     logical_cpus: u32,
     physical_cores: u32,
     hyperthreading_multiplier: u32,
 ) -> (
@@
+    let remainder = logical_cpus % physical_cores;
+    let mut next_thread_id = 0;
     let mut core_ids = Vec::with_capacity(physical_cores as usize);
     for core_id in 0..physical_cores {
         core_ids.push(core_id);
 
-        let start_thread = core_id * hyperthreading_multiplier;
         let mut threads = Vec::new();
-        for offset in 0..hyperthreading_multiplier {
-            let thread_id = start_thread + offset;
+        let threads_for_core = hyperthreading_multiplier + u32::from(core_id < remainder);
+        for _ in 0..threads_for_core {
+            let thread_id = next_thread_id;
+            next_thread_id += 1;
             if thread_id < logical_cpus {
                 threads.push(thread_id);
                 thread_id_lookup_table.insert(thread_id, (0, core_id));
             }
         }

Also applies to: 182-216

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rust/crates/rqd/src/system/windows.rs` around lines 145 - 158, The current
logic assumes logical_cpus % physical_cores == 0 and computes
hyperthreading_multiplier via integer division, which drops extra logical CPUs
on non-uniform SMT machines; update the calculation and/or
build_processor_structure to ensure every logical CPU is represented: compute
hyperthreading_multiplier as (logical_cpus + physical_cores - 1) /
physical_cores (ceiling) or change build_processor_structure(logical_cpus,
physical_cores, hyperthreading_multiplier) to iterate over 0..logical_cpus and
assign each logical CPU to a core (cores_by_phys_id / threads_by_core_unique_id)
so thread_id_lookup_table includes any remainder logical CPUs, and ensure
CoreStateManager sees the full logical_cpus capacity rather than truncating to
physical_cores * floor multiplier.

Comment on lines +348 to +366
fn collect_lineage(&self, root_pid: u32) -> Vec<u32> {
let mut stack = vec![root_pid];
let mut visited = std::collections::HashSet::new();
let mut lineage = Vec::new();

while let Some(pid) = stack.pop() {
if !visited.insert(pid) {
continue;
}
lineage.push(pid);
if let Some(children) = self.session_processes.get(&pid) {
for child in children.iter().rev() {
stack.push(*child);
}
}
}

lineage
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Do not fabricate lineage for inactive roots.

collect_lineage() always seeds root_pid, so lineage.is_empty() can never trigger. That lets kill_session_internal() report success for an already-dead session, and get_proc_lineage() return Some([pid]) even though SystemManager::get_proc_lineage is supposed to return None when the root PID is inactive. Please refresh and verify the root process before building the lineage in both places.

Also applies to: 460-493, 625-631

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rust/crates/rqd/src/system/windows.rs` around lines 348 - 366,
collect_lineage currently always seeds root_pid causing bogus single-item
lineages for inactive roots; before pushing root_pid into the initial stack,
verify the root is active (e.g., check
self.session_processes.contains_key(&root_pid) or call the existing proc-active
check used elsewhere) and return an empty Vec if the root is not present/active;
apply the same root-presence check in get_proc_lineage and in
kill_session_internal so they treat inactive roots as None/failure instead of
fabricating a lineage—update the functions collect_lineage, get_proc_lineage,
and kill_session_internal to perform this pre-check and only build the
stack/lineage when the root is confirmed active.

@DiegoTavares DiegoTavares merged commit 5e5914b into AcademySoftwareFoundation:master Apr 2, 2026
27 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants