
Rust: Tee Grate for FS archiving#52

Open
rennergade wants to merge 4 commits into main from fs-tee-grate


@rennergade
Contributor

  • Implements the tee grate, which duplicates every intercepted syscall across two independent handler chains (primary and secondary)
  • Primary's return value is authoritative; secondary is best-effort (errors logged to stderr, never propagated)
  • Uses the same register_handler interposition mechanism as the namespace grate to capture handler registrations from both stacks
  • Uses fdtables for fd lifecycle tracking across fork/exec/exit

How it works

Interposition: The tee grate registers handlers for register_handler (1001), exec (59), fork (57), and exit (60) on every managed cage. When a grate in either the primary or secondary stack calls register_handler, the tee intercepts it, allocates an alt syscall number, and installs its own dispatch handler on the target cage.
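The interception step above can be sketched as a small route table. This is only an illustration: `RouteTable`, `Route`, `intercept_register`, and the `ALT_BASE` starting value are hypothetical names and values, not taken from `tee.rs`. Only the four lifecycle syscall numbers come from the description.

```rust
use std::collections::HashMap;

/// Syscalls the tee registers on every managed cage, as listed in the PR:
/// register_handler (1001), exec (59), fork (57), exit (60).
pub const LIFECYCLE_SYSCALLS: [u64; 4] = [1001, 59, 57, 60];

/// Assumption: first alt syscall number handed out. The PR does not state it.
const ALT_BASE: u64 = 2000;

/// One intercepted registration: who asked, and which alt number it received.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Route {
    pub grate_id: u64,
    pub alt_syscall: u64,
}

pub struct RouteTable {
    next_alt: u64,
    /// (target cage, original syscall number) -> routes from either stack.
    routes: HashMap<(u64, u64), Vec<Route>>,
}

impl RouteTable {
    pub fn new() -> Self {
        Self { next_alt: ALT_BASE, routes: HashMap::new() }
    }

    /// Called when a grate's register_handler call is intercepted.
    /// Allocates a fresh alt number and remembers which grate owns it.
    pub fn intercept_register(&mut self, cage: u64, syscall: u64, grate_id: u64) -> u64 {
        let alt = self.next_alt;
        self.next_alt += 1;
        self.routes
            .entry((cage, syscall))
            .or_default()
            .push(Route { grate_id, alt_syscall: alt });
        alt
    }

    /// Routes the tee's own dispatch handler would fan out to.
    pub fn routes_for(&self, cage: u64, syscall: u64) -> &[Route] {
        self.routes
            .get(&(cage, syscall))
            .map(Vec::as_slice)
            .unwrap_or(&[])
    }
}
```

With this shape, two grates registering `open` (syscall 2) on the same cage end up as two routes under one key, which is what the dispatch handler needs to look up.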

Dispatch: When the app makes a syscall, the tee handler fires. It calls the primary handler first (via its alt syscall number), then calls the secondary handler best-effort. The primary's return value is always what gets returned to the caller.
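A minimal sketch of that dispatch rule, assuming handlers can be modelled as closures over six `u64` arguments. The real `tee_dispatch()` in `tee.rs` goes through alt syscall numbers and may have a different signature; this only illustrates the "primary is authoritative, secondary is best-effort" ordering.

```rust
/// Calls the primary first, then the secondary. The secondary's result is
/// discarded; its error is logged to stderr and never propagated.
pub fn tee_dispatch<P, S>(primary: P, secondary: S, args: [u64; 6]) -> i64
where
    P: FnOnce([u64; 6]) -> i64,
    S: FnOnce([u64; 6]) -> Result<i64, String>,
{
    // The primary's return value is what the caller sees, success or failure.
    let ret = primary(args);

    // Best-effort: a failing secondary must not change `ret`.
    if let Err(e) = secondary(args) {
        eprintln!("[tee-grate] secondary handler failed: {e}");
    }

    ret
}
```

For example, `tee_dispatch(|_| 7, |_| Err("disk full".to_string()), [0; 6])` still returns `7`.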

Primary/secondary assignment: Auto-detected by order — the first grate_id seen in a register_handler call becomes primary, the second becomes secondary.
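Order-based assignment could look roughly like the following. `Roles` and `classify` are hypothetical names, and returning `None` for a third distinct grate is an assumption; the description does not say what happens in that case.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Role {
    Primary,
    Secondary,
}

/// Remembers which grate_id was seen first and which second.
#[derive(Debug, Default)]
pub struct Roles {
    primary: Option<u64>,
    secondary: Option<u64>,
}

impl Roles {
    /// Returns the role of `grate_id`, assigning one on first sight.
    /// Assumption: a third distinct grate gets no role (None).
    pub fn classify(&mut self, grate_id: u64) -> Option<Role> {
        if self.primary == Some(grate_id) {
            return Some(Role::Primary);
        }
        if self.secondary == Some(grate_id) {
            return Some(Role::Secondary);
        }
        if self.primary.is_none() {
            self.primary = Some(grate_id);
            return Some(Role::Primary);
        }
        if self.secondary.is_none() {
            self.secondary = Some(grate_id);
            return Some(Role::Secondary);
        }
        None
    }
}
```

One consequence of detecting roles this way is that they depend on which stack happens to call register_handler first, not on the position of the grate on the command line.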

Primary-only syscalls: fork, clone, exec, exit are never duplicated — they have process-level side effects that would break if executed twice.
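As a sketch, the primary-only check reduces to a match on the syscall number. The fork, exec, and exit numbers are the ones quoted above; the clone number is an assumption (the x86-64 Linux value), since the description does not give it. `Fanout` and `fanout` are hypothetical names.

```rust
/// Numbers quoted in the PR description.
pub const SYS_FORK: u64 = 57;
pub const SYS_EXEC: u64 = 59;
pub const SYS_EXIT: u64 = 60;
/// Assumption: clone uses the x86-64 Linux number; the PR does not state it.
pub const SYS_CLONE: u64 = 56;

#[derive(Debug, PartialEq, Eq)]
pub enum Fanout {
    /// Run only on the primary chain.
    PrimaryOnly,
    /// Run on the primary, then best-effort on the secondary.
    Both,
}

/// Decides whether a syscall may be duplicated across both chains.
pub fn fanout(syscall: u64) -> Fanout {
    match syscall {
        // Process-level side effects: executing these twice would break the cage.
        SYS_FORK | SYS_CLONE | SYS_EXEC | SYS_EXIT => Fanout::PrimaryOnly,
        _ => Fanout::Both,
    }
}
```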

fdtables integration: open records new fds, close removes them, dup/dup2/dup3 copies entries, fork clones the fd table, exec closes cloexec fds, exit removes the cage's fd table.
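The lifecycle rules above can be modelled with plain standard-library maps. This is not the fdtables API; `FdModel` and its methods are hypothetical stand-ins that only mirror the bookkeeping described in the sentence above.

```rust
use std::collections::{BTreeMap, HashMap};

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct FdEntry {
    pub cloexec: bool,
}

/// Per-cage fd tables: cage id -> (fd -> entry).
#[derive(Debug, Default)]
pub struct FdModel {
    cages: HashMap<u64, BTreeMap<i32, FdEntry>>,
}

impl FdModel {
    /// open: record a new fd.
    pub fn open(&mut self, cage: u64, fd: i32, cloexec: bool) {
        self.cages.entry(cage).or_default().insert(fd, FdEntry { cloexec });
    }

    /// close: remove the fd; returns false if it was not tracked.
    pub fn close(&mut self, cage: u64, fd: i32) -> bool {
        self.cages.get_mut(&cage).map_or(false, |t| t.remove(&fd).is_some())
    }

    /// dup/dup2/dup3: copy the entry to `new`. The close-on-exec flag is not
    /// inherited; dup and dup2 pass false, dup3 may pass true (O_CLOEXEC).
    pub fn dup(&mut self, cage: u64, old: i32, new: i32, cloexec: bool) -> bool {
        match self.cages.get_mut(&cage) {
            Some(t) if t.contains_key(&old) => {
                t.insert(new, FdEntry { cloexec });
                true
            }
            _ => false,
        }
    }

    /// fork: the child starts with a copy of the parent's table.
    pub fn fork(&mut self, parent: u64, child: u64) {
        let copy = self.cages.get(&parent).cloned().unwrap_or_default();
        self.cages.insert(child, copy);
    }

    /// exec: drop every fd marked close-on-exec.
    pub fn exec(&mut self, cage: u64) {
        if let Some(t) = self.cages.get_mut(&cage) {
            t.retain(|_, e| !e.cloexec);
        }
    }

    /// exit: remove the cage's whole table.
    pub fn exit(&mut self, cage: u64) {
        self.cages.remove(&cage);
    }

    /// Tracked fds for a cage, in ascending order.
    pub fn fds(&self, cage: u64) -> Vec<i32> {
        self.cages
            .get(&cage)
            .map(|t| t.keys().copied().collect())
            .unwrap_or_default()
    }
}
```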

CLI

tee-grate --primary imfs-grate --secondary archive-grate -- python build_app.py
Or inline:
tee-grate %{ archive-grate %} python

Files

examples/tee-grate/
  Cargo.toml - git deps for grate-rs, fdtables, libc
  src/
    main.rs - CLI parsing, lifecycle handlers, dispatch handlers, fork/exec/wait
    tee.rs - TeeState, route table, tee_dispatch(), do_syscall(), unit tests

@stupendoussuperpowers
Contributor

We need to have a thorough discussion on how tee-ing is supposed to work, both architecturally and in code. I am going to explain the issues I see in the current approach, and also an alternate approach I tried. The structure of this grate is messy, which makes it hard to describe these issues with complete clarity, so follow-up questions are appreciated.


The current approach

If we want to tee, we invoke:

lind-boot tee-grate %{ open-alt-grate %} open-simple-grate test

The issue with this is that the tee-grate is now interposing on open-simple-grate and not test.

test calls open() -> goes to open-simple-grate -> open-simple-grate never calls open() -> flow terminates here, skipping the alternate path and the tee-grate entirely.

Possible Fix

As a fix for this, I envisioned something like:

lind-boot tee-grate %{ open-simple-grate %} %{ open-alt-grate %} test

Here, I had to add support for both "%{" (shift + set interposing = true) and "%}" (shift + set interposing = false).
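Roughly, the marker handling described here is a one-flag state machine over the exec chain. The sketch below is a simplification with hypothetical names (`Step`, `walk_chain`), and it treats every non-marker token as a program to exec, which ignores per-program arguments.

```rust
#[derive(Debug, PartialEq, Eq)]
pub struct Step<'a> {
    pub program: &'a str,
    /// Whether the tee is intercepting register_handler while this runs.
    pub interposing: bool,
}

/// Walks the exec chain: "%{" turns interception on, "%}" turns it off,
/// anything else is recorded with the flag's current value.
pub fn walk_chain<'a>(tokens: &[&'a str]) -> Vec<Step<'a>> {
    let mut interposing = false;
    let mut steps = Vec::new();
    for &tok in tokens {
        match tok {
            "%{" => interposing = true,  // shift + start interception
            "%}" => interposing = false, // shift + stop interception
            program => steps.push(Step { program, interposing }),
        }
    }
    steps
}
```

For the chain `%{ open-simple-grate %} %{ open-alt-grate %} test`, both grates come out with `interposing: true` and `test` with `interposing: false`.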

When the grates in this path don't themselves invoke open() this works fine. But consider:

lind-boot tee-grate %{ open-simple-grate %} %{ strace-grate %} test

The handler chain constructed for this is hopelessly convoluted. Walking through what happens:

lind@e6bbbe8b13e7:~/lind-wasm-example-grates/examples/tee-grate$ sudo lind-boot tee-grate.cwasm %{ open-simple-grate.cwasm %} %{ open-alt-grate.cwasm %} test.cwasm

[tee-grate] exec_chain=["%{", "open-simple-grate.cwasm", "%}", "%{", "open-alt-grate.cwasm", "%}", "test.cwasm"], buffer_limit=65536
[tee-grate] forked child cage 2 (tee_cage=1)
[tee-grate] %{ detected. starting interception
shift-args "open-simple-grate.cwasm"
[tee-grate] execing real program: "open-simple-grate.cwasm"
[grate|open] Registering open handler for cage 3 in grate 2 with fn ptr addr: 3
[tee-grate] intercept register_handler: cage=3, syscall=2, grate=2
[tee-grate] %} detected. stopping interception
shift-args "%{"
[tee-grate] %{ detected. starting interception
shift-args "open-alt-grate.cwasm"
[tee-grate] execing real program: "open-alt-grate.cwasm"
[grate|open-2] Registering open handler for cage 4 in grate 3 with fn ptr addr: 3
[tee-grate] intercept register_handler: cage=4, syscall=2, grate=3
[tee-grate] %} detected. stopping interception
shift-args "test.cwasm"
[tee-grate] execing real program: "test.cwasm"

open-simple-grate (cageid: 2) calls register_handler on cageid: 3, and tee-grate intercepts this call. The next two tokens in the exec chain, %} and %{, get skipped. strace-grate now runs in cageid: 3, so any time strace-grate calls open(), the call goes to the tee-grate.

test calls open() -> hits tee-grate handler -> tee-grate calls open-simple-grate and strace-grate -> open-simple-grate resolves -> strace-grate calls open() -> call hits the tee-grate -> tee-grate invokes open-simple-grate and strace-grate again -> ... -> stack overflow
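The loop above can be reproduced with a toy model. Nothing here is real grate code: `tee_open` is a hypothetical function, and the `limit` parameter exists only so the model stops instead of overflowing the stack.

```rust
/// Models one open() arriving at the tee handler.
/// Returns Some(depth) if the model hit `limit`, or None if dispatch
/// terminated on its own.
pub fn tee_open(depth: usize, limit: usize, secondary_reenters: bool) -> Option<usize> {
    if depth >= limit {
        return Some(depth);
    }

    // Primary (open-simple-grate): resolves the call itself and never
    // calls open(), so it adds no further dispatch.

    // Secondary (a strace-like grate): forwards open(). Because the tee's
    // handler is installed on the cage it runs in, the forwarded call
    // lands on the tee handler again.
    if secondary_reenters {
        return tee_open(depth + 1, limit, secondary_reenters);
    }

    None
}
```

With `secondary_reenters = true` the model always runs into the limit, whatever its value. With `false` (the open-alt-grate case) it terminates immediately.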


The issue here is that we force unrelated grates into parent/child relationships with each other. Intercepting register_handler helps, to a certain extent, to keep these parent/child pairings from meaning anything, but it is still hard to avoid issues like this.

For instance, when open-simple-grate calls register_handler, semantically it wants to interpose on the actual child chain, but there is no way for it to indicate that, and no way for the tee-grate to know what that child ID is supposed to be.

We need to come up with a better way to implement this state tracking and avoid random grate chains being parents to other random grate chains.

One initial thought I had was:

tee-grate launches three parallel processes as its direct children, in this example open-simple-grate, strace-grate, and test. We let open-simple-grate call register_handler on its child process (execing %}), and the same for strace-grate. We don't invoke register_handler yet; we just store the request in some sort of table with a placeholder value for the target cage.

Once everything is done and test is running, tee-grate replaces the placeholder with the actual child PID and calls register_handler for everything.
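A rough sketch of that placeholder table, with hypothetical names throughout (`DeferredTable`, `Pending`, `Target`). The `register` callback stands in for the real register_handler call, whose signature is not shown in this thread.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Target {
    /// Target cage not known yet.
    Placeholder,
    Cage(u64),
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Pending {
    pub grate_id: u64,
    pub syscall: u64,
    pub target: Target,
}

#[derive(Debug, Default)]
pub struct DeferredTable {
    pending: Vec<Pending>,
}

impl DeferredTable {
    /// Store an intercepted register_handler request without acting on it.
    pub fn record(&mut self, grate_id: u64, syscall: u64) {
        self.pending.push(Pending { grate_id, syscall, target: Target::Placeholder });
    }

    /// Once the real child is running, fill in every placeholder and issue
    /// the deferred registrations. Returns how many were issued.
    pub fn flush<F>(&mut self, child: u64, mut register: F) -> usize
    where
        F: FnMut(u64, u64, u64), // (target cage, syscall, grate id)
    {
        let mut issued = 0;
        for p in self.pending.iter_mut() {
            if p.target == Target::Placeholder {
                p.target = Target::Cage(child);
                register(child, p.syscall, p.grate_id);
                issued += 1;
            }
        }
        issued
    }
}
```

Entries keep their resolved target after `flush`, so a second flush with a different child issues nothing.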

@stupendoussuperpowers
Contributor

CC: @rennergade @Yaxuan-w

@JustinCappos
Member

> As a fix for this, I envisioned something like:
>
> lind-boot tee-grate %{ open-simple-grate %} %{ open-alt-grate %} test

This is fine / what I envisioned would happen.

I think the fact that you seem to have open-simple-grate able to interpose on open-alt-grate is a problem here. What should happen is that open-simple-grate should see "%} test" as its command line. Once it execs "%}" with "test" as an argument, the tee-grate then goes on to fork / exec "open-alt-grate" with "%} test" as arguments and the tee grate's handlers as this process's handlers. (You could conceivably fork both processes at once to make it easier to manage the handler tables.) Once open-alt-grate forks and execs "%}" with the argument "test", you're ready to have the tee-grate go to the next step.

Now the tee-grate has two children, which will have registered handlers for at least fork, if not more. You need to have tracked those in your handler table. You exec "test" with a handler table that points to your own implementation of any call that either stack has interposed on. You call those handlers, allowing the call stacks for most to 'go through' except things that are destructive and non-idempotent (fork, exec, exit, etc.).
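The argv handling in this proposal might be sketched as follows: each `%{ ... %}` group becomes one stack whose command line ends in `%}` followed by the application. `Plan` and `plan_chain` are hypothetical names, and nested groups are not handled.

```rust
#[derive(Debug, PartialEq, Eq)]
pub struct Plan {
    /// One argv per grate stack, e.g. ["open-simple-grate", "%}", "test"].
    pub stacks: Vec<Vec<String>>,
    /// The application the tee execs last, with the merged handler table.
    pub app: Vec<String>,
}

/// Splits "%{ A %} %{ B %} app ..." into per-stack command lines.
pub fn plan_chain(tokens: &[&str]) -> Result<Plan, String> {
    let mut groups: Vec<Vec<String>> = Vec::new();
    let mut i = 0;

    // Consume leading "%{ ... %}" groups.
    while i < tokens.len() && tokens[i] == "%{" {
        let len = tokens[i + 1..]
            .iter()
            .position(|t| *t == "%}")
            .ok_or("unterminated %{ group")?;
        groups.push(tokens[i + 1..i + 1 + len].iter().map(|s| s.to_string()).collect());
        i += len + 2; // skip "%{", the group body, and "%}"
    }

    // Everything left is the application and its arguments.
    let app: Vec<String> = tokens[i..].iter().map(|s| s.to_string()).collect();
    if app.is_empty() {
        return Err("missing application".to_string());
    }

    // Each stack sees "%} <app...>" after its own tokens, so the grate at
    // the bottom of the stack execs "%}" and hands control back to the tee.
    let stacks = groups
        .into_iter()
        .map(|mut g| {
            g.push("%}".to_string());
            g.extend(app.iter().cloned());
            g
        })
        .collect();

    Ok(Plan { stacks, app })
}
```

Under this scheme neither stack ever sees the other's tokens, so open-simple-grate cannot end up interposing on open-alt-grate.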

I may be misunderstanding some complexities here if this doesn't seem like it will work.

Happy to discuss in person or on slack.

@stupendoussuperpowers
Contributor

I think this approach makes sense. It is close to where I thought this would go, which is to do some argv[] manipulation instead of simply shifting and passing all the remaining arguments down the line.

I'll try to implement this and share more on the feasibility.


3 participants