
Commit 869cdfb

Phase 3: Add chunked directory loading
Performance for 50k files: ~350ms to first files, ~2.5s total!

Backend:
- Session cache with 60s expiry for directory listings
- list_directory_start/next/end API for streaming chunks
- First 5000 files returned instantly, rest from cache

Frontend:
- FilePane uses session API for progressive loading
- Non-reactive array optimization (avoids 50k-item overhead)
- requestAnimationFrame between chunks keeps UI responsive

Also added docs about file loading.
1 parent 5a863e6 commit 869cdfb

20 files changed

Lines changed: 900 additions & 205 deletions

CONTRIBUTING.md

Lines changed: 2 additions & 1 deletion
@@ -79,6 +79,7 @@ This snippet will likely come handy:
7979
}
8080
```
8181

82-
Since the agent shares the context with your IDE/client, enabling the MCP server makes the tools available to the agent automatically.
82+
Since the agent shares the context with your IDE/client, enabling the MCP server makes the tools available to the agent
83+
automatically.
8384

8485
Happy coding!

docs/adr/007-json-for-ipc.md

Lines changed: 47 additions & 19 deletions
@@ -1,8 +1,8 @@
1-
# ADR 007: Use JSON for Tauri IPC, optimize with chunking
1+
# ADR 007: Use JSON for Tauri IPC
22

33
## Status
44

5-
Accepted
5+
Accepted and validated by benchmarks
66

77
## Context
88

@@ -11,42 +11,70 @@ Tauri 2.0 supports both JSON (default) and binary formats via Raw Payloads (Mess
1111

1212
### Options considered
1313

14-
1. **JSON (default)** - Simple, debuggable, no extra dependencies
15-
2. **MessagePack** - ~37% smaller, ~4x faster serialization
16-
3. **Protobuf** - Schema-based, very compact, complex setup
14+
1. **JSON (default)**: Simple, debuggable, no extra dependencies
15+
2. **MessagePack**: Smaller payload, but complex to integrate with Tauri
16+
3. **Protobuf**: Schema-based, very compact, complex setup
1717

18-
### Benchmarks (from research)
18+
### Benchmarks (actual measurements, Dec 2024)
1919

20-
For 50k file entries (~200 bytes/entry):
20+
We tested JSON vs. MessagePack with real directory listings:
2121

22-
- JSON: ~10MB payload, ~50ms serialization
23-
- MessagePack: ~6.3MB payload, ~12ms serialization
22+
| Files | JSON Time | JSON Size | MsgPack Time | MsgPack Size |
23+
| ----- | ---------- | --------- | ------------ | ------------ |
24+
| 5k | **454ms** | 1.69 MB | 718ms | 1.41 MB |
25+
| 50k | **4782ms** | 16.99 MB | 6432ms | 13.78 MB |
26+
27+
**Key finding: MessagePack is 34-58% SLOWER despite being 17-19% smaller.**
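The percentages in the key finding can be recomputed from the table. The following is a small TypeScript check, not part of the codebase; the `rows` layout and the `percentOver` helper are illustrative names.

```typescript
// Figures copied from the benchmark table above.
const rows = [
  { files: "5k", jsonMs: 454, jsonMb: 1.69, msgpackMs: 718, msgpackMb: 1.41 },
  { files: "50k", jsonMs: 4782, jsonMb: 16.99, msgpackMs: 6432, msgpackMb: 13.78 },
];

// Percentage by which `value` exceeds `baseline` (negative if it is lower).
function percentOver(value: number, baseline: number): number {
  return (value / baseline - 1) * 100;
}

const summary = rows.map((r) => ({
  files: r.files,
  msgpackSlowerPct: percentOver(r.msgpackMs, r.jsonMs),
  msgpackSmallerPct: -percentOver(r.msgpackMb, r.jsonMb),
}));

for (const s of summary) {
  console.log(
    `${s.files}: MessagePack ${s.msgpackSlowerPct.toFixed(1)}% slower, ` +
      `${s.msgpackSmallerPct.toFixed(1)}% smaller`
  );
}
```

The slowdown comes out at roughly 58% for 5k files and roughly 34-35% for 50k files, and the size saving at roughly 17% and 19%, which matches the ranges quoted above.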
28+
29+
### Why binary formats are slower in Tauri
30+
31+
When returning `Vec<u8>` from a Tauri command, Tauri serializes it as a **JSON array of numbers**:
32+
33+
```
34+
[82, 117, 115, 116, 121, ...] // Each byte becomes 1-3 chars + comma
35+
```
36+
37+
This means:
38+
39+
1. Binary data is wrapped in JSON anyway (negating size benefits)
40+
2. JSON parsing is still required
41+
3. Then binary decoding adds more overhead
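To get a feel for the first point, here is a minimal TypeScript sketch that measures how much a byte buffer grows when it is written as a JSON array of numbers. It uses only `JSON.stringify`; the helper name `jsonArraySize` is made up for this example, and the output is compact JSON (no spaces after commas), so real payloads may differ slightly.

```typescript
// Size in characters of a byte buffer once encoded as a JSON array of numbers.
function jsonArraySize(bytes: Uint8Array): number {
  return JSON.stringify(Array.from(bytes)).length;
}

// The five bytes from the example above spell "Rusty" in ASCII.
const sample = new Uint8Array([82, 117, 115, 116, 121]);
console.log(sample.length, jsonArraySize(sample)); // 5 raw bytes become 20 characters

// Rough estimate for evenly spread byte values: every value 0..255 once.
const allValues = new Uint8Array(256);
for (let i = 0; i < allValues.length; i++) allValues[i] = i;
const ratio = jsonArraySize(allValues) / allValues.length;
console.log(ratio.toFixed(2)); // characters per byte
```

For evenly spread byte values this works out to roughly 3.6 characters per byte. Real MessagePack output is not evenly distributed, so treat the ratio as an order-of-magnitude illustration rather than a measurement.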
2442

2543
## Decision
2644

27-
Use **JSON** for IPC, combined with **chunking** (1000 entries per response).
45+
Use **JSON** for all Tauri IPC.
2846

2947
Rationale:
3048

31-
1. The chunking strategy reduces per-request payload to ~200KB regardless of format
49+
1. JSON is measurably faster than MessagePack through Tauri's invoke system, and the same `Vec<u8>` wrapping overhead would apply to Protobuf
3250
2. JSON is simpler to debug (readable in browser devtools)
33-
3. No additional dependencies (`rmp-serde`, `@msgpack/msgpack`)
34-
4. Can always switch to MessagePack later if profiling shows bottleneck
51+
3. No additional dependencies needed
52+
4. Native `JSON.parse()` in JavaScript is heavily optimized
3553

3654
## Consequences
3755

3856
### Positive
3957

40-
- Simpler implementation, no binary serialization libraries
58+
- Best actual performance (benchmarked, not theoretical)
59+
- Simpler implementation
4160
- Easier debugging with browser devtools
4261
- Tauri's default path, best documented
4362

4463
### Negative
4564

46-
- Slightly larger payloads (~37% overhead vs MessagePack)
47-
- Slightly slower serialization (negligible with chunking)
65+
- Larger payloads than theoretical binary formats
66+
- For very large directories (50k+), IPC becomes the bottleneck
67+
68+
## Notes
69+
70+
### Alternative approaches to speed up large directory transfers
71+
72+
If IPC becomes a bottleneck for very large directories (100k+), consider:
4873

49-
### Notes
74+
1. **Chunked IPC** - Split into multiple 10k-item requests, process progressively
75+
2. **WebSocket sidecar** - Separate WebSocket server for raw binary transfer
76+
3. **Tauri Events with raw payloads** - Events can carry binary data differently than invoke
77+
4. **Virtual scrolling + lazy loading** - Only fetch visible items, load more on scroll
78+
5. **Reduce payload size** - Send fewer fields initially (name, type only), lazy-load metadata
5079

51-
If profiling reveals IPC as a bottleneck with 50k+ files after virtual scrolling and chunking are implemented, we can
52-
revisit this decision and switch to MessagePack using Tauri 2.0's `Response::new(bytes)` API.
80+
Current recommendation: Focus on virtual scrolling and chunked loading rather than binary formats.

docs/features/file-loading.md

Lines changed: 151 additions & 0 deletions
@@ -0,0 +1,151 @@
1+
# File loading
2+
3+
How directory listings are loaded, from user action to rendered list.
4+
5+
## Overview
6+
7+
When a user navigates to a directory, the app:
8+
9+
1. Reads the directory contents from disk (Rust)
10+
2. Transfers the data to the frontend (Tauri IPC with JSON)
11+
3. Renders the file list progressively (Svelte)
12+
13+
For large directories (50k+ files), this uses **cursor-based pagination** to show the first chunk immediately while
14+
loading the rest in the background.
15+
16+
## Architecture diagram
17+
18+
```mermaid
19+
sequenceDiagram
20+
participant User
21+
participant FilePane
22+
participant TauriIPC
23+
participant RustBackend
24+
participant FileSystem
25+
26+
User->>FilePane: Navigate to directory
27+
FilePane->>TauriIPC: listDirectoryStartSession(path, 5000)
28+
TauriIPC->>RustBackend: list_directory_start_session
29+
RustBackend->>FileSystem: Read directory
30+
FileSystem-->>RustBackend: All entries
31+
RustBackend->>RustBackend: Sort, cache in session
32+
RustBackend-->>TauriIPC: First 5000 entries + sessionId
33+
TauriIPC-->>FilePane: JSON response
34+
FilePane->>User: Render first chunk immediately
35+
36+
loop While hasMore
37+
FilePane->>TauriIPC: listDirectoryNextChunk(sessionId, 5000)
38+
TauriIPC->>RustBackend: list_directory_next_chunk
39+
RustBackend-->>TauriIPC: Next chunk from cache
40+
TauriIPC-->>FilePane: JSON response
41+
FilePane->>User: Append entries
42+
end
43+
44+
FilePane->>TauriIPC: listDirectoryEndSession(sessionId)
45+
```
46+
47+
## Data flow layers
48+
49+
### 1. Frontend: FilePane.svelte
50+
51+
The
52+
[FilePane](file:///Users/veszelovszki/Library/CloudStorage/Dropbox/projects-git/vdavid/rusty-commander/src/lib/file-explorer/FilePane.svelte)
53+
component orchestrates directory loading.
54+
55+
**Key function:** `loadDirectory(path, selectName?)`
56+
57+
1. Shows loading state
58+
2. Calls `listDirectoryStartSession()` to get first chunk
59+
3. Renders first chunk immediately
60+
4. Calls `listDirectoryNextChunk()` in a loop for remaining data
61+
5. Uses `requestAnimationFrame()` between chunks to keep UI responsive
62+
6. Calls `listDirectoryEndSession()` to clean up
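The steps above (minus the loading state) can be sketched roughly as follows. This is not the code from `FilePane.svelte`: the result field names (`sessionId`, `entries`, `hasMore`) are assumptions inferred from the sequence diagram, the `SessionApi` interface and `onChunk` callback are hypothetical, and `yieldToUi` stands in for the `requestAnimationFrame()` wait so the sketch also runs outside a browser.

```typescript
// Hypothetical shapes: the real SessionStartResult / ChunkNextResult may differ.
interface FileEntry { name: string }
interface SessionStartResult { sessionId: string; entries: FileEntry[]; hasMore: boolean }
interface ChunkNextResult { entries: FileEntry[]; hasMore: boolean }

interface SessionApi {
  listDirectoryStartSession(path: string, chunkSize: number): Promise<SessionStartResult>;
  listDirectoryNextChunk(sessionId: string, chunkSize: number): Promise<ChunkNextResult>;
  listDirectoryEndSession(sessionId: string): Promise<void>;
}

// In a browser, yieldToUi could be:
//   () => new Promise<void>((resolve) => requestAnimationFrame(() => resolve()))
async function loadDirectory(
  api: SessionApi,
  path: string,
  chunkSize: number,
  onChunk: (entries: FileEntry[]) => void,
  yieldToUi: () => Promise<void>
): Promise<number> {
  const start = await api.listDirectoryStartSession(path, chunkSize);
  let total = start.entries.length;
  onChunk(start.entries); // first chunk is handed to the UI right away
  try {
    let hasMore = start.hasMore;
    while (hasMore) {
      await yieldToUi(); // let the UI paint between chunks
      const next = await api.listDirectoryNextChunk(start.sessionId, chunkSize);
      total += next.entries.length;
      onChunk(next.entries);
      hasMore = next.hasMore;
    }
  } finally {
    // Release the backend cache even if a chunk request fails.
    await api.listDirectoryEndSession(start.sessionId);
  }
  return total;
}
```

Passing the API and the yield function in as parameters is a choice made for this sketch so it can be exercised with a fake backend; the component may well call the wrappers directly.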
63+
64+
**Reactivity optimization:** The file list is stored in a plain array (`allFilesRaw`) rather than Svelte's `$state` to
65+
avoid the overhead of making 50k objects reactive. A simple counter (`filesVersion`) triggers updates.
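A framework-free model of this pattern, as a sketch only: the large array is an ordinary field, and observers are notified through the small version counter. The `FileListStore` class and its listener mechanism are hypothetical; in the component, `filesVersion` would presumably be the reactive value that Svelte tracks instead.

```typescript
interface FileEntry { name: string }

// The big array is never wrapped or proxied; only the counter is observed.
class FileListStore {
  private allFilesRaw: FileEntry[] = [];
  private filesVersion = 0;
  private listeners: Array<(version: number) => void> = [];

  subscribe(listener: (version: number) => void): void {
    this.listeners.push(listener);
  }

  appendChunk(chunk: FileEntry[]): void {
    for (const entry of chunk) this.allFilesRaw.push(entry); // no per-item tracking
    this.filesVersion += 1; // one notification per chunk, regardless of its size
    for (const listener of this.listeners) listener(this.filesVersion);
  }

  get files(): readonly FileEntry[] {
    return this.allFilesRaw;
  }
}
```

The cost of an update is then one counter change per chunk, not one reactive wrapper per file entry.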
66+
67+
### 2. IPC layer: tauri-commands.ts
68+
69+
The
70+
[tauri-commands](file:///Users/veszelovszki/Library/CloudStorage/Dropbox/projects-git/vdavid/rusty-commander/src/lib/tauri-commands.ts)
71+
module provides typed wrappers for Rust commands.
72+
73+
**Session API functions:**
74+
75+
- `listDirectoryStartSession(path, chunkSize)` → `SessionStartResult`
76+
- `listDirectoryNextChunk(sessionId, chunkSize)` → `ChunkNextResult`
77+
- `listDirectoryEndSession(sessionId)` → void
78+
79+
For serialization format rationale, see
80+
[ADR 007: Use JSON for Tauri IPC](file:///Users/veszelovszki/Library/CloudStorage/Dropbox/projects-git/vdavid/rusty-commander/docs/adr/007-json-for-ipc.md).
81+
82+
### 3. Rust commands: commands/file_system.rs
83+
84+
The
85+
[file_system commands](file:///Users/veszelovszki/Library/CloudStorage/Dropbox/projects-git/vdavid/rusty-commander/src-tauri/src/commands/file_system.rs)
86+
expose Tauri commands that call the file system operations.
87+
88+
**Commands:**
89+
90+
- `list_directory_start_session` - Starts a session, reads directory, caches entries, returns first chunk
91+
- `list_directory_next_chunk` - Returns next chunk from cache
92+
- `list_directory_end_session` - Cleans up the session cache
93+
94+
### 4. File system operations: file_system/operations.rs
95+
96+
The
97+
[operations module](file:///Users/veszelovszki/Library/CloudStorage/Dropbox/projects-git/vdavid/rusty-commander/src-tauri/src/file_system/operations.rs)
98+
contains the core logic.
99+
100+
**Session cache:** A static `HashMap<String, CachedDirectory>` stores directory listings keyed by session ID. Sessions
101+
expire after 60 seconds to prevent memory leaks.
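The behavior described here can be modeled in a few lines. The sketch below is a TypeScript model, not the Rust implementation: the `SessionCache` class, the injected clock, the id scheme, and the choice to measure expiry from session creation are all assumptions made for illustration.

```typescript
interface CachedDirectory { entries: string[]; offset: number; createdAtMs: number }

const SESSION_TTL_MS = 60000; // sessions expire after 60 seconds

class SessionCache {
  private sessions = new Map<string, CachedDirectory>();
  private nextId = 1;
  private now: () => number;

  // The clock is injectable so expiry can be tested without waiting.
  constructor(now: () => number = () => Date.now()) {
    this.now = now;
  }

  // Caches the full listing and returns the first chunk.
  start(entries: string[], chunkSize: number) {
    this.evictExpired();
    const sessionId = "session-" + this.nextId++;
    this.sessions.set(sessionId, { entries, offset: chunkSize, createdAtMs: this.now() });
    return { sessionId, entries: entries.slice(0, chunkSize), hasMore: chunkSize < entries.length };
  }

  // Serves the next chunk from memory; the directory is not read again.
  next(sessionId: string, chunkSize: number) {
    this.evictExpired();
    const session = this.sessions.get(sessionId);
    if (!session) throw new Error("unknown or expired session: " + sessionId);
    const chunk = session.entries.slice(session.offset, session.offset + chunkSize);
    session.offset += chunkSize;
    return { entries: chunk, hasMore: session.offset < session.entries.length };
  }

  end(sessionId: string): void {
    this.sessions.delete(sessionId);
  }

  // Drops sessions older than the TTL so abandoned sessions do not leak memory.
  private evictExpired(): void {
    const cutoff = this.now() - SESSION_TTL_MS;
    this.sessions.forEach((session, id) => {
      if (session.createdAtMs < cutoff) this.sessions.delete(id);
    });
  }
}
```

Evicting on every call keeps the model simple; a background sweep would be an equally plausible design.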
102+
103+
**Key function:** `list_directory(path)`
104+
105+
1. Reads directory entries with `fs::read_dir()`
106+
2. Extracts metadata (size, permissions, timestamps)
107+
3. Resolves owner/group names (with caching)
108+
4. Generates icon IDs
109+
5. Sorts: directories first, then files, both alphabetically
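The ordering in step 5 can be expressed as a single comparator. This is a TypeScript illustration of the rule, not the Rust code; the `Entry` shape is hypothetical, and `localeCompare` is an assumption, since the text does not say whether the real sort is case-sensitive or locale-aware.

```typescript
interface Entry { name: string; isDirectory: boolean }

// Directories first, then files; each group ordered by name.
function compareEntries(a: Entry, b: Entry): number {
  if (a.isDirectory !== b.isDirectory) return a.isDirectory ? -1 : 1;
  return a.name.localeCompare(b.name);
}

const sorted = [
  { name: "b.txt", isDirectory: false },
  { name: "src", isDirectory: true },
  { name: "a.txt", isDirectory: false },
  { name: "docs", isDirectory: true },
]
  .sort(compareEntries)
  .map((e) => e.name);

console.log(sorted.join(", ")); // docs, src, a.txt, b.txt
```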
110+
111+
## Latency breakdown (50k files)
112+
113+
Based on benchmarks on a MacBook Pro M1:
114+
115+
| Step | Time | Notes |
116+
| ----------------------- | ----------- | --------------------------------- |
117+
| Rust `list_directory()` | ~300ms | Disk I/O + metadata extraction |
118+
| JSON serialization | ~18ms | 17 MB payload |
119+
| IPC transfer | ~1.4s | WebView JSON parsing |
120+
| Svelte reactivity | ~50ms | With optimized non-reactive array |
121+
| **First chunk visible** | **~350ms** | User sees files quickly |
122+
| **Full list loaded** | **~2-2.5s** | Competitive with Commander One |
123+
124+
## Configuration
125+
126+
**Chunk size:** 5000 entries (defined in `FilePane.svelte`)
127+
128+
This balances:
129+
130+
- Time to first content (smaller = faster)
131+
- Number of IPC calls (larger = fewer calls)
132+
- Memory overhead (larger = more cached)
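The second factor is simple arithmetic. As a rough illustration (the helper `ipcCallCount` is made up for this example, and it counts only the start and next-chunk calls, not the final end-session call):

```typescript
// Number of IPC round trips needed to transfer `totalEntries` in chunks.
function ipcCallCount(totalEntries: number, chunkSize: number): number {
  if (chunkSize <= 0) throw new Error("chunkSize must be positive");
  // Even an empty directory needs the initial start-session call.
  return Math.max(1, Math.ceil(totalEntries / chunkSize));
}

for (const size of [1000, 5000, 10000]) {
  console.log(`chunk size ${size}: ${ipcCallCount(50000, size)} calls for 50k entries`);
}
```

With the 5000-entry chunk size, a 50k-file directory takes 10 data calls, compared with 50 calls at 1000 entries per chunk and 5 calls at 10000.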
133+
134+
## Key design decisions
135+
136+
1. **JSON over MessagePack** - Native JSON is faster through Tauri's IPC. See
137+
[ADR 007](file:///Users/veszelovszki/Library/CloudStorage/Dropbox/projects-git/vdavid/rusty-commander/docs/adr/007-json-for-ipc.md).
138+
139+
2. **Session-based caching** - Directory is read once, chunks served from memory. Avoids O(n²) re-reading.
140+
141+
3. **Non-reactive file array** - Svelte's `$state` on 50k objects caused ~9.5s overhead. Using a plain array with manual
142+
reactivity trigger reduced this to ~50ms.
143+
144+
4. **Progressive rendering** - First chunk appears in ~350ms. Remaining chunks load without blocking the UI.
145+
146+
## Future improvements
147+
148+
- **Virtual scrolling** - Only render visible rows (phase 4)
149+
- **Lazy metadata loading** - Load only names first, fetch metadata on demand
150+
- **File system watcher** - Auto-refresh on external changes
151+
- **Cancellation** - Cancel in-progress directory reads when navigating away
