Feat/performance benchmarks#134
Conversation
WalkthroughAdds a comparative benchmarking suite (Criterion) and visualization for NexumDB vs SQLite, updates benchmark docs and Cargo.toml, introduces new benchmark files and a Python visualizer, makes the storage engine module public, and adds two executor tests verifying DROP behavior and physical deletion. Changes
Sequence Diagram(s)sequenceDiagram
participant Bench as Benchmark (Criterion)
participant SQLite as SQLite DB
participant Nexum as Nexum StorageEngine
participant Exec as Nexum Executor
Bench->>SQLite: setup_sqlite() create table & insert rows
Bench->>Nexum: setup_nexum() init StorageEngine + Executor
Bench->>SQLite: run prepared point lookup loop
Bench->>Exec: parse SQL -> Exec.execute(select_stmt) (cold)
Bench->>Exec: warm semantic cache
Bench->>Exec: Exec.execute(select_stmt) (cached loop)
Exec->>Nexum: storage read operations
SQLite->>Bench: return rows / timings
Nexum->>Bench: return rows / timings
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related issues
Possibly related PRs
Suggested labels
Suggested reviewers
🚥 Pre-merge checks | ✅ 3 | ❌ 2❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 19
🤖 Fix all issues with AI agents
In `@nexum_core/benches/db_comparison.rs`:
- Around line 77-88: The benchmarks label a "Cold" vs "Cached" run but
setup_nexum() constructs an Executor without enabling the semantic cache, so the
"Cached" path (the extra warmup execute on nexum_executor with select_stmt) has
no effect; either enable the cache by calling with_cache(...) on the Executor
returned by setup_nexum() (so that nexum_executor.with_cache(...) /
setup_nexum() config includes the caching layer) before running the
"NexumDB_Point_Lookup_Cached" benchmark, or remove/rename the cached benchmark
and its warmup execute to avoid the misleading distinction between
NexumDB_Point_Lookup_Cold and NexumDB_Point_Lookup_Cached.
- Around line 39-46: The NexumDB benchmark reuses a parsed INSERT with a fixed
id, making it incomparable to SQLite; update the bench closure in
group.bench_function("NexumDB_Single_Insert", using setup_nexum and
executor.execute, to generate a unique id each iteration (e.g., incrementing
counter or UUID) and build/parse the INSERT inside b.iter so each iteration
inserts a distinct id; ensure you stop cloning a single Parser::parse(statement)
and instead parse/prepare the per-iteration SQL (or at minimum substitute the
generated id into a pre-parsed template) before calling executor.execute to
match SQLite's behavior.
- Around line 70-74: The benchmark currently calls sqlite_conn.prepare(...)
inside group.bench_function's closure, causing prepare() to be measured; move
the prepare call out of the iteration so the prepared statement is created once
(e.g., before invoking b.iter) and then inside the iteration only call
stmt.query_row(...) (the prepared statement variable `stmt` returned by
sqlite_conn.prepare and the use of `query_row`) so the benchmark measures only
query execution latency.
- Around line 16-27: The TempDir created in setup_nexum is being dropped at
function exit causing the storage directory to be deleted while
StorageEngine/Executor remain live; change the signature of setup_nexum to
return both the Executor and the TempDir (e.g., -> (Executor, TempDir)), keep
the db_path variable in scope by returning it alongside the Executor, and update
any callers to accept the tuple so TempDir stays alive for the duration of the
benchmark; reference the setup_nexum function, the db_path TempDir,
StorageEngine::new and Executor::new when making this change.
- Around line 9-14: The setup_sqlite function creates a NamedTempFile that is
dropped at return, causing the returned Connection to point to a deleted file
(problematic on Windows); modify setup_sqlite to keep the temp file alive by
returning the NamedTempFile alongside the rusqlite::Connection (or alternatively
create a tempdir and build the DB path inside it), update call sites to
accept/destructure the tuple (e.g., let (conn, _db_file) = setup_sqlite()) so
the NamedTempFile remains in scope while Connection is used.
- Around line 60-68: The NexumDB insert loop uses a constant INSERT with id=1,
so the SELECT WHERE id = 500 matches nothing; change the insert logic to insert
unique, incrementing ids that mirror SQLite (e.g., use the loop index) so ids
span the same range as row_count. Concretely, replace the constant
insert_sql/insert_stmt usage with an INSERT that injects the current loop index
(or prepare a parameterized statement and bind i) inside the for 0..row_count
loop where nexum_executor.execute(...) is called, keeping the
Parser::parse/execute flow (or use a prepared statement) so nexum_executor,
insert_stmt, Parser::parse, and row_count align with SQLite's 0..row_count id
sequence.
In `@nexum_core/benches/README.md`:
- Around line 29-32: The README's claim that PyO3 bridge and semantic caching
contribute to the ~1.8ms latency is incorrect because the benchmark executor is
created without .with_cache(), so those layers are never invoked; update the
README text (the paragraph discussing "PyO3 Bridge" and "Semantic Caching") to
remove or reword claims about PyO3/semantic-caching overhead and instead state
that the current benchmark measures SQL parsing + sled scan only, and add a note
that accurate measurement of PyO3/semantic-caching requires creating the
executor with .with_cache() and re-running the benchmarks.
- Around line 10-14: The benchmark SELECT numbers are invalid because
db_comparison.rs inserts NexumDB rows all with id=1 then queries WHERE id = 500
(zero matches) and labels Cold/Cached without enabling a cache; fix
db_comparison.rs by generating unique IDs for inserted rows (e.g., use the loop
index or a proper primary key) and make the SELECT query target an existing id
(or benchmark point selects by key present in the dataset), remove or correctly
implement the caching scenario (enable the cache before the Cached run or drop
the Cached/Cold distinction), then re-run the benchmarks and update the README
table in nexum_core/benches/README.md with the corrected timings.
- Around line 6-48: The Markdown has lint issues: ensure every heading (e.g.,
"## Performance & Benchmarks (Feb 2026)", "### Comparative Performance Results",
"### Analysis", "### Architectural Insights", "#### 1. Write Throughput: The
LSM-Tree Advantage", "#### 2. Read Latency & AI Overhead", and "##
Architecture") is surrounded by a blank line above and below, add a blank line
before and after the table starting with "| Operation | SQLite | NexumDB | Delta
|" and after the table block, and remove the trailing space on the line in the
"Read Latency & AI Overhead" paragraph; reflow nearby lines so code/inline
elements (like `criterion`, `sled`, `Statement`, `PyO3`) remain properly fenced
and the image link ("") is also separated by
blank lines.
- Line 17: The image reference in the README uses an incorrect nested path;
update the markdown image tag on the line showing "" to point to the local
file name (e.g., "") so the link
resolves correctly from the README in nexum_core/benches; change only the path
in that image tag.
- Around line 5-6: Remove the orphaned heading "## Benchmark Categories" that is
immediately followed by "## Performance & Benchmarks (Feb 2026)" in the README;
either delete the "## Benchmark Categories" line or merge its intent into "##
Performance & Benchmarks (Feb 2026)" and ensure a blank line before the
remaining heading to satisfy markdownlint MD022. Locate the headings by their
exact text ("## Benchmark Categories" and "## Performance & Benchmarks (Feb
2026)"), remove or replace the former, and add a single blank line above the
retained heading.
In `@nexum_core/benches/storage_bench.rs`:
- Around line 1-10: Remove the unnecessary blank lines in the imports and before
the function declaration in this file: collapse the extra blank line after "use
std::time::Duration;", remove the blank line(s) between the criterion use block
and "use nexum_core::StorageEngine;", and eliminate the blank line(s)
immediately before the "fn storage_write_throughput(c: &mut Criterion) {"
declaration so imports and the function are grouped consistently.
In `@nexum_core/benches/visualize.py`:
- Line 23: The line containing ax.set_yscale('log') in visualize.py has a
trailing whitespace; remove the extra space after the closing parenthesis so the
call is exactly ax.set_yscale('log') to clean up the trailing whitespace.
- Around line 4-7: The benchmark visualization currently uses hard-coded arrays
(labels, sqlite_times, nexum_times) in nexum_core/benches/visualize.py which
will become stale; update visualize.py to load benchmark data dynamically by
either parsing Criterion's JSON output (e.g.,
target/criterion/**/new/estimates.json) to populate labels and times or by
accepting those values as CLI arguments, and replace the static lists with the
parsed/CLI-provided values so functions that generate the chart use the dynamic
data sources instead of the hard-coded variables.
- Around line 1-27: The script runs plotting code at import time; wrap the body
into a typed main function and add an entrypoint guard: create def main() ->
None that declares typed variables (e.g., labels: list[str], sqlite_times:
list[float], nexum_times: list[float], x: np.ndarray, width: float, fig:
plt.Figure, ax: plt.Axes) and moves the plotting and
plt.savefig('bench_results.png') into it, then add if __name__ == "__main__":
main() to prevent side effects on import and satisfy PEP 8/type-hint
requirements.
In `@nexum_core/Cargo.toml`:
- Line 26: Update the rusqlite dependency in Cargo.toml from "0.31.0" to the
latest "0.38.0" (replace the rusqlite = "0.31.0" line with rusqlite = "0.38.0"),
then run cargo update -p rusqlite and cargo build/test to surface any breaking
changes; if the build fails, inspect and update usages of rusqlite APIs (e.g.,
Connection, params!, named_params!, Transaction methods) to match the 0.38.0
API.
In `@nexum_core/src/executor/mod.rs`:
- Line 830: Replace the non-descriptive inline comment "// second test" with a
meaningful description (or remove it) near the related test in
nexum_core::executor::mod.rs; for example, replace it with a concise phrase like
"// Verify that DROP TABLE physically removes row data from storage" so the test
intent is clear, or delete the comment if it adds no value.
- Line 815: Update the comment on the test in mod.rs to follow idiomatic Rust
comment style by adding a leading space and capitalizing the first word; change
the existing `//added a test that checks if dropping a non-existent table
without IF EXISTS fails correctly.` to `// Added a test that checks if dropping
a non-existent table without IF EXISTS fails correctly.` — locate and edit the
inline comment near the test (around the test that verifies dropping a
non-existent table) in nexum_core::executor::mod.rs.
In `@nexum_core/src/storage/mod.rs`:
- Around line 1-5: The engine module is currently made public but its only
exported item is already re-exported as StorageEngine; change the module
declaration from `pub mod engine;` to `mod engine;` to keep the internal module
private while retaining the public API via `pub use engine::StorageEngine;`, and
ensure no other code refers to `storage::engine::` directly (update any such
references to use `StorageEngine` through the re-export).
| [dev-dependencies] | ||
| tempfile = "3.24" | ||
| criterion = { version = "0.5", features = ["html_reports"] } | ||
| rusqlite = "0.31.0" # To interface with SQLite for comparison |
There was a problem hiding this comment.
🧩 Analysis chain
🌐 Web query:
rusqlite crate latest version crates.io
💡 Result:
The latest rusqlite version on crates.io is 0.38.0. [1][2]
Cargo.toml:
rusqlite = "0.38.0"Sources: [1][2]
🌐 Web query:
rusqlite 0.31.0 security advisory vulnerability
💡 Result:
No specific security advisory appears to target rusqlite 0.31.0. The two main RustSec advisories for rusqlite list affected versions older than 0.31.0, meaning 0.31.0 is not in their vulnerable ranges:
- RUSTSEC-2020-0014 (various memory safety issues): affects rusqlite < 0.23.0, patched in >= 0.23.0. [1]
- RUSTSEC-2021-0128 (incorrect lifetime bounds / use-after-free across several APIs): affects >= 0.25.0 and < 0.25.4, and >= 0.26.0 and < 0.26.2, patched in 0.25.4 / 0.26.2 (and later). [2]
For context, rusqlite 0.31.0 is a later release line than those fixed versions. [3]
Sources: [1] [2] [3]
🏁 Script executed:
# Check if there are any known rusqlite vulnerabilities in the Rust advisory database
# by searching for related crate versions
rg -i "rusqlite" --type toml . 2>/dev/null | head -20Repository: aviralgarg05/NexumDB
Length of output: 206
Update rusqlite to the latest version.
rusqlite 0.31.0 is free of known security advisories, but version 0.38.0 is the current latest release. Consider updating to benefit from bug fixes and improvements across 7 minor versions.
🤖 Prompt for AI Agents
In `@nexum_core/Cargo.toml` at line 26, Update the rusqlite dependency in
Cargo.toml from "0.31.0" to the latest "0.38.0" (replace the rusqlite = "0.31.0"
line with rusqlite = "0.38.0"), then run cargo update -p rusqlite and cargo
build/test to surface any breaking changes; if the build fails, inspect and
update usages of rusqlite APIs (e.g., Connection, params!, named_params!,
Transaction methods) to match the 0.38.0 API.
| use super::*; | ||
| use crate::sql::types::{Column, DataType, SelectItem}; | ||
|
|
||
| //added a test that checks if dropping a non-existent table without IF EXISTS fails correctly. |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Comment style nit: missing space after //.
//added a test should be // Added a test for idiomatic Rust comment style.
- //added a test that checks if dropping a non-existent table without IF EXISTS fails correctly.
+ // Test that dropping a non-existent table without IF EXISTS fails correctly.📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| //added a test that checks if dropping a non-existent table without IF EXISTS fails correctly. | |
| // Test that dropping a non-existent table without IF EXISTS fails correctly. |
🤖 Prompt for AI Agents
In `@nexum_core/src/executor/mod.rs` at line 815, Update the comment on the test
in mod.rs to follow idiomatic Rust comment style by adding a leading space and
capitalizing the first word; change the existing `//added a test that checks if
dropping a non-existent table without IF EXISTS fails correctly.` to `// Added a
test that checks if dropping a non-existent table without IF EXISTS fails
correctly.` — locate and edit the inline comment near the test (around the test
that verifies dropping a non-existent table) in nexum_core::executor::mod.rs.
| assert!(result.is_err(), "Dropping a non-existent table should return an error"); | ||
| } | ||
|
|
||
| // second test |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Non-descriptive comment // second test.
This comment doesn't add value. Either remove it or replace it with something meaningful (e.g., // Verify that DROP TABLE physically removes row data from storage).
- // second test
+ // Verify that DROP TABLE physically removes row data from storage.📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| // second test | |
| // Verify that DROP TABLE physically removes row data from storage. |
🤖 Prompt for AI Agents
In `@nexum_core/src/executor/mod.rs` at line 830, Replace the non-descriptive
inline comment "// second test" with a meaningful description (or remove it)
near the related test in nexum_core::executor::mod.rs; for example, replace it
with a concise phrase like "// Verify that DROP TABLE physically removes row
data from storage" so the test intent is clear, or delete the comment if it adds
no value.
| pub mod engine; | ||
| mod error; | ||
|
|
||
| pub use engine::StorageEngine; | ||
| // Re-exporting StorageEngine as the primary interface | ||
| pub use engine::StorageEngine; |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Check if anything accesses nexum_core::storage::engine:: directly
rg -n 'storage::engine::' --type rustRepository: aviralgarg05/NexumDB
Length of output: 46
🏁 Script executed:
#!/bin/bash
# Also check what public items are in engine.rs
fd engine.rs --type f --exec head -50 {} \;Repository: aviralgarg05/NexumDB
Length of output: 1380
🏁 Script executed:
#!/bin/bash
# Check the current mod.rs file to see the full context
cat nexum_core/src/storage/mod.rsRepository: aviralgarg05/NexumDB
Length of output: 288
pub mod engine unnecessarily exposes the internal module when the re-export already provides full access.
The codebase contains no direct accesses to storage::engine::*. All usage goes through the re-export pub use engine::StorageEngine;, which is sufficient. Since engine.rs contains only the StorageEngine struct, making the module public widens the API surface without providing additional functionality.
Consider reverting to mod engine; to maintain cleaner encapsulation and a more explicit public API surface.
♻️ Keep module private, rely on re-export
-pub mod engine;
+mod engine;📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| pub mod engine; | |
| mod error; | |
| pub use engine::StorageEngine; | |
| // Re-exporting StorageEngine as the primary interface | |
| pub use engine::StorageEngine; | |
| mod engine; | |
| mod error; | |
| // Re-exporting StorageEngine as the primary interface | |
| pub use engine::StorageEngine; |
🤖 Prompt for AI Agents
In `@nexum_core/src/storage/mod.rs` around lines 1 - 5, The engine module is
currently made public but its only exported item is already re-exported as
StorageEngine; change the module declaration from `pub mod engine;` to `mod
engine;` to keep the internal module private while retaining the public API via
`pub use engine::StorageEngine;`, and ensure no other code refers to
`storage::engine::` directly (update any such references to use `StorageEngine`
through the re-export).
There was a problem hiding this comment.
Actionable comments posted: 4
🤖 Fix all issues with AI agents
In `@nexum_core/benches/db_comparison.rs`:
- Around line 56-64: The SQLite benchmark discards the query_row result (let _)
which can be optimized away; wrap the retrieved value in criterion::black_box
inside the group.bench_function closure to prevent dead-code elimination (i.e.,
pass the result of stmt.query_row(... ) into black_box). Ensure
criterion::black_box is in scope (import if necessary) and update the closure
around stmt.query_row in the "SQLite_Point_Lookup" benchmark to use black_box
instead of ignoring the value.
- Around line 88-89: Add full CRUD benchmarks and register them: implement new
benchmark functions named bench_inserts, bench_updates, and bench_deletes
(matching the existing bench_selects pattern) that perform the same operations
against NexumDB and SQLite (use the INSERT statements already in the file for
setup as a guide), measure throughput with Criterion, and ensure each test
cleans up or isolates data between runs; then update the criterion_group!
invocation to criterion_group!(benches, bench_selects, bench_inserts,
bench_updates, bench_deletes) so all four benchmarks are run and comparable to
the README results.
In `@nexum_core/benches/visualize.py`:
- Around line 1-6: Remove the unused sys import from the top-level imports:
delete the reference to sys in the import block alongside json, argparse, Path,
typing, and matplotlib so the file no longer imports an unused module (ensure
only the remaining imports json, argparse, Path, Dict/List/Optional typing, and
matplotlib.pyplot as plt remain).
- Around line 17-30: The find_benchmark_data function currently uses
base_path.rglob("estimates.json") which can return both base/estimates.json and
new/estimates.json and leads to non-deterministic overwrites; update the loop to
only accept the latest run by filtering json_path to ensure it is the
"new/estimates.json" file (e.g., check json_path.name == "estimates.json" and
json_path.parent.name == "new" or match the suffix "/new/estimates.json") before
calling parse_criterion_json and assigning results[target] = val so only
latest-run values are collected.
| // 1. SQLite Baseline | ||
| group.bench_function("SQLite_Point_Lookup", |b| { | ||
| // PREPARE OUTSIDE: Compiles lookup once | ||
| let mut stmt = sqlite_conn.prepare("SELECT val FROM bench WHERE id = 500").unwrap(); | ||
| b.iter(|| { | ||
| // Explicitly typed |r: &rusqlite::Row| to fix E0282 | ||
| let _ = stmt.query_row([], |r: &rusqlite::Row| r.get::<_, String>(0)).unwrap(); | ||
| }); | ||
| }); |
There was a problem hiding this comment.
Wrap the SQLite query result with black_box to prevent dead-code elimination.
The NexumDB benchmarks correctly use black_box(nexum_executor.execute(...)), but the SQLite benchmark discards the result with let _, which the compiler may optimize away entirely. This creates an uneven comparison.
🐛 Proposed fix
b.iter(|| {
- let _ = stmt.query_row([], |r: &rusqlite::Row| r.get::<_, String>(0)).unwrap();
+ black_box(stmt.query_row([], |r: &rusqlite::Row| r.get::<_, String>(0)).unwrap());
});🤖 Prompt for AI Agents
In `@nexum_core/benches/db_comparison.rs` around lines 56 - 64, The SQLite
benchmark discards the query_row result (let _) which can be optimized away;
wrap the retrieved value in criterion::black_box inside the group.bench_function
closure to prevent dead-code elimination (i.e., pass the result of
stmt.query_row(... ) into black_box). Ensure criterion::black_box is in scope
(import if necessary) and update the closure around stmt.query_row in the
"SQLite_Point_Lookup" benchmark to use black_box instead of ignoring the value.
| criterion_group!(benches, bench_selects); | ||
| criterion_main!(benches); No newline at end of file |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
# First, let's see the full content of the db_comparison.rs file to understand what benchmarks are defined
cat -n nexum_core/benches/db_comparison.rsRepository: aviralgarg05/NexumDB
Length of output: 4260
🏁 Script executed:
# Search for INSERT benchmarks in the codebase
rg -rn 'bench_insert\|Insert_Performance\|insert.*bench' --type rustRepository: aviralgarg05/NexumDB
Length of output: 46
🏁 Script executed:
# Search more broadly for all benchmark files
find nexum_core/benches -type f -name "*.rs"Repository: aviralgarg05/NexumDB
Length of output: 239
🏁 Script executed:
# Search for any bench_insert, bench_update, bench_delete functions anywhere
rg 'fn bench_insert|fn bench_update|fn bench_delete|fn bench_inserts|fn bench_updates|fn bench_deletes' --type rustRepository: aviralgarg05/NexumDB
Length of output: 46
🏁 Script executed:
# Check if there are any other benchmark-related files or functions
git ls-files nexum_core/benches/Repository: aviralgarg05/NexumDB
Length of output: 341
🏁 Script executed:
# Search in all benchmark files for INSERT, UPDATE, DELETE related benchmarks
rg -n 'INSERT|UPDATE|DELETE|insert|update|delete' nexum_core/benches/ --type rust -C 2Repository: aviralgarg05/NexumDB
Length of output: 8381
🏁 Script executed:
# Check the PR description or issue context - look for references to INSERT, UPDATE, DELETE
cat nexum_core/benches/README.mdRepository: aviralgarg05/NexumDB
Length of output: 8867
🏁 Script executed:
# Check all benchmark files to see what functions they define
for file in nexum_core/benches/*.rs; do
echo "=== $file ==="
grep -n 'fn bench_\|criterion_group!\|criterion_main!' "$file"
doneRepository: aviralgarg05/NexumDB
Length of output: 602
🏁 Script executed:
# Check if there's any reference to PR `#90` or issue context in git commits or files
rg -i 'issue.*90|PR.*90|INSERT.*benchmark|bench.*insert.*UPDATE.*DELETE' --type rust --type mdRepository: aviralgarg05/NexumDB
Length of output: 1265
🏁 Script executed:
# Look at executor_bench.rs more closely to see criterion_group registration
sed -n '289,298p' nexum_core/benches/executor_bench.rsRepository: aviralgarg05/NexumDB
Length of output: 345
db_comparison.rs lacks comprehensive benchmark coverage. Only SELECT benchmarks are registered; INSERT, UPDATE, and DELETE benchmarks are missing.
The criterion_group! at line 88 registers only bench_selects, yet the README documents INSERT benchmark results comparing SQLite and NexumDB. The file contains INSERT statements used for test data setup (lines 40-50) but no bench_inserts function to measure INSERT performance. Similarly, UPDATE and DELETE benchmarks are absent. Other files (executor_bench.rs, sql_bench.rs) contain INSERT-related benchmarks, but these test SQL parsing and executor behavior, not SQLite comparison as shown in the README.
To align with the documented benchmark results and complete the comparison, add bench_inserts, bench_updates, and bench_deletes functions to this file, register them in the criterion_group, and implement the actual performance comparisons against SQLite.
🤖 Prompt for AI Agents
In `@nexum_core/benches/db_comparison.rs` around lines 88 - 89, Add full CRUD
benchmarks and register them: implement new benchmark functions named
bench_inserts, bench_updates, and bench_deletes (matching the existing
bench_selects pattern) that perform the same operations against NexumDB and
SQLite (use the INSERT statements already in the file for setup as a guide),
measure throughput with Criterion, and ensure each test cleans up or isolates
data between runs; then update the criterion_group! invocation to
criterion_group!(benches, bench_selects, bench_inserts, bench_updates,
bench_deletes) so all four benchmarks are run and comparable to the README
results.
| import json | ||
| import argparse | ||
| import sys | ||
| from pathlib import Path | ||
| from typing import Dict, List, Optional | ||
| import matplotlib.pyplot as plt |
There was a problem hiding this comment.
Remove unused sys import.
sys is imported on line 3 but never used anywhere in the file.
♻️ Proposed fix
import json
import argparse
-import sys
from pathlib import Path
from typing import Dict, List, Optional
import matplotlib.pyplot as plt📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| import json | |
| import argparse | |
| import sys | |
| from pathlib import Path | |
| from typing import Dict, List, Optional | |
| import matplotlib.pyplot as plt | |
| import json | |
| import argparse | |
| from pathlib import Path | |
| from typing import Dict, List, Optional | |
| import matplotlib.pyplot as plt |
🤖 Prompt for AI Agents
In `@nexum_core/benches/visualize.py` around lines 1 - 6, Remove the unused sys
import from the top-level imports: delete the reference to sys in the import
block alongside json, argparse, Path, typing, and matplotlib so the file no
longer imports an unused module (ensure only the remaining imports json,
argparse, Path, Dict/List/Optional typing, and matplotlib.pyplot as plt remain).
| def find_benchmark_data(base_path: Path, targets: List[str]) -> Dict[str, float]: | ||
| """Recursively finds benchmark data matching target keywords.""" | ||
| results = {} | ||
| if not base_path.exists(): | ||
| return results | ||
|
|
||
| for json_path in base_path.rglob("estimates.json"): | ||
| path_str = str(json_path).lower() | ||
| for target in targets: | ||
| if target.lower() in path_str: | ||
| val = parse_criterion_json(json_path) | ||
| if val > 0: | ||
| results[target] = val | ||
| return results |
There was a problem hiding this comment.
rglob may match stale results from Criterion's base/ directory instead of the latest new/ run.
Criterion stores results in both new/estimates.json (latest run) and base/estimates.json (previous run). Since rglob("estimates.json") returns all matches and results[target] = val overwrites duplicates, the final value depends on filesystem traversal order — which is non-deterministic. You could end up plotting the previous run's data.
Filter to only new/estimates.json paths:
♻️ Proposed fix
for json_path in base_path.rglob("estimates.json"):
+ if "base" in json_path.parts:
+ continue # Skip previous-run baselines; use only "new" results
path_str = str(json_path).lower()🤖 Prompt for AI Agents
In `@nexum_core/benches/visualize.py` around lines 17 - 30, The
find_benchmark_data function currently uses base_path.rglob("estimates.json")
which can return both base/estimates.json and new/estimates.json and leads to
non-deterministic overwrites; update the loop to only accept the latest run by
filtering json_path to ensure it is the "new/estimates.json" file (e.g., check
json_path.name == "estimates.json" and json_path.parent.name == "new" or match
the suffix "/new/estimates.json") before calling parse_criterion_json and
assigning results[target] = val so only latest-run values are collected.
🚀 OSCG'26 Contributor Notice
NexumDB participates in the Open Source Contributor Games 2026! High-quality PRs earn you points, recognition, and networking opportunities. Please follow our contribution guidelines for maximum impact and ensure your submission meets OSCG quality standards.
Summary
Introduced a comprehensive performance benchmarking suite using the
Criterionframework. This PR establishes a baseline for comparing NexumDB's LSM-tree-based storage engine against SQLite and measures the current overhead of the AI-native execution layer.Closes #90
What Changed?
nexum_core/benches/db_comparison.rsfeaturingINSERTandSELECT(Cold vs. Cached) benchmarks.nexum_core/src/storage/mod.rsto exportStorageEnginefor external benchmarking.README.mdand a visualization script in thebenchesdirectory.Cargo.lockandexecutor/mod.rsmerge conflicts to align with the latest upstream changes.Why?
To provide data-driven insights into NexumDB's performance characteristics. This allows the community to track regressions and quantify the benefits of the Semantic Cache as the project scales.
Type of Change
Testing
Benchmarking was conducted in a Windows 11 environment using PowerShell with the required
PYO3forward compatibility flags.Local Testing Checklist
cargo fmt --all -- --check- Code formattingcargo clippy --workspace --all-targets -- -D warnings- Lintingcargo test --workspace -- --test-threads=1- All tests passcargo bench --bench db_comparison- Suite executed successfullyTesting Details
Manual verification was performed by running the benchmark suite multiple times to ensure statistical significance. Outliers were analyzed, and the
LSM-treewrite-performance advantage was verified against SQLite.Screenshots/Examples
1. Benchmark Execution Results
Performance Impact
Analysis: Identified that NexumDB's write throughput is ~2x faster than SQLite for single-row inserts (7.48ms vs 15.18ms) due to the Sled LSM-tree engine. Established a baseline of ~1.87ms for the AI-native execution layer.
Security Considerations
Documentation
demo.sh(if applicable)Pre-Merge Checklist
main🎯 OSCG'26 Quality Standards
By submitting this PR, I confirm:
Additional Notes
Sync with upstream/main was performed to resolve
Cargo.lockandmod.rsconflicts. A visualization analysis of the results is documented in thenexum_core/benches/README.md.Summary by CodeRabbit
New Features
Improvements
Documentation
Tests