This directory contains comprehensive performance benchmarks for the nexum_core module using the Criterion benchmarking framework.
To maintain the performance expected of a Rust-based engine, NexumDB is continuously benchmarked against SQLite using the Criterion suite.
| Operation | SQLite | NexumDB | Delta |
|---|---|---|---|
| Single INSERT | 15.18 ms | 7.48 ms | NexumDB ~2x Faster |
| Point SELECT (Cold) | 140.5 µs | 1.86 ms | SQLite Faster |
| Point SELECT (Cached) | 143.8 µs | 1.87 ms | SQLite Faster |
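The Delta column can be reproduced from the raw latencies. The sketch below is only illustrative arithmetic: the function name `compare` is hypothetical, and both latencies are assumed to be expressed in microseconds.

```rust
/// Returns the faster system and how many times faster it is.
/// Both latencies must use the same unit (microseconds here).
/// `compare` is a hypothetical helper, not part of nexum_core.
fn compare(sqlite_us: f64, nexum_us: f64) -> (&'static str, f64) {
    if nexum_us < sqlite_us {
        ("NexumDB", sqlite_us / nexum_us)
    } else {
        ("SQLite", nexum_us / sqlite_us)
    }
}
```

With the table's numbers, `compare(15_180.0, 7_480.0)` gives NexumDB at roughly 2.0x for single INSERTs, while `compare(140.5, 1_860.0)` and `compare(143.8, 1_870.0)` give SQLite at roughly 13.2x and 13.0x for point SELECTs.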
NexumDB’s storage engine (sled) is log-structured, whereas SQLite uses a traditional B-tree. sled's own documentation describes its goal as LSM-tree-like write performance combined with B+ tree-like read performance.
- Log-structured storage (NexumDB): Optimizes for writes by batching updates into sequential log segments, which is consistent with the roughly 2x speedup observed in our INSERT benchmarks.
- B-tree (SQLite): Optimized for reads. Every write requires finding a leaf node on disk, which involves more synchronous I/O.
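The write-batching idea described above can be illustrated with a toy in-memory model. This is a minimal sketch, not sled's implementation: `ToyLogStore`, its fields, and the flush policy are all hypothetical, and nothing here touches disk.

```rust
use std::collections::BTreeMap;

/// Toy log-structured store: writes land in an in-memory buffer and are
/// flushed as sorted, immutable segments once the buffer fills up.
/// Hypothetical model for illustration only.
struct ToyLogStore {
    buffer: BTreeMap<String, String>,
    segments: Vec<Vec<(String, String)>>, // oldest first, newest last
    buffer_limit: usize,
}

impl ToyLogStore {
    fn new(buffer_limit: usize) -> Self {
        Self { buffer: BTreeMap::new(), segments: Vec::new(), buffer_limit }
    }

    /// A write only touches the buffer; no existing segment is modified.
    fn put(&mut self, key: &str, value: &str) {
        self.buffer.insert(key.to_string(), value.to_string());
        if self.buffer.len() >= self.buffer_limit {
            self.flush();
        }
    }

    /// Turn the buffer into one sorted, immutable segment.
    fn flush(&mut self) {
        if self.buffer.is_empty() {
            return;
        }
        let segment: Vec<(String, String)> =
            std::mem::take(&mut self.buffer).into_iter().collect();
        self.segments.push(segment);
    }

    /// Reads check the buffer first, then segments from newest to oldest.
    fn get(&self, key: &str) -> Option<&str> {
        if let Some(v) = self.buffer.get(key) {
            return Some(v.as_str());
        }
        for seg in self.segments.iter().rev() {
            if let Ok(i) = seg.binary_search_by(|(k, _)| k.as_str().cmp(key)) {
                return Some(seg[i].1.as_str());
            }
        }
        None
    }
}
```

The trade-off is visible even in the toy: `put` is cheap because it never rewrites old data, while `get` may have to look through several segments.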
In small-scale point lookups (1,000 rows), SQLite is roughly 13x faster. NexumDB's current ~1.9 ms latency includes:
- SQL Parsing: Converting strings to `Statement` enums.
- PyO3 Bridge: The overhead of crossing the Rust-Python boundary for AI-native planning.
- Semantic Caching: The current benchmark dataset is too small to show the "skip-the-disk" benefits of semantic caching, which are expected to grow with query complexity and data volume.
- Core System: Rust-based storage engine using `sled`, with SQL parsing and intelligent execution.
- AI Engine: Python-based semantic caching, NL translation, and RL optimization via local models.
- Integration: PyO3 bindings for seamless Rust-Python interoperability.
- Projection-Correct SELECT: Column/alias projection with schema validation.
- Schema-Safe Writes: INSERT/UPDATE validation with best-effort coercion.
- Table Management: SHOW TABLES, DESCRIBE, DROP TABLE (IF EXISTS).
- Performance Suite: Integrated benchmark framework for regression testing.
Tests the performance of the underlying storage engine operations:
- Write Throughput: Sequential write operations with different data sizes
- Read Throughput: Sequential read operations with different data sizes
- Mixed Workload: Combined read/write operations with different ratios
- Prefix Scanning: Performance of prefix-based key scanning
- Persistence: Flush and durability operations
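To make the prefix-scanning case concrete, here is a minimal sketch that uses a standard-library `BTreeMap` as a stand-in for an ordered key-value store. The helper `scan_prefix` and the key layout (`user:1`, `order:1`) are assumptions for illustration, not the storage engine's actual API.

```rust
use std::collections::BTreeMap;

/// Collect all entries whose key starts with `prefix`.
/// Relies on the map being ordered: seek to the first key >= prefix,
/// then stop at the first key that no longer matches.
fn scan_prefix<'a>(
    map: &'a BTreeMap<String, String>,
    prefix: &str,
) -> Vec<(&'a str, &'a str)> {
    map.range(prefix.to_string()..)
        .take_while(|(k, _)| k.starts_with(prefix))
        .map(|(k, v)| (k.as_str(), v.as_str()))
        .collect()
}
```

Because matching keys are contiguous in sorted order, the cost depends on the number of matches rather than on the total number of keys, which is the property a prefix-scan benchmark is meant to exercise.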
Measures SQL parsing performance across different query types:
- CREATE TABLE: Simple and complex table definitions
- INSERT: Single and multi-row insert statements
- SELECT: Various SELECT queries with different complexity
- Mixed Workload: Typical application SQL statement patterns
- Error Handling: Performance of invalid SQL processing
- Large Queries: Performance with very large SQL statements
Tests the query execution engine performance:
- Simple SELECT: Basic table scanning operations
- Filtered SELECT: Queries with WHERE clauses
- INSERT Operations: Data insertion performance
- CREATE TABLE: Table creation overhead
- Mixed Workload: Realistic application usage patterns
- Large Datasets: Performance with substantial data volumes
Focuses on WHERE clause evaluation performance:
- Simple Comparisons: Basic equality, inequality operations
- Complex Expressions: AND, OR, nested conditions
- LIKE Patterns: Pattern matching with wildcards
- IN Lists: Performance with different list sizes
- BETWEEN Ranges: Range-based filtering
- Batch Evaluation: Filter performance across large datasets
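The filter categories above can be pictured with a small expression evaluator. This is a sketch under simplifying assumptions: the `Filter` enum, the `Row` type (integer columns only), and both functions are hypothetical and do not mirror nexum_core's real types.

```rust
use std::collections::HashMap;

/// Hypothetical WHERE-clause expression over integer columns.
enum Filter {
    Eq(&'static str, i64),
    Gt(&'static str, i64),
    Between(&'static str, i64, i64),
    In(&'static str, Vec<i64>),
    And(Box<Filter>, Box<Filter>),
    Or(Box<Filter>, Box<Filter>),
}

type Row = HashMap<&'static str, i64>;

/// Evaluate a filter against one row. A missing column never matches.
fn eval(filter: &Filter, row: &Row) -> bool {
    match filter {
        Filter::Eq(col, v) => row.get(col) == Some(v),
        Filter::Gt(col, v) => row.get(col).map_or(false, |x| x > v),
        Filter::Between(col, lo, hi) => {
            row.get(col).map_or(false, |x| lo <= x && x <= hi)
        }
        Filter::In(col, list) => row.get(col).map_or(false, |x| list.contains(x)),
        // `&&` and `||` short-circuit, so the right side is skipped
        // whenever the left side already decides the result.
        Filter::And(a, b) => eval(a, row) && eval(b, row),
        Filter::Or(a, b) => eval(a, row) || eval(b, row),
    }
}

/// Batch evaluation: count the rows that pass the filter.
fn count_matches(filter: &Filter, rows: &[Row]) -> usize {
    rows.iter().filter(|row| eval(filter, row)).count()
}
```

A benchmark for each category would build one `Filter` of the relevant shape and time `eval` or `count_matches` over rows of increasing volume.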
cd nexum_core
cargo bench

# Storage engine benchmarks
cargo bench --bench storage_bench
# SQL parser benchmarks
cargo bench --bench sql_bench
# Query executor benchmarks
cargo bench --bench executor_bench
# Filter evaluation benchmarks
cargo bench --bench filter_bench

# Run only write throughput tests
cargo bench --bench storage_bench -- "storage_write"
# Run only SELECT parsing tests
cargo bench --bench sql_bench -- "parse_select"

cargo bench
# Reports are generated in target/criterion/

Criterion generates detailed reports including:
- Performance Metrics: Mean, median, standard deviation
- Throughput Measurements: Operations per second
- Regression Detection: Performance changes over time
- HTML Reports: Interactive charts and graphs
- Statistical Analysis: Confidence intervals and outlier detection
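For readers unfamiliar with the reported metrics, the following sketch computes mean, median, and sample standard deviation from a list of timings. It is a plain textbook calculation with hypothetical names (`Summary`, `summarize`); Criterion's own analysis is more involved and is not reproduced here.

```rust
/// Basic summary statistics for a set of timing samples.
struct Summary {
    mean: f64,
    median: f64,
    std_dev: f64,
}

/// Returns `None` when there are fewer than two samples, because the
/// sample standard deviation is undefined in that case.
fn summarize(samples: &[f64]) -> Option<Summary> {
    if samples.len() < 2 {
        return None;
    }
    let n = samples.len() as f64;
    let mean = samples.iter().sum::<f64>() / n;
    // Sample variance uses n - 1 in the denominator.
    let variance = samples.iter().map(|x| (x - mean).powi(2)).sum::<f64>() / (n - 1.0);

    let mut sorted = samples.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b));
    let mid = sorted.len() / 2;
    let median = if sorted.len() % 2 == 0 {
        (sorted[mid - 1] + sorted[mid]) / 2.0
    } else {
        sorted[mid]
    };

    Some(Summary { mean, median, std_dev: variance.sqrt() })
}
```

A large gap between mean and median usually points to outliers, which is one reason the reports show both.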
Benchmarks run automatically on pull requests to:
- Compare performance against the main branch
- Detect performance regressions
- Generate performance reports
- Upload benchmark artifacts
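The comparison against the main branch boils down to a relative-change check. The sketch below shows one simple way to do it; `Verdict`, `classify`, and the noise threshold are hypothetical and do not describe the project's actual CI logic.

```rust
#[derive(Debug, PartialEq)]
enum Verdict {
    Improved,
    Unchanged,
    Regressed,
}

/// Compare a pull request's timing against the baseline from main.
/// `noise_threshold` is a fraction (0.05 means 5%); changes smaller
/// than this in either direction are treated as measurement noise.
fn classify(baseline_ns: f64, current_ns: f64, noise_threshold: f64) -> Verdict {
    let change = (current_ns - baseline_ns) / baseline_ns;
    if change > noise_threshold {
        Verdict::Regressed
    } else if change < -noise_threshold {
        Verdict::Improved
    } else {
        Verdict::Unchanged
    }
}
```

For example, with a 5% threshold a benchmark that moves from 100 ns to 110 ns is flagged as a regression, while a move to 97 ns is reported as unchanged.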
- Write Throughput: >10,000 ops/sec for small records
- Read Throughput: >50,000 ops/sec for cached data
- Mixed Workload: Maintain >5,000 ops/sec combined
- Simple Queries: <1ms parsing time
- Complex Queries: <10ms parsing time
- Large Queries: Linear scaling with query size
- Table Scans: >1,000 records/ms
- Filtered Queries: >500 records/ms after filtering
- Insert Operations: >1,000 records/sec
- Simple Filters: <1μs per row evaluation
- Complex Filters: <10μs per row evaluation
- Batch Processing: >100,000 rows/sec
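The ops/sec targets above follow directly from an operation count and an elapsed time. The sketch below shows the arithmetic with hypothetical helpers (`ops_per_sec`, `measure`, `meets_target`); it is a rough stopwatch for sanity checks, not a replacement for Criterion's measurements.

```rust
use std::time::{Duration, Instant};

/// Throughput in operations per second.
fn ops_per_sec(ops: u64, elapsed: Duration) -> f64 {
    ops as f64 / elapsed.as_secs_f64()
}

/// Run `op` exactly `ops` times and report the observed throughput.
/// No warm-up or outlier handling is done here.
fn measure<F: FnMut()>(ops: u64, mut op: F) -> f64 {
    let start = Instant::now();
    for _ in 0..ops {
        op();
    }
    ops_per_sec(ops, start.elapsed())
}

/// Targets in this document are strict lower bounds (">10,000 ops/sec").
fn meets_target(measured: f64, target: f64) -> bool {
    measured > target
}
```

For instance, 20,000 writes in 2 seconds is exactly 10,000 ops/sec, which does not satisfy a ">10,000 ops/sec" target, whereas 20,001 writes in the same time does.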
- Minimize disk I/O operations
- Optimize key encoding/decoding
- Implement efficient caching strategies
- Use batch operations where possible
- Cache parsed query plans
- Optimize AST construction
- Minimize string allocations
- Use efficient parsing algorithms
- Implement query plan optimization
- Use indexed access when available
- Optimize memory usage patterns
- Implement parallel processing
- Short-circuit boolean expressions
- Optimize regex compilation
- Use SIMD operations where possible
- Implement predicate pushdown
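As an illustration of the "cache parsed query plans" guideline, here is a minimal sketch of a plan cache keyed by SQL text. `Plan`, `PlanCache`, and the key normalization (trim whitespace and a trailing semicolon) are assumptions made for this example; the real parser and plan types in nexum_core may look quite different.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a parsed query plan.
#[derive(Clone, Debug, PartialEq)]
struct Plan {
    sql: String,
}

/// Caches plans so that repeated statements skip the parser.
struct PlanCache {
    plans: HashMap<String, Plan>,
    hits: u64,
    misses: u64,
}

impl PlanCache {
    fn new() -> Self {
        Self { plans: HashMap::new(), hits: 0, misses: 0 }
    }

    /// Return the cached plan for `sql`, or call `parse` and store the result.
    fn get_or_parse<F>(&mut self, sql: &str, parse: F) -> Plan
    where
        F: FnOnce(&str) -> Plan,
    {
        // Conservative normalization: anything more aggressive would
        // have to avoid rewriting string literals inside the statement.
        let key = sql.trim().trim_end_matches(';').to_string();
        if let Some(plan) = self.plans.get(&key) {
            self.hits += 1;
            return plan.clone();
        }
        self.misses += 1;
        let plan = parse(&key);
        self.plans.insert(key, plan.clone());
        plan
    }
}
```

Tracking `hits` and `misses` makes it possible to benchmark the parser with and without the cache and to confirm that the workload actually repeats statements.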
When adding new benchmarks:
- Follow Naming Conventions: Use descriptive function names
- Include Multiple Scenarios: Test different data sizes and patterns
- Set Appropriate Throughput: Use `Throughput::Elements` or `Throughput::Bytes`
- Use Realistic Data: Generate representative test datasets
- Document Performance Targets: Include expected performance ranges
- Test Edge Cases: Include boundary conditions and error cases
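use criterion::{black_box, BenchmarkId, Criterion, Throughput};

// `setup_test_data` and `perform_operation` are placeholders for your own code.
fn my_new_benchmark(c: &mut Criterion) {
    let mut group = c.benchmark_group("my_feature");
    // Run the same operation at several input sizes.
    for size in [100, 1000, 10000].iter() {
        // Report results as elements processed per second.
        group.throughput(Throughput::Elements(*size as u64));
        group.bench_with_input(
            BenchmarkId::new("operation", size),
            size,
            |b, &size| {
                // Setup runs outside the timed section; only the operation is measured.
                b.iter_batched(
                    || setup_test_data(size),
                    |data| black_box(perform_operation(data)),
                    criterion::BatchSize::SmallInput,
                );
            },
        );
    }
    group.finish();
}

- Benchmark Timeouts: Reduce dataset sizes or increase measurement time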
- Memory Usage: Use `BatchSize::SmallInput` for large datasets
- Inconsistent Results: Ensure stable system conditions
- Missing Dependencies: Check that all required crates are available
- Use `cargo flamegraph` for detailed profiling
- Monitor memory allocation patterns
- Check for unnecessary cloning or allocations
- Profile with different optimization levels
When contributing benchmarks:
- Ensure benchmarks are deterministic and reproducible
- Include documentation for new benchmark categories
- Update performance targets if needed
- Test benchmarks locally before submitting
- Consider the impact on CI runtime (keep benchmarks reasonably fast)
