Commit 3a86485

Docs
1 parent 9f1b888 commit 3a86485

3 files changed

Lines changed: 136 additions & 10 deletions

CLAUDE.md

Lines changed: 49 additions & 0 deletions
@@ -33,6 +33,55 @@ cargo run --features=flightsql -- serve-flightsql
 cargo run -- generate-tpch
 ```
 
+### Benchmarking
+
+Benchmarks measure query performance with detailed timing breakdowns:
+
+```bash
+# Serial benchmark (default, 10 iterations)
+cargo run -- -c "SELECT 1" --bench
+
+# Custom iteration count
+cargo run -- -c "SELECT 1" --bench -n 100
+
+# Concurrent benchmark (measures throughput under load)
+cargo run -- -c "SELECT 1" --bench --concurrent
+
+# With custom iterations and concurrency
+cargo run -- -c "SELECT 1" --bench -n 100 --concurrent
+
+# Save results to CSV
+cargo run -- -c "SELECT 1" --bench --save results.csv
+
+# Append to existing results
+cargo run -- -c "SELECT 2" --bench --concurrent --save results.csv --append
+
+# Warm up cache before benchmarking
+cargo run -- -c "SELECT * FROM t" --bench --run-before "CREATE TABLE t AS VALUES (1)"
+```
+
+**Benchmark Modes:**
+- **Serial** (default): Measures query performance in isolation
+  - Shows pure query execution time without contention
+  - Ideal for understanding baseline performance
+
+- **Concurrent** (`--concurrent`): Measures performance under load
+  - Runs iterations in parallel (concurrency = min(iterations, CPU cores))
+  - Shows throughput (queries/second) with multiple clients
+  - Reveals resource contention and bottlenecks
+  - Higher mean/median times are expected due to concurrent load
+
+**Output:**
+- Timing breakdown: logical planning, physical planning, execution, total
+- Statistics: min, max, mean, median for each phase
+- CSV format includes `concurrency_mode` column (serial or concurrent(N))
+
+**FlightSQL Benchmarks:**
+```bash
+# Benchmark FlightSQL server (requires --flightsql flag and server running)
+cargo run -- -c "SELECT 1" --bench --flightsql --concurrent
+```
+
 ### Testing
 
 Tests are organized by feature and component:
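
The concurrency rule stated in the Benchmark Modes notes, `min(iterations, CPU cores)`, can be sketched in a few lines of Rust. This is an illustration only and not code taken from `dft`: the names `concurrency_level` and `detected_cores` are hypothetical, and reading the core count from `std::thread::available_parallelism` (falling back to 1) is an assumption about how cores might be detected.

```rust
use std::thread;

/// Hypothetical helper: cap the number of in-flight queries at
/// min(iterations, CPU cores), the rule documented for `--concurrent`.
fn concurrency_level(iterations: usize, cpu_cores: usize) -> usize {
    iterations.min(cpu_cores)
}

/// Assumed way to obtain the core count. Falls back to 1 when the
/// platform cannot report its available parallelism.
fn detected_cores() -> usize {
    thread::available_parallelism()
        .map(|n| n.get())
        .unwrap_or(1)
}
```

Under this sketch, `-n 100` on an 8-core machine would run 8 queries at a time, while `-n 4` on the same machine would run only 4, so a small iteration count never spawns idle workers.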

README.md

Lines changed: 39 additions & 0 deletions
@@ -68,6 +68,12 @@ dft -f query.sql
 # Benchmark a query (with stats)
 dft -c "SELECT * FROM my_table" --bench
 
+# Concurrent benchmark (measures throughput under load)
+dft -c "SELECT * FROM my_table" --bench --concurrent
+
+# Save benchmark results to CSV
+dft -c "SELECT * FROM my_table" --bench --save results.csv
+
 # Start FlightSQL Server (requires `flightsql` feature)
 dft serve-flightsql
 
@@ -78,6 +84,39 @@ dft serve-http
 dft generate-tpch
 ```
 
+### Benchmarking
+
+`dft` includes built-in benchmarking to measure query performance with detailed timing breakdowns:
+
+```sh
+# Serial benchmark (default) - measures query performance in isolation
+dft -c "SELECT * FROM my_table" --bench
+
+# Concurrent benchmark - measures throughput under load
+dft -c "SELECT * FROM my_table" --bench --concurrent
+
+# Custom iteration count
+dft -c "SELECT * FROM my_table" --bench -n 100
+
+# Save results to CSV for analysis
+dft -c "SELECT * FROM my_table" --bench --save results.csv
+
+# Compare serial vs concurrent performance
+dft -c "SELECT * FROM my_table" --bench --save results.csv
+dft -c "SELECT * FROM my_table" --bench --concurrent --save results.csv --append
+```
+
+**Benchmark Output:**
+- Timing breakdown by phase: logical planning, physical planning, execution
+- Statistics: min, max, mean, median for each phase
+- Row counts validation across all runs
+- CSV export with `concurrency_mode` column for result comparison
+
+**Serial vs Concurrent:**
+- **Serial**: Pure query execution time without contention (baseline performance)
+- **Concurrent**: Throughput measurement with parallel execution (reveals bottlenecks and contention)
+- Concurrent mode uses adaptive concurrency: `min(iterations, CPU cores)`
+
 ### Setting Up Tables with DDL
 
 `dft` can automatically load table definitions at startup, giving you a persistent "database-like" experience.
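
The per-phase statistics (min, max, mean, median) and the row-count validation listed under Benchmark Output could be computed roughly as follows. This is a sketch under stated assumptions, not the `dft` implementation: `PhaseStats`, `summarize`, and `row_counts_consistent` are hypothetical names, and averaging the two middle samples for an even-sized set is one common median convention that `dft` may or may not share.

```rust
use std::time::Duration;

/// Hypothetical summary of one benchmark phase
/// (for example logical planning or execution).
#[derive(Debug, PartialEq)]
struct PhaseStats {
    min: Duration,
    max: Duration,
    mean: Duration,
    median: Duration,
}

/// Compute min, max, mean and median over per-iteration timings.
/// Returns `None` when there are no samples.
fn summarize(samples: &[Duration]) -> Option<PhaseStats> {
    if samples.is_empty() {
        return None;
    }
    let mut sorted = samples.to_vec();
    sorted.sort();
    let n = sorted.len();
    let total: Duration = sorted.iter().sum();
    // Assumed convention: for an even number of samples,
    // average the two middle values.
    let median = if n % 2 == 1 {
        sorted[n / 2]
    } else {
        (sorted[n / 2 - 1] + sorted[n / 2]) / 2
    };
    Some(PhaseStats {
        min: sorted[0],
        max: sorted[n - 1],
        mean: total / n as u32,
        median,
    })
}

/// Row-count validation: true when every run returned the same
/// number of rows (an empty or single-run list is trivially consistent).
fn row_counts_consistent(row_counts: &[usize]) -> bool {
    row_counts.windows(2).all(|pair| pair[0] == pair[1])
}
```

Reporting the median alongside the mean is useful here because a single slow iteration (a cold cache, for instance) pulls the mean up but barely moves the median.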

docs/cli.md

Lines changed: 48 additions & 10 deletions
@@ -64,31 +64,69 @@ basic_auth.password = "Pass"
 
 ## Benchmark Queries
 
-You can benchmark queries by adding the `--bench` parameter. This will run the query a configurable number of times and output a breakdown of the queries execution time with summary statistics for each component of the query (logical planning, physical planning, execution time, and total time).
+You can benchmark queries by adding the `--bench` parameter. This will run the query a configurable number of times and output a breakdown of the query's execution time with summary statistics for each component (logical planning, physical planning, execution time, and total time).
 
-Optionally you can use the `--run-before` param to run a query before the benchmark is run. This is useful in cases where you want to hit a temp table or write a file to disk that your benchmark query will use.
+### Benchmark Modes
 
-To save benchmark results to a file use the `--save` parameter with a file path. Further, you can use the `--append` parameter to append to the file instead of overwriting it.
+**Serial Benchmark (default):**
+Measures query performance in isolation, running iterations one after another. This shows the pure query execution time without any contention or resource sharing overhead.
 
-The number of benchmark iterations is defined in your configuration (default is 10) and can be configured per benchmark run with `-n` parameter.
+**Concurrent Benchmark (`--concurrent`):**
+Measures query performance under load by running iterations in parallel. This reveals:
+- Throughput (queries per second) with multiple concurrent clients
+- Resource contention and bottlenecks
+- Performance degradation under concurrent load
 
+Concurrent mode uses adaptive concurrency: `min(iterations, CPU cores)` to avoid overwhelming the system.
+
+### Options
+
+- **`--bench`**: Enable benchmarking mode
+- **`--concurrent`**: Run iterations in parallel (for concurrent benchmarking)
+- **`-n <count>`**: Number of iterations (default: 10, configured in config file)
+- **`--run-before <query>`**: Run a setup query before benchmarking (useful for cache warming)
+- **`--save <file>`**: Save results to CSV file
+- **`--append`**: Append to existing results file instead of overwriting
+
+### Examples
 
 ```sh
+# Serial benchmark (default)
 dft -c "SELECT * FROM my_table" --bench
 
-# Run a configurable number of benchmark iterations
-dft -c "SELECT ..." --bench -n 5
+# Concurrent benchmark
+dft -c "SELECT * FROM my_table" --bench --concurrent
+
+# Custom iteration count
+dft -c "SELECT ..." --bench -n 100
+
+# Concurrent with custom iterations
+dft -c "SELECT ..." --bench -n 100 --concurrent
 
-# Save benchmark results to a file
+# Save benchmark results to CSV
 dft -c "SELECT ..." --bench --save results.csv
 
-# Append benchmark results to existing file
-dft -c "SELECT ..." --bench --save results.csv --append
+# Append results (compare serial vs concurrent)
+dft -c "SELECT ..." --bench --save results.csv
+dft -c "SELECT ..." --bench --concurrent --save results.csv --append
 
-# Run a setup query prior to running benchmark. This can be useful to quickly iterate on various paramters
+# Run a setup query before benchmarking
 dft -c "SELECT ..." --bench --run-before="CREATE TEMP TABLE my_temp AS SELECT ..."
+
+# FlightSQL benchmark (concurrent)
+dft -c "SELECT ..." --bench --concurrent --flightsql
 ```
 
+### Output
+
+Benchmark output includes:
+- **Mode**: `serial` or `concurrent(N)` where N is the concurrency level
+- **Timing breakdown**: Logical planning, physical planning, execution (min/max/mean/median)
+- **Row counts**: Validation that all runs returned the same number of rows
+- **CSV format**: Results include a `concurrency_mode` column for comparison
+
+**Note**: Concurrent benchmarks typically show higher mean/median times due to resource contention - this is expected and reveals how the system performs under load.
+
 ## Analyze Queries
 
 The output from `EXPLAIN ANALYZE` provides a wealth of information on a queries execution - however, the amount of information and connecting the dots can be difficult and manual. Further, there is detail in the `MetricSet`'s of the underlying `ExecutionPlan`'s that is lost in the output.
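
The mode label in the Output section (`serial` or `concurrent(N)`) and the throughput figure mentioned for concurrent runs can be sketched in Rust as below. Everything here is illustrative: `ConcurrencyMode` and `throughput_qps` are hypothetical names, and defining throughput as completed queries divided by wall-clock time is the usual definition, assumed here rather than confirmed from the `dft` source.

```rust
use std::fmt;
use std::time::Duration;

/// Hypothetical representation of the benchmark mode.
#[derive(Debug, Clone, Copy, PartialEq)]
enum ConcurrencyMode {
    Serial,
    /// Carries the concurrency level N.
    Concurrent(usize),
}

impl fmt::Display for ConcurrencyMode {
    /// Renders the value as documented for the `concurrency_mode`
    /// CSV column: `serial` or `concurrent(N)`.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ConcurrencyMode::Serial => write!(f, "serial"),
            ConcurrencyMode::Concurrent(n) => write!(f, "concurrent({})", n),
        }
    }
}

/// Assumed throughput definition: completed queries per second of
/// wall-clock time. A zero duration yields infinity (or NaN when no
/// queries completed) rather than panicking.
fn throughput_qps(completed: usize, wall_clock: Duration) -> f64 {
    completed as f64 / wall_clock.as_secs_f64()
}
```

This also illustrates the note about higher mean times: if 100 queries finish in 4 seconds of wall-clock time with 8 in flight, throughput is 25 queries per second even though each individual query may take longer than it would when run alone.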
