Skip to content

Latest commit

 

History

History
134 lines (99 loc) · 4.11 KB

File metadata and controls

134 lines (99 loc) · 4.11 KB

CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Overview

This is a Rust workspace containing two crates:

  • clickhouse-arrow: High-performance ClickHouse client with native protocol and Arrow integration
  • clickhouse-arrow-derive: Procedural macros for the Row derive

The project focuses on efficient data operations with ClickHouse while maintaining compatibility with the Arrow ecosystem.

Common Development Commands

Building

# Build all workspace members
cargo build

# Build with all features
cargo build --all-features

# Build release with optimizations
cargo build --release

# Build with LTO for maximum performance
cargo build --profile release-lto

Testing

# Run all tests (requires test-utils feature for integration tests)
cargo test --features test-utils

# Run specific integration test suites
cargo test --test e2e_arrow --features test-utils
cargo test --test e2e_native --features "test-utils,derive"

# Run with output visible
cargo test --features test-utils -- --nocapture

Linting and Formatting

# Format code
cargo fmt

# Run clippy with all features
cargo clippy --all-features --all-targets

# Fix clippy warnings
cargo clippy --fix

Running Examples

All examples require the test-utils feature:

cargo run --example select --features test-utils
cargo run --example insert --features test-utils
cargo run --example pool --features test-utils

# Control number of runs
EXAMPLE_RUNS=50 cargo run --example insert --features test-utils

Benchmarking

cargo bench --features test-utils
cargo bench --bench insert --features test-utils
cargo bench --bench query --features test-utils

Architecture

Core Components

  1. Client Module (src/client/)

    • builder.rs: ClientBuilder for connection configuration
    • mod.rs: Main client implementation with query/insert methods
    • Connection pooling via bb8 (optional feature)
  2. Protocol Implementation (src/native/)

    • Native ClickHouse wire protocol
    • Packet encoding/decoding
    • Compression support (LZ4, ZSTD)
  3. Data Formats

    • Arrow Format (src/arrow/): Arrow RecordBatch integration
    • Native Format (src/formats/): Internal type system
  4. Type System (src/types/)

    • Comprehensive ClickHouse type support
    • Arrow type mapping and conversion
    • Special handling for LowCardinality, Nullable, Arrays

Key Design Patterns

  1. Format Abstraction: ArrowFormat and NativeFormat traits allow switching between Arrow and native representations
  2. Streaming: All queries return async streams for memory-efficient data processing
  3. Zero-Copy: Optimized for minimal allocations during data transfer
  4. Builder Pattern: Consistent API for client configuration

Important Implementation Details

  1. Arrow Round-Trip Considerations:

    • Utf8 types default to Binary for performance (configurable via strings_as_strings)
    • Nullable(Array) is converted to Array(Nullable) by default
    • Dictionary types map to LowCardinality
  2. Compression: Automatically negotiated with server, supports LZ4 and ZSTD

  3. Connection Pooling: Optional bb8-based pooling with the pool feature

  4. Feature Flags:

    • derive: Enable Row derive macro
    • serde: JSON serialization support
    • pool: Connection pooling
    • inner_pool: Spawns multiple "inner connections" for improved concurrency
    • cloud: ClickHouse Cloud support
    • test-utils: Required for tests/examples/benchmarks

Testing Strategy

  • Unit tests are inline with modules
  • Integration tests require a ClickHouse container (handled automatically by testcontainers)
  • E2E tests cover Arrow format, native format, compatibility, and row binary
  • Benchmarks measure insert/query performance and compression

Development Notes

  • The project uses Rust 2024 edition
  • Extensive clippy lints are configured in the workspace Cargo.toml
  • Custom disallowed methods are defined in clippy.toml
  • Test containers are automatically managed, set DISABLE_CLEANUP=true to keep containers running