A high-performance columnar storage engine built on Apache Arrow, designed for vector databases and analytical workloads.

The architecture is layered. From top to bottom:

| Layer | Components |
|---|---|
| Application Layer | Python / Java / Rust / C++ callers. They reach the engine through the FFI Layer; a separate Filesystem FFI (C ABI) path leads directly to the Filesystem Layer |
| FFI Layer (extern "C") | Cross-language bindings via C ABI |
| Writer API | Column Group Policy (Single / Schema / Size Based) |
| Reader API | RecordBatchReader (Scan), ChunkReader (Random Access), Take (Row Indices) |
| Transaction Layer | Manifest Versioning, Conflict Resolution, Delta Logs |
| Column Group Writer | Buffer Management, Row Group Sizing, File Rolling |
| Column Group Reader | Chunk Management, Column Projection, Predicate Pushdown |
| Format Layer | Parquet, Vortex, Lance (Read Only) |
| Filesystem Layer | Local, AWS S3, GCS, Azure, Aliyun, Tencent, Huawei |

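
The Writer API above names a size-based column group policy. As a rough illustration of the idea only, the sketch below isolates wide columns (such as embeddings) into their own groups and keeps narrow columns together. The function name `group_columns_by_size`, the `wide_threshold` parameter, and the sample sizes are hypothetical and are not taken from the engine's API.

```python
from typing import Dict, List


def group_columns_by_size(avg_sizes: Dict[str, int], wide_threshold: int) -> List[List[str]]:
    """Hypothetical size-based policy: each column whose average value size
    reaches `wide_threshold` bytes gets its own group; all remaining narrow
    columns share a single group."""
    narrow: List[str] = []
    groups: List[List[str]] = []
    for name, size in avg_sizes.items():
        if size >= wide_threshold:
            groups.append([name])  # wide column is stored on its own
        else:
            narrow.append(name)
    if narrow:
        groups.insert(0, narrow)  # narrow columns are stored together
    return groups


# Illustrative average sizes in bytes (made-up numbers).
sizes = {"id": 8, "title": 40, "embedding": 3072, "price": 8}
column_groups = group_columns_by_size(sizes, wide_threshold=1024)
print(column_groups)
```

With this split, a scan that only needs `id` and `price` never has to read the bytes of the `embedding` column, which is the kind of I/O saving column groups aim for.
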
- Column Group Storage - Organize columns into groups for optimal I/O and compression
- Multi-Format Support - Parquet (primary), Vortex, and Lance (read-only) formats
- Transaction Support - ACID-like semantics with manifest versioning and conflict resolution
- Cloud Native - Built-in support for major cloud storage providers
- Multi-Language SDKs - Python, Java/Scala, Rust, and C++ bindings
- Encryption & Compression - Data-at-rest encryption and configurable compression
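
Manifest versioning with conflict resolution is commonly built on optimistic concurrency: a writer records the manifest version it read and its commit is accepted only if that version is still the latest. The sketch below shows that general pattern with an in-memory stand-in. `ManifestStore`, `CommitConflict`, and the file names are hypothetical; this is not the engine's transaction API, and the real conflict-resolution rules may differ.

```python
from dataclasses import dataclass, field
from typing import List


class CommitConflict(Exception):
    """Raised when the manifest changed after the writer read it."""


@dataclass
class ManifestStore:
    # In-memory stand-in for a versioned manifest (hypothetical).
    version: int = 0
    files: List[str] = field(default_factory=list)

    def commit(self, read_version: int, new_files: List[str]) -> int:
        # Accept the commit only if no other commit landed after `read_version`.
        if read_version != self.version:
            raise CommitConflict(f"read v{read_version}, latest is v{self.version}")
        self.files = self.files + list(new_files)
        self.version += 1
        return self.version


store = ManifestStore()
base = store.version                     # writers A and B both read version 0
v1 = store.commit(base, ["a.parquet"])   # writer A commits first

try:
    store.commit(base, ["b.parquet"])    # writer B still holds the stale version
    conflicted = False
except CommitConflict:
    conflicted = True

# Writer B resolves the conflict by re-reading the latest version and retrying.
v2 = store.commit(store.version, ["b.parquet"])
print(v1, conflicted, v2, store.files)
```

Retrying on the latest version is safe here because both writers only append files; commits that rewrite or delete existing data would need a stricter check.
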
| Provider | Status |
|---|---|
| AWS S3 | Supported (including S3-compatible: MinIO, Cloudflare R2) |
| Google Cloud Storage | Supported |
| Azure Blob Storage | Supported |
| Aliyun OSS | Supported |
| Tencent COS | Supported |
| Huawei Cloud OBS | Supported |
| Language | Status | Notes |
|---|---|---|
| C++ | Primary | Core implementation |
| Python | Supported | FFI bindings with PyArrow integration |
| Java/Scala | Supported | JNI bindings |
| Rust | Supported | DataFusion TableProvider integration |
- CMake >= 3.20.0
- C++17 compiler (GCC 8+, Clang 6+)
- Conan >= 2.0
git clone https://github.com/milvus-io/milvus-storage.git
cd milvus-storage/cpp
# Initialize Conan default profile (required for Conan 2.x)
conan profile detect --force
# Set up the Conan remote (Artifactory)
conan remote add default-conan-local2 https://milvus01.jfrog.io/artifactory/api/conan/default-conan-local2
# Build
make build
# Test
make test
# Test with MinIO
make test-all
# Run benchmarks
./build/Release/benchmark/benchmark --benchmark_filter="Typical/"

# Build C++ library for Python FFI
cd cpp && make python-lib && cd ..
# Install Python FFI package
cd python && pip install -e ".[dev]" && cd ..
# Install test dependencies
pip install -r tests/requirements.txt
# Run integration tests
cd tests && pytest integration/ -v
# Run stress tests (quick validation)
cd tests && pytest stress/ --stress-scale=0.01 -v

See docs/integration-test-design.md for test design details.

| Option | Description |
|---|---|
| BUILD_TYPE=Debug/Release | Build type |
| WITH_AZURE_FS=ON | Azure filesystem support |
| WITH_JNI=ON | Java JNI bindings |
| WITH_PYTHON_BINDING=ON | Python bindings |

See python/tests/test_write_read.py for Python usage examples.
For comprehensive integration tests, see tests/integration/.
For benchmarks, see cpp/benchmark/benchmark.md and docs/multi-format-benchmark-design.md.
make fix-format
make fix-tidy

Contributions are welcome. Please ensure code follows the project style and tests pass.
Apache License 2.0. See LICENSE for details.