All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
- 🚀 Parallel Batch Writes - Connection pool parallelization (+92% throughput boost!)
- 🎯 Dynamic Goroutine Scaling - Automatic scaling from 3 to 50 parallel writers based on batch size
- 🔧 User-Configurable Batching - Flexible BatchSize (500-10000) and SubBatchSize (50-500) settings
- 💾 Multiple Storage Backends - DragonflyDB, BadgerDB, and RocksDB support
- 📊 Comprehensive Benchmarks - Storage comparison, batch size optimization, and parallel write tests
- 📈 Optimal BatchSize - Increased from 500 to 5000 messages (+6% throughput)
- ⚡ Optimal FlushInterval - Confirmed 10ms as the sweet spot (throughput drops 63% at 5ms and 72% at 20ms)
- 🔄 Async Batch Writer - Auto-detects and uses parallel writes when available
- DragonflyDB: 355K msgs/sec with parallel writes
- BadgerDB: 207K msgs/sec (pure Go, persistent)
- RocksDB: 218K msgs/sec (high-performance persistent)
- Parallel Boost: +92% for pure batch writes (49K → 94K msgs/sec)
- Batch Optimization: +6% with optimal batch size (335K → 355K msgs/sec)
- Connection Pool: 1000 pre-warmed connections, so writes avoid connection setup overhead
- Parallel Sub-Batches: 25 goroutines per batch (5000 msgs ÷ 200 sub-batch size)
- Fire-and-Forget: Async writes with zero blocking
- Optimal Config: 32 shards × 5000 batch size × 10ms flush interval
- Phase 1: Object pooling (baseline established)
- Phase 2: Allocation elimination (zero-copy optimizations)
- Phase 3: Storage bypass test (identified bottleneck)
- Phase 4: Redis command reduction (-67% commands)
- Phase 5: Async writes (+27% throughput)
- Phase 6: Connection pool parallelization (+92% boost!)
- Phase 7: Batch size optimization (+6% throughput)
- Phase 8: FlushInterval validation (10ms optimal)
- None - All changes are backward compatible
- Users can disable parallel writes by setting `config.EnableParallelWrites = false`
1.0.0 - 2025-08-14
- 🚀 Ultra-high performance message queue system
- 🔄 Lock-free MPMC queue implementation
- ⚡ Event-driven worker architecture (0% CPU when idle)
- 📈 2M+ messages/second throughput capability
- 💯 100% message reliability (zero loss)
- 🏷️ Multi-priority queue support (high, normal, low)
- 📊 Real-time monitoring and statistics
- 🌐 RESTful API interface
- 🔌 WebSocket real-time communication
- 📦 Go client library with batch operations
- 🎯 Topic-based message routing
- ⚙️ Configurable worker pools and batch processing
- 🖥️ Web-based admin UI for monitoring
- 🔧 Production-ready Docker deployment
- 📚 Comprehensive documentation and examples
- 🧪 Advanced performance benchmarking tools
- Throughput: 2,070,000+ messages/second
- Latency: Sub-microsecond processing
- Memory: Ultra-efficient with object pooling
- Scalability: Linear scaling with CPU cores
- Reliability: 100% message delivery guarantee
- Lock-free MPMC queues with atomic operations
- Cache-line optimized data structures
- SIMD-optimized batch processing
- Zero-copy memory operations
- Event-driven worker notifications
- Memory pooling for zero GC pressure
- RESTful HTTP API
- WebSocket real-time interface
- Go client library
- Batch publishing support
- Health check endpoints
- Statistics and metrics API
- Real-time performance metrics
- Queue status and statistics
- Worker pool monitoring
- Message tracing and debugging
- Dynamic configuration
- Alerts and notifications
- Comprehensive README with examples
- Go client documentation
- API reference guide
- Performance optimization guide
- Production deployment guide
- Multi-language client examples