This document provides a technical overview of the project and highlights the most important packages and concepts.
access-log-exporter is a high-performance Go application that acts as a Prometheus exporter for access logs. It receives log messages via the syslog protocol, parses them according to configurable rules, and exposes the extracted metrics in Prometheus format.
- High-throughput processing: Concurrent worker architecture for processing thousands of log lines per second
- Flexible configuration: YAML-based configuration with support for multiple metric presets
- Multiple metric types: Support for Prometheus counters, gauges, and histograms
- Label processing: Advanced label extraction with regular expression replacements and user agent parsing
- Upstream support: Special handling for load balancer upstream servers
- Memory efficient: Uses sync.Pool for object reuse to minimize garbage collection pressure
- Thread-safe: Designed for concurrent access across multiple goroutines
The application follows a pipeline architecture with the following main components:
Syslog Server → Message Buffer → Worker Pool → Metric Processing → Prometheus Export
- Syslog Server (`internal/syslog`): Receives and parses syslog messages
- Collector (`internal/collector`): Manages worker pool and coordinates metric processing
- Metric Engine (`internal/metric`): Processes log lines and updates Prometheus metrics
- Configuration (`internal/config`): Handles YAML configuration and validation
- HTTP Server: Exposes `/metrics` endpoint for Prometheus scraping
On startup, the application performs the following steps:
- Configuration Loading: Reads YAML configuration file or environment variables
- Preset Selection: Chooses the active metric preset from configuration
- Syslog Server Setup: Creates UDP or Unix socket listener for syslog messages
- Metric Initialization: Creates Prometheus metric collectors based on configuration
- Worker Pool: Spawns concurrent workers (defaults to number of CPU cores)
- HTTP Server: Starts web server for `/metrics` and `/health` endpoints

    // Syslog server receives messages on UDP/Unix socket
    // Strips syslog headers and extracts the actual log message
    // Sends cleaned message to buffered channel

The syslog component:
- Listens on UDP or Unix domain sockets
- Parses syslog format messages (e.g., `<34>Oct 11 22:14:15 nginx: actual_log_message`)
- Extracts the actual log message after the third colon
- Uses a buffer pool to minimize memory allocations
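The "third colon" rule can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's parser: `extractMessage` is a hypothetical name, it operates on strings rather than pooled byte buffers, and trimming the leading space after the colon is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// extractMessage is a hypothetical helper: it returns the text after the
// third colon of a syslog line, with leading spaces trimmed.
// The boolean is false when the line contains fewer than three colons.
func extractMessage(raw string) (string, bool) {
	rest := raw
	for i := 0; i < 3; i++ {
		idx := strings.IndexByte(rest, ':')
		if idx < 0 {
			return "", false
		}
		rest = rest[idx+1:]
	}
	return strings.TrimLeft(rest, " "), true
}

func main() {
	msg, ok := extractMessage("<34>Oct 11 22:14:15 nginx: actual_log_message")
	fmt.Println(msg, ok) // prints "actual_log_message true"
}
```

The first two colons belong to the timestamp and the third follows the program name, so everything after it is the payload that gets forwarded to the workers.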

    // Multiple worker goroutines process messages concurrently
    func (c *Collector) lineHandlerWorker(ctx context.Context, logger *slog.Logger, messageCh <-chan string) {
        for msg := range messageCh {
            // Split message into fields (tab-separated)
            line := strings.Split(msg, "\t")
            // Process each configured metric
            for _, metric := range c.metrics {
                metric.Parse(line)
            }
        }
    }

Workers operate independently and process messages from a shared channel, providing high throughput through parallel processing.

    // Each metric processes the log line according to its configuration
    func (m *Metric) Parse(line []string) error {
        // 1. Validate line format and extract value
        // 2. Get labels map from sync.Pool (thread-safe reuse)
        // 3. Process each configured label
        // 4. Apply transformations (user agent parsing, regular expression replacements)
        // 5. Set metric value (counter, gauge, or histogram)
        // 6. Return labels map to pool
        return nil
    }

The configuration system supports:

    presets:
      nginx:
        metrics:
          - name: http_requests_total
            type: counter
            help: "Total HTTP requests"
            labels:
              - name: host
                lineIndex: 0
              - name: method
                lineIndex: 1

- Counter: Monotonically increasing values (request counts, error counts)
- Gauge: Values that can go up and down (response times, queue sizes)
- Histogram: Distribution of values (response time distributions)
- Math transformations: Apply multiplication/division to metric values
- Label replacements: Use regular expressions to transform label values
- User agent parsing: Extract browser family from user agent strings
- Upstream handling: Special support for load balancer upstream servers
- sync.Pool: Reuses `prometheus.Labels` maps across goroutines to reduce allocations
- Buffer pooling: Syslog server reuses byte buffers for reading messages
- Pre-sized allocations: Maps and slices are allocated with known capacity
- Worker pool: Parallel processing across multiple CPU cores
- Thread-safe design: All components designed for concurrent access
- Channel-based communication: Non-blocking message passing between components
- Bounds check elimination: Uses Go compiler hints to eliminate array bounds checks
- Efficient string operations: Uses `strings.IndexByte` for fast comma parsing
- Regular expression optimization: Only calls regular expression replacement when a match is found
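Several of the techniques listed above can be combined in one small sketch: a `sync.Pool` of pre-sized label maps, `strings.IndexByte` for comma parsing, and a regular expression replacement that only runs when the pattern matches. All names (`labelPool`, `userPathRe`, `firstUpstream`, `normalizePath`, `describe`) and the example pattern are hypothetical, and a plain `map[string]string` stands in for `prometheus.Labels` so the example has no external dependencies.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
	"sync"
)

// labelPool hands out reusable label maps, pre-sized for the two labels
// used below. prometheus.Labels has the same underlying type
// (map[string]string), so a plain map is used as a stand-in.
var labelPool = sync.Pool{
	New: func() any { return make(map[string]string, 2) },
}

// userPathRe is an illustrative pattern, not one taken from the project.
var userPathRe = regexp.MustCompile(`^/users/\d+$`)

// firstUpstream returns the first entry of a comma-separated upstream list.
func firstUpstream(s string) string {
	if i := strings.IndexByte(s, ','); i >= 0 {
		return s[:i]
	}
	return s
}

// normalizePath runs the replacement only when the pattern matches.
func normalizePath(p string) string {
	if !userPathRe.MatchString(p) {
		return p
	}
	return userPathRe.ReplaceAllString(p, "/users/:id")
}

// describe builds a label set in a pooled map and renders it as a string.
func describe(path, upstream string) string {
	labels := labelPool.Get().(map[string]string)
	defer func() {
		// Empty the map before returning it so no values leak between uses.
		for k := range labels {
			delete(labels, k)
		}
		labelPool.Put(labels)
	}()
	labels["path"] = normalizePath(path)
	labels["upstream"] = firstUpstream(upstream)
	return labels["path"] + " " + labels["upstream"]
}

func main() {
	fmt.Println(describe("/users/42", "10.0.0.1:8080, 10.0.0.2:8080"))
	// prints "/users/:id 10.0.0.1:8080"
}
```

Clearing the map before `Put` matters: pooled objects are handed to whichever goroutine asks next, so stale keys would otherwise show up as unexpected labels.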
Core metric processing engine:
- `Parse()`: Main entry point for processing log lines
- `setMetric()`: Handles value parsing and Prometheus metric updates
- `labelValueReplacements()`: Applies regular expression transformations to labels
- Thread-safe design using sync.Pool for label map reuse
Manages the worker pool and coordinates metric processing:
- `lineHandlerWorkers()`: Creates concurrent worker goroutines
- `lineHandlerWorker()`: Individual worker that processes messages
- Implements Prometheus collector interface
Handles syslog protocol reception:
- Supports UDP and Unix domain sockets
- Parses syslog format and extracts log messages
- Uses buffer pooling for high-performance message processing
Configuration management and validation:
- YAML/JSON configuration parsing
- Environment variable support
- Configuration validation and defaults
- Support for multiple metric presets
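The following sketch illustrates parsing, defaulting, and validation using the JSON form of the preset example and only the standard library. The struct layout is a guess modeled on the YAML snippet earlier in this document, the `workers` field is hypothetical, and `loadConfig` is not the project's function; treat it as an outline of the idea.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"runtime"
)

// These types mirror the YAML example; the real schema may differ.
type Label struct {
	Name      string `json:"name"`
	LineIndex int    `json:"lineIndex"`
}

type Metric struct {
	Name   string  `json:"name"`
	Type   string  `json:"type"`
	Help   string  `json:"help"`
	Labels []Label `json:"labels"`
}

type Preset struct {
	Metrics []Metric `json:"metrics"`
}

type Config struct {
	Workers int               `json:"workers"` // hypothetical field
	Presets map[string]Preset `json:"presets"`
}

// loadConfig decodes the configuration, applies defaults, and validates it.
func loadConfig(data []byte) (*Config, error) {
	var cfg Config
	if err := json.Unmarshal(data, &cfg); err != nil {
		return nil, fmt.Errorf("decode config: %w", err)
	}
	if cfg.Workers <= 0 {
		cfg.Workers = runtime.NumCPU() // default: one worker per CPU core
	}
	if len(cfg.Presets) == 0 {
		return nil, errors.New("config defines no presets")
	}
	for presetName, preset := range cfg.Presets {
		for _, m := range preset.Metrics {
			switch m.Type {
			case "counter", "gauge", "histogram":
			default:
				return nil, fmt.Errorf("preset %q: metric %q has unknown type %q",
					presetName, m.Name, m.Type)
			}
		}
	}
	return &cfg, nil
}

func main() {
	raw := []byte(`{"presets":{"nginx":{"metrics":[{"name":"http_requests_total",
		"type":"counter","help":"Total HTTP requests",
		"labels":[{"name":"host","lineIndex":0},{"name":"method","lineIndex":1}]}]}}}`)
	cfg, err := loadConfig(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(cfg.Presets["nginx"].Metrics[0].Name, len(cfg.Presets["nginx"].Metrics[0].Labels))
	// prints "http_requests_total 2"
}
```

Rejecting unknown metric types at load time keeps configuration mistakes out of the hot path, where they would otherwise surface as per-line parse errors.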
The exporter includes built-in metrics:
- `log_parse_errors_total`: Counter of parsing errors
- `log_last_received_timestamp_seconds`: Timestamp of last received message
- Standard Go runtime metrics (memory, GC, goroutines)
- Optional nginx stub_status metrics
The project includes comprehensive benchmarks:
- `BenchmarkMetricParseSimple`: Tests basic metric parsing performance
- `BenchmarkMetricParseUserAgent`: Tests user agent parsing overhead
- `BenchmarkMetricParseUpstream`: Tests upstream processing performance
Performance targets:
- Zero allocations in the hot path for simple metrics
- Sub-microsecond processing time per log line
- Scales linearly with number of CPU cores
- Code formatting: Run `make fmt` to format Go code
- Linting: Run `make lint` to check code quality
- Testing: Run `make test` to execute test suite
- Benchmarking: Use `go test -bench=.` to measure performance
The application is designed for high-throughput log processing environments where performance and memory efficiency are critical requirements.