Lightweight eBPF observability for AI workloads
A lightweight, eBPF-based observability platform designed to identify cost and performance bottlenecks in AI workloads by selectively collecting essential data such as LLM token usage and system metrics.
Traditional observability tools often introduce significant operational overhead due to excessive resource consumption, required application code changes, and complex configuration processes.
To address these limitations, our platform is built around a Rust-based eBPF agent that collects only essential data at the kernel level without any code modifications.
The agent can be deployed via Helm Charts in Kubernetes environments or as a standalone binary in traditional AI data centers, enabling cost reduction and performance optimization across heterogeneous infrastructures.
- Selective observability: collects only decision-driving data directly from the kernel
- Minimal overhead by design
- Infrastructure-agnostic: works on Kubernetes and traditional AI data centers
- Built for AI efficiency: enabling cheaper, faster, and more efficient AI workloads
| Item | Minimum Requirement |
|---|---|
| Kubernetes | v1.23+ |
| Helm | v3.0+ |
| Node CPU | 200m (request) / 1000m (limit) |
| Node Memory | 512Mi (request) / 1Gi (limit) |
| Kernel | Linux 5.8+ (eBPF support required) |
| Capabilities | CAP_BPF, CAP_NET_ADMIN, CAP_PERFMON |
Required tools:
- `kubectl`: cluster access and verification
- `helm`: chart installation
- `git`: source cloning
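The requirements above can be checked before installing. The following is a minimal preflight sketch, not something shipped in the repository: the function names `kernel_at_least` and `missing_tools` are hypothetical, and the check only compares the major and minor version reported by `uname -r` against the 5.8 minimum. It does not verify capabilities, Kubernetes, or Helm versions.

```shell
# Hypothetical preflight helpers; not part of honeybeepf-llm itself.

# Succeeds when the kernel release in $2 (e.g. "5.15.0-91-generic")
# is at least the MAJOR.MINOR version in $1 (e.g. "5.8").
kernel_at_least() {
  local required="$1" actual="$2"
  local req_major="${required%%.*}"
  local req_rest="${required#*.}"
  local req_minor="${req_rest%%.*}"
  local act_major="${actual%%.*}"
  local act_rest="${actual#*.}"
  local act_minor="${act_rest%%[!0-9]*}"   # keep the leading digits only

  if [ "$act_major" -gt "$req_major" ]; then
    return 0
  fi
  if [ "$act_major" -eq "$req_major" ] && [ "$act_minor" -ge "$req_minor" ]; then
    return 0
  fi
  return 1
}

# Prints the name of each listed tool that is not found on PATH.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

if kernel_at_least 5.8 "$(uname -r)"; then
  echo "kernel: OK ($(uname -r))"
else
  echo "kernel: $(uname -r) is older than 5.8"
fi

missing="$(missing_tools kubectl helm git)"
if [ -n "$missing" ]; then
  echo "missing tools:" $missing
fi
```

Run it on the node that will host the agent, since the kernel requirement applies to the node rather than to the machine running `kubectl`.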
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo add open-telemetry https://open-telemetry.github.io/opentelemetry-helm-charts
helm repo update

git clone https://github.com/honeybee-studio/honeybeepf-llm.git
cd honeybeepf-llm
# Create namespace
kubectl create namespace <your-namespace>
# 1. Install Prometheus
helm dependency build ./charts/honeybeepf-llm-prometheus
helm install honeybeepf-llm-prometheus ./charts/honeybeepf-llm-prometheus -n <your-namespace>
# 2. Install OTel Collector
helm dependency build ./charts/honeybeepf-llm-otel-collector
helm install honeybeepf-llm-otel-collector ./charts/honeybeepf-llm-otel-collector -n <your-namespace>
# 3. Install HoneybeePF agent (edit demo-template.yaml before running)
helm install honeybeepf-llm ./charts/honeybeepf-llm -n <your-namespace> \
-f ./charts/honeybeepf-llm/values.yaml \
-f ./charts/honeybeepf-llm/demo-template.yaml

Note: Before installing, edit `charts/honeybeepf-llm/demo-template.yaml` and replace the `<REPLACE_ME: ...>` placeholders with your actual environment values.
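Because the install step depends on every placeholder being replaced, a quick check before running `helm install` can catch a missed one. This is a small sketch under the assumption that unfilled values still contain the literal text `<REPLACE_ME`; the helper name `unresolved_placeholders` is hypothetical.

```shell
# Hypothetical helper: prints (with line numbers) every line of a values
# file that still contains a <REPLACE_ME placeholder.
# Returns 0 if none remain, 1 if any were found, 2 if the file is unreadable.
unresolved_placeholders() {
  local file="$1"
  if [ ! -r "$file" ]; then
    echo "cannot read $file" >&2
    return 2
  fi
  if grep -n '<REPLACE_ME' "$file"; then
    return 1
  fi
  return 0
}

# Usage from the repository root:
#   unresolved_placeholders ./charts/honeybeepf-llm/demo-template.yaml \
#     && echo "demo-template.yaml has no placeholders left"
```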
kubectl get pods -n <your-namespace>

If all Pods show `Running` status, the installation succeeded:
NAME READY STATUS RESTARTS AGE
honeybeepf-llm-XXXXX 1/1 Running 0 1m
honeybeepf-llm-otel-collector-XXXXX 1/1 Running 0 2m
honeybeepf-llm-prometheus-server-XXXXX 2/2 Running 0 3m
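The same check can be scripted rather than read by eye. The sketch below assumes the default `kubectl get pods` table layout shown above (name, ready count, status in the first three columns); the helper name `pods_not_ready` is hypothetical.

```shell
# Hypothetical helper: reads `kubectl get pods` table output on stdin,
# prints the name of every pod that is not Running with all containers
# ready, and returns non-zero if it printed anything.
pods_not_ready() {
  awk 'NR > 1 && NF >= 3 {
         split($2, ready, "/")
         if ($3 != "Running" || ready[1] != ready[2]) {
           print $1
           bad = 1
         }
       }
       END { exit (bad ? 1 : 0) }'
}

# Demo on sample output in the format shown above.
pods_not_ready <<'EOF' && echo "all pods ready"
NAME READY STATUS RESTARTS AGE
honeybeepf-llm-XXXXX 1/1 Running 0 1m
honeybeepf-llm-prometheus-server-XXXXX 2/2 Running 0 3m
EOF

# Usage against a live cluster:
#   kubectl get pods -n <your-namespace> | pods_not_ready && echo "all pods ready"
```

Pods in other healthy terminal states, such as `Completed`, would also be reported by this sketch, so it is only suitable for namespaces that contain long-running workloads.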
helm uninstall honeybeepf-llm -n <your-namespace>
helm uninstall honeybeepf-llm-otel-collector -n <your-namespace>
helm uninstall honeybeepf-llm-prometheus -n <your-namespace>
kubectl delete namespace <your-namespace>

Once installed, the LLM probe and file access probe are already enabled via `demo-template.yaml`.
LLM Probe: OpenAI, Anthropic, and Gemini are supported as built-in providers by default. No additional configuration is needed for these providers. If you use private or self-hosted LLMs (e.g., Ollama, vLLM), add them to the `providers` field in `demo-template.yaml`. See `charts/honeybeepf-llm/values.yaml` for the full configuration example.
kubectl logs -n <your-namespace> -l app.kubernetes.io/name=honeybeepf-llm --tail=50

You should see logs indicating the LLM probe and file access probe are active.
# Port-forward Prometheus
kubectl port-forward -n <your-namespace> svc/honeybeepf-llm-prometheus-server 9090:80 &
# Open http://localhost:9090 in your browser

If metrics appear in the Prometheus UI, data collection is working correctly.
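The check can also be done from the command line through the same port-forward, using the Prometheus HTTP API's instant-query endpoint (`/api/v1/query`). This is a sketch only: the helper names are hypothetical, and because this section does not list the agent's metric names, the example queries `up`, which Prometheus records for every scrape target. The response check assumes the compact JSON that Prometheus normally returns.

```shell
# Hypothetical helpers for checking Prometheus through the port-forward.

# Runs an instant query against the Prometheus HTTP API and prints the raw JSON.
prom_query() {
  local base_url="$1" query="$2"
  curl -fsS --get --data-urlencode "query=${query}" "${base_url}/api/v1/query"
}

# Reads an API response on stdin; succeeds only if the query succeeded
# and the result list is not empty.
prom_has_data() {
  local body
  body="$(cat)"
  case "$body" in
    *'"status":"success"'*) ;;
    *) return 1 ;;
  esac
  case "$body" in
    *'"result":[]'*) return 1 ;;
  esac
  return 0
}

# Usage with the port-forward above still running:
#   prom_query http://localhost:9090 'up' | prom_has_data \
#     && echo "Prometheus is returning data"
```

Replacing `up` with one of the metrics listed in `docs/grafana/README.md` would confirm that the agent's own data is arriving, not just that scraping works.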
Pre-built Grafana dashboards live under docs/grafana/.
| Dashboard | File | Focus |
|---|---|---|
| LLM Cost Observability | `honeybeepf-cost.json` | Requests, tokens, latency, and cost attribution by team / pod / model |
| Security & Compliance | `honeybeepf-security.json` | File access monitoring and LLM ↔ file-access correlation |
1. Open Grafana → Dashboards → New → Import
2. Upload one of the JSON files above
3. Select your Prometheus data source when prompted
4. Click Import
If your Grafana is configured with `sidecar.dashboards.enabled=true`, create a labeled ConfigMap and the sidecar will pick up both dashboards automatically:
kubectl -n monitoring create configmap honeybeepf-dashboards \
--from-file=docs/grafana/honeybeepf-cost.json \
--from-file=docs/grafana/honeybeepf-security.json
kubectl -n monitoring label configmap honeybeepf-dashboards \
grafana_dashboard=1

See `docs/grafana/README.md` for the full list of required metrics, template variables, and additional installation options.
# Standard build (without Kubernetes support)
cargo build --release --package honeybeepf-llm
# With Kubernetes pod metadata support (namespace, pod name in metrics)
cargo build --release --features k8s --package honeybeepf-llm

Note: The `k8s` feature is not enabled by default. When deploying to Kubernetes, always build with `--features k8s` to include pod metadata resolution. The Docker build (`Dockerfile`) already includes this flag.
- Issues: Use GitHub Issues for bug reports or feature requests
- PRs: Submit contributions as pull requests
- Guide: Follow `CONTRIBUTING.md` for coding standards and review expectations
| Name | ID | Role | SNS | Responsibilities |
|---|---|---|---|---|
| | Jundorok | Team Leader | TBU | Roadmap & Feature Development |
| | pmj-chosim | Core Dev | TBU | CI/CD & Observability |
| | sammiee5311 | Core Dev | TBU | Feature Development |
| | vanillaturtlechips | Core Dev | TBU | CI/CD & Observability |
- Languages: eBPF, Kernel, Rust
- Infrastructure: Kubernetes, Helm, OpenTelemetry, Prometheus, Grafana
- Communication: Discord, GitHub Discussions
- Phase 1: CI/CD and Observability Setup
- Phase 2: Core Module Development
- Phase 3: Monitoring and Testing
- Phase 4: Release & Operator Integrations
We track roadmap execution via GitHub Projects and release multi-architecture container images using `publish.sh` once CI pipelines pass.
- GitHub Repository: github.com/honeybee-studio/honeybeepf-llm
- Helm Charts: `charts/honeybeepf-llm`
- Governance: `GOVERNANCE.md`
- Code of Conduct: See `CODE_OF_CONDUCT.md`. Report incidents privately via GitHub Issues.
- Decision Process: Maintainers document proposals via Issues/Discussions with a 72-hour community review window before landing major changes.
- Meetings: We host quarterly community syncs announced in GitHub Discussions. Notes are published alongside meeting issues.
- Membership: Active contributors who review and merge work over two consecutive releases are invited to join the maintainer group.
- Source Code: Apache License 2.0 (`LICENSE`). See also `NOTICE`.
- Documentation: Apache License 2.0 unless otherwise noted within the document.
- Third-Party Assets: Refer to each component's directory for licensing notices.
