docs: outputs: add ZeroBus output plugin documentation #2537
mats16 wants to merge 3 commits into fluent:master from
Conversation
📝 Walkthrough
Adds a new Fluent Bit ZeroBus output plugin documentation page and a SUMMARY.md entry linking to it; documents prerequisites, configuration, examples, and the record transformation order.
Sequence Diagram(s)
sequenceDiagram
participant FluentBit as Fluent Bit
participant Auth as OAuth2\n(Service Principal)
participant ZeroBus as ZeroBus\nEndpoint
participant Databricks as Databricks\n(Unity Catalog)
FluentBit->>Auth: Request access token (client_id, client_secret)
Auth-->>FluentBit: Return access token
FluentBit->>ZeroBus: POST records + Authorization: Bearer token
ZeroBus-->>Databricks: Deliver ingested records to Unity Catalog table
Databricks-->>ZeroBus: Ack/ingest result
ZeroBus-->>FluentBit: HTTP response (success/failure)
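The token exchange in the diagram above can be sketched with a couple of pure helpers. This is an illustrative reconstruction only: the request-body fields follow the standard OAuth 2.0 client-credentials grant, and the header shape follows the diagram; the actual Databricks ZeroBus endpoint paths and payload format are not specified here and the helper names are hypothetical.

```python
def build_token_request(client_id: str, client_secret: str) -> dict:
    # Standard OAuth 2.0 client-credentials grant body (RFC 6749, section 4.4).
    return {
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
    }


def build_ingest_headers(access_token: str) -> dict:
    # Attach the bearer token returned by the auth step to each ingest request,
    # matching the "Authorization: Bearer token" arrow in the diagram.
    return {
        "Authorization": f"Bearer {access_token}",
        "Content-Type": "application/json",
    }
```

In this sketch the plugin would first POST `build_token_request(...)` to the service principal's token endpoint, then send batched records to the ZeroBus endpoint with `build_ingest_headers(...)`.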
- Introduced new ZeroBus output plugin for sending logs to Databricks via the ZeroBus streaming ingestion interface.
- Updated SUMMARY.md to include ZeroBus in the list of output plugins.
- Provided detailed configuration parameters, usage examples, and record format transformations for the ZeroBus plugin.

Signed-off-by: mats <mats.kazuki@gmail.com>
e387cd1 to 5dac612
🧹 Nitpick comments (3)
pipeline/outputs/zerobus.md (3)
81-85: Consider rewording the ordered steps to reduce repeated sentence starts. The repeated "If …" pattern across consecutive steps is readable but slightly mechanical; a light rewrite would improve flow.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pipeline/outputs/zerobus.md` around lines 81 - 85, The bullet list is repetitive because each line starts with "If ..."; rephrase to vary sentence starts while preserving meaning by grouping related actions and using active phrasing: mention raw_log_key behavior (capture full original record as JSON and inject under configured key unless it exists), describe log_key behavior (include only specified keys), then mention time_key (inject RFC3339 timestamp with nanosecond precision unless key exists) and add_tag (inject Fluent Bit tag as _tag unless key exists); keep the examples (timestamp format) and the "unless a key with that name already exists" clause attached to each relevant item and reference the unique config names raw_log_key, log_key, time_key, and add_tag to locate the lines to edit.
23-23: Add a short secret-handling note for `client_secret`. Please add guidance to avoid storing `client_secret` in plaintext config files (for example, prefer environment variable substitution or secret-management integration).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pipeline/outputs/zerobus.md` at line 23, Add a short secret-handling note for the `client_secret` field advising not to store `client_secret` in plaintext configuration files; recommend using environment variable substitution (e.g., reading from ENV) or integrating with a secrets manager (Vault, AWS Secrets Manager, etc.), and mention limiting access and rotation as best practices so consumers of `client_secret` know secure handling expectations.
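One concrete way to follow this secret-handling advice is Fluent Bit's environment-variable substitution (`${VAR}` syntax) in the output config. The parameter names below are taken from this PR's plugin docs, but the endpoint and table values are placeholders, not real Databricks settings:

```yaml
pipeline:
  outputs:
    - name: zerobus
      match: '*'
      endpoint: https://example.cloud.databricks.com    # placeholder value
      table_name: main.default.logs                     # placeholder value
      client_id: ${ZEROBUS_CLIENT_ID}
      # Read from the environment at startup rather than stored in the file:
      client_secret: ${ZEROBUS_CLIENT_SECRET}
```

With this shape, the secret lives in the process environment (or is injected by a secrets manager) and never lands in source control.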
11-11: Use a technical docs link instead of the vendor homepage. Linking to the general Databricks homepage is less neutral/technical than linking to the relevant product documentation page for ZeroBus/ingestion setup.
Based on learnings: ensure Markdown stays technical and neutral, and avoid promotional links.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pipeline/outputs/zerobus.md` at line 11, The README sentence starting "The _ZeroBus_ output plugin lets you ingest log records into a Databricks table..." links to the Databricks homepage; replace that href with the official Databricks technical documentation for ZeroBus/streaming ingestion (the product docs page for ZeroBus or ingestion setup) so the link is neutral and technical. Edit pipeline/outputs/zerobus.md and update the Markdown link target in the sentence mentioning "_ZeroBus_" to point to the ZeroBus/streaming ingestion docs URL, keeping the visible text unchanged.
📒 Files selected for processing (2)
SUMMARY.md, pipeline/outputs/zerobus.md
There was a problem hiding this comment.
🧹 Nitpick comments (1)
pipeline/outputs/zerobus.md (1)
79-85: Consider varying sentence structure for better readability. The transformation steps are clear and well-ordered. However, all five steps begin with "If", which creates repetitive sentence structure. Consider rephrasing for variety while maintaining clarity.
📝 Example rewrite with varied sentence structure
Each log record is converted to a JSON object before ingestion. The plugin applies the following transformations in order: -1. If `raw_log_key` is set, the full original record is captured as a JSON string before any filtering. -2. If `log_key` is set, only the specified keys are included in the output record. -3. If `raw_log_key` is set, the captured JSON string is injected under the configured key (unless a key with that name already exists). -4. If `time_key` is set, a timestamp in RFC 3339 format with nanosecond precision (for example, `2024-01-15T10:30:00.123456789Z`) is injected (unless a key with that name already exists). -5. If `add_tag` is enabled, the Fluent Bit tag is injected as `_tag` (unless a key with that name already exists). +1. When `raw_log_key` is set, the full original record is captured as a JSON string before any filtering. +2. If `log_key` is set, only the specified keys are included in the output record. +3. The captured JSON string (if enabled) is injected under the configured `raw_log_key` (unless a key with that name already exists). +4. A timestamp in RFC 3339 format with nanosecond precision (for example, `2024-01-15T10:30:00.123456789Z`) is injected under `time_key` (unless disabled or a key with that name already exists). +5. When `add_tag` is enabled, the Fluent Bit tag is injected as `_tag` (unless a key with that name already exists).🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pipeline/outputs/zerobus.md` around lines 79 - 85, The steps all start with "If" creating repetitive phrasing; rewrite the five bullet sentences to vary sentence openings while keeping their order and meaning—e.g., start some with conditionals ("When `raw_log_key` is set..."), others with actions ("Capture the full original record as a JSON string..."), and some with clauses ("When `log_key` is set, include only the specified keys..."); preserve references to `raw_log_key`, `log_key`, `time_key`, `add_tag`, and the injected `_tag`, and retain the semantics about not overwriting existing keys and RFC 3339 nanosecond precision for `time_key`.
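The five transformation steps under review can be sketched as a single function. This is an illustrative reconstruction based only on the behavior described in the documentation excerpt, not the plugin's actual C implementation; the function name `transform_record` and its signature are hypothetical. Note that Python's `datetime` yields microsecond rather than nanosecond precision, so the timestamp here only approximates the documented RFC 3339 format.

```python
import json
from datetime import datetime, timezone


def transform_record(record, tag, raw_log_key=None, log_key=None,
                     time_key=None, add_tag=False):
    """Apply the documented transformations, in order (illustrative sketch)."""
    # 1. Capture the full original record as a JSON string before any filtering.
    raw_json = json.dumps(record) if raw_log_key else None

    # 2. If log_key is set, include only the specified keys.
    if log_key:
        keys = log_key if isinstance(log_key, (list, tuple)) else [log_key]
        record = {k: record[k] for k in keys if k in record}
    else:
        record = dict(record)

    # 3. Inject the captured JSON string, unless the key already exists.
    if raw_log_key and raw_log_key not in record:
        record[raw_log_key] = raw_json

    # 4. Inject an RFC 3339 timestamp, unless the key already exists.
    #    (Microseconds here; the docs describe nanosecond precision.)
    if time_key and time_key not in record:
        record[time_key] = datetime.now(timezone.utc).isoformat()

    # 5. Inject the Fluent Bit tag as _tag, unless the key already exists.
    if add_tag and "_tag" not in record:
        record["_tag"] = tag

    return record
```

The ordering matters: capturing the raw record (step 1) before filtering (step 2) is what lets `raw_log_key` preserve fields that `log_key` drops.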
📒 Files selected for processing (2)
SUMMARY.md, pipeline/outputs/zerobus.md
✅ Files skipped from review due to trivial changes (1)
- SUMMARY.md
@mats16 can you link the code PR this output plugin doc is dependent on? Also the linting / vale errors need to be cleaned up. Note that you can fix the spelling issue with ZeroBus in the docs by using backticks around it. I'll review once these are addressed.
Actionable comments posted: 1
🧹 Nitpick comments (2)
pipeline/outputs/zerobus.md (2)
11-11: Prefer a technical Databricks docs link over the marketing homepage. This sentence links to https://www.databricks.com/; for docs tone and verifiability, link to the specific Databricks technical documentation page for Zerobus/streaming ingestion instead. Based on learnings: In the fluent-bit-docs repository, ensure all Markdown documentation remains technical and neutral. Do not include marketing-style content or promotional links, even when citing statistics. Prefer objective explanations, verifiable sources, and avoid promotional language across all .md files.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pipeline/outputs/zerobus.md` at line 11, Update the external link in the "_Zerobus_ output plugin" description to point to a technical Databricks documentation page for Zerobus/streaming ingestion instead of the marketing homepage (replace the existing https://www.databricks.com/ URL); locate the sentence beginning "The _Zerobus_ output plugin lets you ingest log records..." in zerobus.md and swap the link to a specific Databricks docs URL that documents Zerobus or streaming ingestion so the doc remains technical and verifiable.
31-53: Add a short secret-handling note near the `client_secret` examples. The examples are correct, but a one-line warning to avoid committing real `client_secret` values (and to use env vars/secret managers) would improve security posture for copy/paste users.
Verify each finding against the current code and only fix it if needed. In `@pipeline/outputs/zerobus.md` around lines 31 - 53, Add a one-line secret-handling warning next to the client_secret example in the fluent-bit.yaml zerobus output block: mention the client_secret field by name and instruct users not to commit real secrets to source control and to use environment variables or a secret manager (e.g., reference client_secret and fluent-bit.yaml/zerobus output) so copy/paste users see the security guidance.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@pipeline/outputs/zerobus.md`:
- Around line 81-85: The repeated sentence starts ("If …") in the transformation
steps should be rewritten to remove the lint trigger: update the five bullet
points that reference raw_log_key, log_key, time_key, and add_tag (and mention
of injected key `_tag`) so they use varied lead-ins like "When set, …", "Setting
`raw_log_key` …", or "Enabling `add_tag` …", and combine or rephrase where
logical (e.g., describe raw_log_key behavior once and then describe the
injection rule) while preserving the same semantics about capturing the original
record as JSON, including only specified keys for `log_key`, injecting captured
JSON under `raw_log_key` unless the key exists, adding an RFC 3339 nanosecond
timestamp for `time_key` unless the key exists, and injecting the Fluent Bit tag
as `_tag` when `add_tag` is enabled.
📒 Files selected for processing (1)
pipeline/outputs/zerobus.md
55d4b6d to fb04ddc
- Updated instances of "ZeroBus" to "Zerobus" for consistency throughout the documentation.
- Ensured accurate representation of the Zerobus output plugin and its configuration parameters.

Signed-off-by: mats <mats.kazuki@gmail.com>
- Revised the description to use "through" instead of "via" for improved readability.
- Clarified the OAuth2 terminology to "OAuth 2.0" for consistency.
- Ensured consistent language throughout the documentation regarding Zerobus and its configuration parameters.

Signed-off-by: mats <mats.kazuki@gmail.com>
fb04ddc to 8fd9703
@eschabell Thanks! The actual plugin implementation PR is here: fluent/fluent-bit#11678
Summary
- Adds documentation for the `out_zerobus` output plugin that sends logs to Databricks tables via the ZeroBus streaming ingestion interface
- Documents the configuration parameters (`endpoint`, `workspace_url`, `table_name`, `client_id`, `client_secret`, `add_tag`, `time_key`, `log_key`, `raw_log_key`)

Test plan
This pull request was AI-assisted by Claude.
Summary by CodeRabbit