Description
Currently, Woodpecker (in Service mode) relies on a simple "Ping-Pong" network check for liveness probes. However, this only confirms that the process is running and the network is reachable; it does not guarantee that the entire data path (log read/write) is functional.
We need a more robust Health Check API that validates whether logs can actually be written to and read from the underlying storage/buffer, covering all deployment modes.
Proposed Logic
The new health check should be based on real-time log activity rather than injecting dummy "heartbeat" topics.
- Log-Level Health
  - Active Logs: if a log has incoming write attempts:
    - A success within the last 10 minutes = Healthy.
    - Stalled/blocked for > 10 minutes = Unhealthy.
  - Idle Logs: if no data has been written to a specific log for a while, it should persist its last known health state (stay Healthy if it was Healthy).
- Global Service Health
  - Partial Success: if at least one log is successfully reading/writing, the global state is Healthy.
  - Global Stall: if multiple logs have pending writes but all have been stalled for > 10 minutes, the global state is Unhealthy.
  - Idle System (Cold Start/No Traffic): if there is no log activity across the entire system:
    - Fall back to a storage backend check (e.g., HeadBucket or an equivalent metadata check for Object Storage).
    - If the storage backend is reachable/writable, the global state is Healthy.