## Feature: Make `PROVIDER_TIMEOUT_MS` configurable via environment variable

### Problem

`PROVIDER_TIMEOUT_MS` is hardcoded to `180_000` (180s) in `packages/backend/dist/routing/proxy/provider-client.js`. This value is not configurable via environment variable, database setting, or config file.

This creates two problems for self-hosted users running Manifest with providers that have unreliable response times (e.g. Ollama Cloud):

**1. Fallback chains become ineffective**

With 180s per attempt, a tier with 5 fallback models (6 attempts including the primary) needs up to 180s × 6 = 1080s = 18 minutes to exhaust the chain. In practice, the upstream client (e.g. OpenClaw gateway) times out long before Manifest reaches a working fallback.
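
The arithmetic above can be checked with a small helper. This is only an illustration of the worst case; `worstCaseChainMs` is a hypothetical name and not part of Manifest:

```javascript
// Worst-case time to exhaust a fallback chain when every attempt
// (the primary plus each fallback) runs into the per-attempt timeout.
function worstCaseChainMs(timeoutMs, fallbackCount) {
  const attempts = 1 + fallbackCount; // primary + fallbacks
  return attempts * timeoutMs;
}

const totalMs = worstCaseChainMs(180_000, 5);
console.log(`${totalMs / 60_000} minutes`); // prints "18 minutes"
```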

**2. Timeout race with upstream clients**

OpenClaw's default `timeoutSeconds` is 180, identical to Manifest's internal timeout. When both fire at roughly the same time, the client usually closes the connection first. Manifest then sees `signal.aborted = true` from the client disconnect, re-throws instead of falling back, and the fallback chain never runs. The logs show "Proxy error: The operation was aborted due to timeout" but no "Provider transport failure" entries, confirming the fallback path is bypassed.

### Root cause analysis

We traced the flow through the proxy code:

- `provider-client.js`: `fetch()` uses `AbortSignal.timeout(PROVIDER_TIMEOUT_MS)` — this is a total timeout (from request start), not an idle timeout
- `proxy-fallback.service.js`: `tryForwardToProvider()` catches the error
- `proxy-transport.js`: `isTransportError()` checks for `/timeout/i` pattern → creates synthetic 504
- `proxy.service.js`: `shouldTriggerFallback(504)` → true → fallback chain runs

This flow works correctly when Manifest's timeout fires before the client disconnects. But with matching timeouts (180s/180s), it's a race condition that Manifest usually loses.
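
To make the race concrete, here is a minimal sketch of a forwarder that records which abort source fired first. It is not Manifest's implementation: `forwardWithTimeout`, `doFetch`, and the outcome strings are hypothetical, and `doFetch(signal)` stands in for the real provider `fetch()` call.

```javascript
// Hypothetical helper, not Manifest's code. `doFetch(signal)` stands in for
// the provider fetch() call; `clientSignal` aborts when the caller disconnects.
async function forwardWithTimeout(doFetch, clientSignal, timeoutMs) {
  const controller = new AbortController();
  let cause = null; // records whichever abort source fired first

  const abort = (why) => {
    if (cause === null) {
      cause = why;
      controller.abort();
    }
  };
  const onClientAbort = () => abort("client-disconnect");

  if (clientSignal.aborted) onClientAbort();
  clientSignal.addEventListener("abort", onClientAbort, { once: true });
  const timer = setTimeout(() => abort("provider-timeout"), timeoutMs);

  try {
    const response = await doFetch(controller.signal);
    return { outcome: "ok", response };
  } catch (err) {
    if (cause === null) throw err; // unrelated failure, not an abort
    return { outcome: cause };
  } finally {
    clearTimeout(timer);
    clientSignal.removeEventListener("abort", onClientAbort);
  }
}
```

In this sketch only `"provider-timeout"` would be a candidate for the fallback path. With a 180s provider timeout and a 180s client timeout, either outcome can win, which is why a provider timeout well below the client's is needed.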

Additional finding: mid-stream hangs (provider returns 200 OK, starts streaming, then stops sending chunks) are architecturally not fallback-capable. Once `headersSent = true`, the controller calls `res.end()` on timeout — no fallback path exists.
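
Detecting this kind of hang needs an idle timeout (time since the last chunk) rather than a total timeout. The sketch below shows one way to wrap an async-iterable stream; `withIdleTimeout` is a hypothetical helper, not something Manifest provides, and it does not make fallback possible once headers have been sent. It only lets the proxy give up on a stalled stream sooner.

```javascript
// Hypothetical helper: re-yields chunks from `source` and throws if no
// chunk arrives within `idleMs` of the previous one.
async function* withIdleTimeout(source, idleMs) {
  const iterator = source[Symbol.asyncIterator]();
  try {
    while (true) {
      let timer;
      const idle = new Promise((_, reject) => {
        timer = setTimeout(
          () => reject(new Error(`idle timeout after ${idleMs}ms`)),
          idleMs,
        );
      });
      let result;
      try {
        // Whichever settles first wins: the next chunk or the idle timer.
        result = await Promise.race([iterator.next(), idle]);
      } finally {
        clearTimeout(timer);
      }
      if (result.done) return;
      yield result.value;
    }
  } finally {
    // Best-effort cleanup; not awaited, because a hung source may never settle.
    Promise.resolve(iterator.return?.()).catch(() => {});
  }
}
```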

### Proposed solution

Add an environment variable `PROVIDER_TIMEOUT_MS` (or `PROVIDER_REQUEST_TIMEOUT`) that overrides the hardcoded value:

```javascript
const PROVIDER_TIMEOUT_MS = parseInt(process.env.PROVIDER_TIMEOUT_MS, 10) || 180_000;
```

This allows self-hosted users to set a lower timeout so the fallback chain can actually run within the upstream client's timeout window:

```yaml
# docker-compose.yml
environment:
  PROVIDER_TIMEOUT_MS: 45000
```

With 45s per attempt: primary (45s) + fallback 1 (45s) + fallback 2 (45s) = 135s total. That fits within OpenClaw's default 180s window and well within a 300s upstream timeout, so the agent gets a response instead of an error.
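
One caveat with the one-line `parseInt(...) || 180_000` form: it accepts negative numbers, and a value such as `45s` is read as 45 (milliseconds). A slightly stricter variant could look like the sketch below. The helper names `readProviderTimeoutMs` and `attemptsWithinBudget` are hypothetical and only illustrate the idea.

```javascript
const DEFAULT_PROVIDER_TIMEOUT_MS = 180_000;

// Hypothetical helper: returns the configured timeout, or the default when
// the variable is unset, empty, non-numeric, or not a positive integer.
function readProviderTimeoutMs(env = process.env) {
  const parsed = Number(env.PROVIDER_TIMEOUT_MS);
  return Number.isInteger(parsed) && parsed > 0
    ? parsed
    : DEFAULT_PROVIDER_TIMEOUT_MS;
}

// Hypothetical helper: how many full attempts fit into the upstream
// client's timeout window.
function attemptsWithinBudget(timeoutMs, upstreamBudgetMs) {
  return Math.floor(upstreamBudgetMs / timeoutMs);
}

const timeoutMs = readProviderTimeoutMs({ PROVIDER_TIMEOUT_MS: "45000" });
console.log(attemptsWithinBudget(timeoutMs, 300_000)); // prints 6
```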

### Environment

- Manifest: Docker (self-hosted, local mode)
- Upstream: OpenClaw 2026.4.14
- Affected providers: Ollama Cloud (glm-5.1:cloud, qwen3.5:cloud) — frequent silent hangs with no HTTP error, just open connections producing no data
- Fallback targets: Anthropic, OpenRouter, OpenAI — all functional but never reached due to timeout race
