fix(nginx): Fix nginx startup loop by replacing curl health check with wget (closes #9794)#9801
fix(nginx): Fix nginx startup loop by replacing curl health check with wget (closes #9794)#9801janhavitupe wants to merge 4 commits intoonyx-dot-app:mainfrom
Conversation
Greptile SummaryThis PR fixes a nginx startup loop on Alpine-based containers by replacing Key changes:
Confidence Score: 5/5
Important Files Changed
Sequence DiagramsequenceDiagram
participant S as run-nginx.sh
participant W as wget (BusyBox)
participant A as API Server :8080/health
participant N as nginx
loop Until status 200
S->>W: wget -S -q -O /dev/null http://api:8080/health 2>&1
W->>A: GET /health
alt API ready (200)
A-->>W: HTTP/1.1 200 OK
W-->>S: stderr: " HTTP/1.1 200 OK"
S->>S: awk extracts "200", exits on first match
S->>N: Start nginx (daemon off)
else API not ready (non-2xx)
A-->>W: HTTP/1.1 503 ...
W-->>S: stderr: " HTTP/1.1 503 ..." (first line, exit)
S->>S: status_code = "503", sleep 5s
else Connection failure
W-->>S: (no HTTP/ line in stderr)
S->>S: status_code = "" → "000", sleep 5s
end
end
Reviews (4): Last reviewed commit: "Merge branch 'main' into fix/nginx-healt..." | Re-trigger Greptile |
deployment/data/nginx/run-nginx.sh
Outdated
|
|
||
| # Use wget to send a request and capture the HTTP status code | ||
| # (curl has DNS resolution issues with Docker's internal resolver on Alpine) | ||
| status_code=$(wget -S -q -O /dev/null "http://${ONYX_BACKEND_API_HOST}:8080/health" 2>&1 | awk '/HTTP\//{print $2}' | tail -1) |
There was a problem hiding this comment.
awk pattern can extract wrong field on non-2xx responses
When BusyBox wget receives a non-2xx HTTP response (e.g., 503 while the API server is still booting), it prints two lines to stderr that both match /HTTP\//:
- The response header:
HTTP/1.1 503 Service Unavailable→$2 = "503"✓ - The error message:
wget: server returned error: HTTP/1.1 503 Service Unavailable→$2 = "server"✗
Because tail -1 picks the last match, status_code ends up as "server". The subsequent [ "server" -eq 200 ] comparison produces a shell arithmetic error ([: server: integer expression expected) and clutters the logs. The retry still occurs (since the if condition is non-zero), so nginx will still wait correctly — but the error output is noisy.
Using exit after the first HTTP/ match avoids the error message line entirely:
| status_code=$(wget -S -q -O /dev/null "http://${ONYX_BACKEND_API_HOST}:8080/health" 2>&1 | awk '/HTTP\//{print $2}' | tail -1) | |
| status_code=$(wget -S -q -O /dev/null "http://${ONYX_BACKEND_API_HOST}:8080/health" 2>&1 | awk '/HTTP\//{print $2; exit}') |
Prompt To Fix With AI
This is a comment left during a code review.
Path: deployment/data/nginx/run-nginx.sh
Line: 44
Comment:
**awk pattern can extract wrong field on non-2xx responses**
When BusyBox `wget` receives a non-2xx HTTP response (e.g., 503 while the API server is still booting), it prints **two** lines to stderr that both match `/HTTP\//`:
1. The response header: ` HTTP/1.1 503 Service Unavailable` → `$2 = "503"` ✓
2. The error message: `wget: server returned error: HTTP/1.1 503 Service Unavailable` → `$2 = "server"` ✗
Because `tail -1` picks the **last** match, `status_code` ends up as `"server"`. The subsequent `[ "server" -eq 200 ]` comparison produces a shell arithmetic error (`[: server: integer expression expected`) and clutters the logs. The retry still occurs (since the `if` condition is non-zero), so **nginx will still wait correctly** — but the error output is noisy.
Using `exit` after the first `HTTP/` match avoids the error message line entirely:
```suggestion
status_code=$(wget -S -q -O /dev/null "http://${ONYX_BACKEND_API_HOST}:8080/health" 2>&1 | awk '/HTTP\//{print $2; exit}')
```
How can I resolve this? If you propose a fix, please make it concise.
Summary
Fixes an issue where nginx gets stuck in an infinite startup loop due to failed health checks when using
curlinside the Alpine-based container.Related Issue
Closes #9794
Problem
The nginx startup script performs a health check against the API server using
curl. In some environments, this consistently returns HTTP000, causing nginx to retry indefinitely:Summary by cubic
Replaced API health check
curlwithwgetand robust status parsing to stop thenginxstartup loop on Alpine (Docker DNS returning HTTP 000). Closes #9794.Written for commit 9b360a8. Summary will update on new commits.