Commit 2563cd9

ci: pull redpanda from its own registry and give it a longer health window
Two things tripped up the first CI run on developer/arr2036:

- The ci.yml and ci-sanitizers.yml workflows pulled redpanda through the FreeRADIUS internal docker mirror (docker.internal.networkradius.com), but redpandadata/redpanda is not mirrored there, so every job with a redpanda service container died at 'Initialize containers'. Pull directly from docker.redpanda.com, matching what the multi-server docker-compose already does.

- The multi-server redpanda service had a 60s start window and 30s of retries. On a busy self-hosted runner that's marginal: we saw kafka-produce-short_ci fail the compose-up health gate. Bump the start_period to 120s and extend retries so we allow up to ~4 minutes for the broker to come up.
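The "60s + 30s" and "~4 minutes" figures can be reproduced with a little arithmetic on the healthcheck values in kafka.yml.j2. The sketch below is illustrative only: the `Healthcheck` class and both helper functions are hypothetical names, and the two formulas are approximations under stated assumptions rather than Docker's exact scheduling. The first assumes a failing probe returns immediately (this matches the commit message's numbers). The second assumes every failing probe runs until its timeout, which may be closer to reality for a probe that blocks waiting for health.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Healthcheck:
    interval: int      # seconds between probes
    timeout: int       # seconds before a single probe is abandoned
    retries: int       # consecutive failures needed to mark unhealthy
    start_period: int  # grace period; failures here are not counted


def budget_fast_fail(hc: Healthcheck) -> int:
    """Approximate window if each failing probe returns immediately."""
    return hc.start_period + hc.retries * hc.interval


def budget_slow_fail(hc: Healthcheck) -> int:
    """Approximate window if each failing probe runs until its timeout."""
    return hc.start_period + hc.retries * (hc.interval + hc.timeout)


# Values taken from the old and new healthcheck blocks in this commit.
OLD = Healthcheck(interval=2, timeout=5, retries=15, start_period=60)
NEW = Healthcheck(interval=5, timeout=10, retries=24, start_period=120)

for name, hc in (("old", OLD), ("new", NEW)):
    print(name, budget_fast_fail(hc), budget_slow_fail(hc))
# old 90 165
# new 240 480
```

Under the fast-fail assumption the window grows from 90s to 240s, which is the "~4 minutes" quoted above.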
1 parent d5ffa49 commit 2563cd9

3 files changed

Lines changed: 21 additions & 6 deletions

.github/workflows/ci-sanitizers.yml

Lines changed: 5 additions & 1 deletion
@@ -128,8 +128,12 @@ jobs:
           --health-timeout 5s
           --health-retries 5
 
+      #
+      # Pulled from docker.redpanda.com rather than the FreeRADIUS internal
+      # docker cache because redpandadata/redpanda is not mirrored there.
+      #
       redpanda:
-        image: ${{ needs.pre-ci.outputs.docker_prefix }}redpandadata/redpanda:latest
+        image: docker.redpanda.com/redpandadata/redpanda:latest
         ports:
           - 9092:9092
         options: >-

.github/workflows/ci.yml

Lines changed: 5 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -139,8 +139,12 @@ jobs:
139139
# on port 9092. No setup script is needed: librdkafka auto-creates the
140140
# test topics on first produce.
141141
#
142+
# Pulled from docker.redpanda.com rather than the FreeRADIUS internal
143+
# docker cache because redpandadata/redpanda is not mirrored there.
144+
# Matches the image used by the multi-server test harness.
145+
#
142146
redpanda:
143-
image: ${{ needs.pre-ci.outputs.docker_prefix }}redpandadata/redpanda:latest
147+
image: docker.redpanda.com/redpandadata/redpanda:latest
144148
ports:
145149
- 9092:9092
146150
options: >-

src/tests/multi-server/environments/kafka.yml.j2

Lines changed: 11 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -50,12 +50,19 @@ services:
5050
- --check=false
5151
- --unsafe-bypass-fsync=true
5252
restart: unless-stopped
53+
#
54+
# Redpanda needs longer than you'd think on a busy CI runner: the
55+
# first-launch data directory scaffolding can take 30-60s before
56+
# rpk admin is reachable, and the cluster-health probe only returns
57+
# OK once the controller topic has settled. Give it up to ~4 min
58+
# before declaring the container unhealthy.
59+
#
5360
healthcheck:
5461
test: ["CMD-SHELL", "rpk cluster health --exit-when-healthy"]
55-
interval: 2s
56-
timeout: 5s
57-
retries: 15
58-
start_period: 60s
62+
interval: 5s
63+
timeout: 10s
64+
retries: 24
65+
start_period: 120s
5966

6067
kafka-producer1:
6168
image: freeradius-build:latest
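
To see why the old settings were marginal, the health gate can be walked through as a timeline. This is a rough sketch under a simplified model, not Docker's actual scheduler: `simulate` is a hypothetical helper, probes are assumed to fail instantly until the broker is ready, the first probe is assumed to run one interval after start, and failures before `start_period` are assumed not to count. The 100s ready time is an invented example, not a measurement from the failing run.

```python
def simulate(ready_at, interval, retries, start_period):
    """Return (state, seconds_elapsed) under a simplified healthcheck model.

    Assumptions (not taken from the commit): the first probe runs one
    interval after start, probes fail instantly until `ready_at`, and
    failures before `start_period` are not counted towards `retries`.
    """
    t = interval
    failures = 0
    while True:
        if t >= ready_at:
            # The first probe at or after readiness succeeds.
            return ("healthy", t)
        if t >= start_period:
            # Past the grace period, so this failure counts.
            failures += 1
            if failures >= retries:
                return ("unhealthy", t)
        t += interval


# Hypothetical broker that needs 100s to come up on a busy runner.
print(simulate(100, interval=2, retries=15, start_period=60))   # ('unhealthy', 88)
print(simulate(100, interval=5, retries=24, start_period=120))  # ('healthy', 100)
```

In this model the old settings give up at 88s, before the hypothetical broker is ready, while the new settings are still inside the start period when it comes up.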
