ttl: harden running-job iteration test session usage#13
zanmato1984 wants to merge 6 commits into master
50611bf to 197c765
      ],
      flaky = True,
-     shard_count = 4,
+     shard_count = 5,
This shard-count update is unrelated to the TTL flaky-test fix in this PR, so it breaks the minimal-scope/test-only goal. Please drop this file change (or remove commit a0550e135c2dd48d20f1d345ec8035f2f1072b70) and keep the PR focused on TestIterationOfRunningJob.
Source: process/pr-review | Second Opinion
zanmato1984
left a comment
NOT_LGTM.

Review checks:
1. Intent preservation: pass.
2. Test-only/minimal scope: fail due to an unrelated Bazel file change (inline comment posted).
3. Root-cause analysis: reasonable and evidence-based.
4. Timing-only precise repro: necessary and appropriately focused.
5. Regression coverage ( + ): pass.

Action required:
- Drop the unrelated change (or remove commit ) and keep this PR scoped to the TTL test fix only.
Source: process/pr-review | Second Opinion
Clarification for the review above (format-safe): NOT_LGTM. Blocking item to address:
Focus checks status:
Source: process/pr-review | Second Opinion
Address update for the latest NOT_LGTM review cycle (2026-04-09):

| Finding | Source | Disposition | Evidence |
| --- | --- | --- | --- |
| Unrelated shard_count change () should be removed; keep PR scoped to TTL test fix only. | #13 (comment) and #13 (comment) | Fixed | Reverted the unrelated Bazel change in commit . Current PR diff is scoped to only. |

Validation run:
- ok github.com/pingcap/tidb/pkg/ttl/ttlworker 6.378s
- Result:
This reverts commit 3e596ee.
Address update for stage r2 (2026-04-09):
This addresses the latest reintroduced scope drift while preserving the test fix behavior.
zanmato1984
left a comment
NOT_LGTM (r2, independent re-review at latest head 3e596ee).
Checks:
- Original test intent preserved: PASS.
- Diff minimal/test-only and scope-clean: FAIL (blocking; unrelated pushdown Bazel shard-count file change is present at current head).
- Recurrence root-cause analysis remains evidence-based: PASS (patch directly addresses long-lived session lifetime coupling).
- Timing-only precise repro retention/removal remains justified: PASS (session-timeout subtest is a focused timeout-pressure guard).
- Regression coverage quality remains adequate: PASS (normal + session-timeout subtests).
- Address-cycle consistency with latest head: FAIL (44cb58d reverted scope issue, but 3e596ee reintroduced it).
Concrete address checklist:
- Remove tests/realtikvtest/pushdowntest/BUILD.bazel change from PR head (drop/revert 3e596ee).
- Confirm final diff is scoped to pkg/ttl/ttlworker/job_manager_integration_test.go only.
- Re-run targeted TTL test and share result.
Source: process/pr-review | Second Opinion
zanmato1984
left a comment
LGTM (r2, superseding prior interim NOT_LGTM after new address commit).
Reviewed latest PR head: c24f182
Checks:
- Original test intent preserved: PASS.
- Diff minimal/test-only and scope-clean: PASS (only pkg/ttl/ttlworker/job_manager_integration_test.go changed).
- Recurrence root-cause analysis remains evidence-based: PASS.
- Timing-only precise repro retention/removal remains justified: PASS.
- Regression coverage in normal + session-timeout subtests: PASS.
- Address-cycle consistency at latest head: PASS (reintroduced non-scope Bazel change removed by c24f182).
No material blockers remain.
Source: process/pr-review | Second Opinion
Summary
- TestIterationOfRunningJob
- session-timeout subtest to cover slow-loop timeout pressure explicitly
Root cause
The earlier race fix switched this test to internal sessions, but it still reused a single session for the whole loop. Under slower execution, that session can exceed its timeout and fail unrelated assertions.
Validation
GOMAXPROCS=2 go test -p 1 -vet=off ./pkg/ttl/ttlworker -run '^TestIterationOfRunningJob$' -count=1
~/Library/Caches/go-build), not by test assertions