pkg/lightning: fix TestRegionJobBaseWorker waitgroup race #66703
ti-chi-bot[bot] merged 2 commits into pingcap:master
Skipping CI for Draft Pull Request. |
Walkthrough
Moved the WaitGroup increment to occur before enqueuing the job and added an optional after-send callback to the test helper to control timing, preventing a race that could cause a negative WaitGroup counter in the job worker tests.
Hi @D3Hunter. Thanks for your PR. PRs from untrusted users cannot be marked as trusted with I understand the commands that are listed here. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@pkg/lightning/backend/local/job_worker_test.go`:
- Around line 164-171: The test uses time.Sleep in the t.Run case to wait for
worker startup; replace that with a channel handshake to make startup
deterministic: create a chan struct{} in the test, pass a preRunFn to
prepareAndExecute that closes/signals the channel when the worker is ready, and
before enqueuing/waiting on the job block until that channel is signaled; update
references to prepareAndExecute and the preRunFn callback (and the test case
name "track waitgroup before enqueue to avoid negative counter") so the
worker-first scheduling is explicit and no sleep is used.
📒 Files selected for processing (1)
pkg/lightning/backend/local/job_worker_test.go
Review Complete Findings: 0 issues ℹ️ Learn more details on Pantheon AI. |
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files
@@ Coverage Diff @@
## master #66703 +/- ##
================================================
+ Coverage 77.6902% 78.2657% +0.5755%
================================================
Files 2008 1938 -70
Lines 549382 536987 -12395
================================================
- Hits 426816 420277 -6539
+ Misses 120901 116275 -4626
+ Partials 1665 435 -1230
/retest |
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: GMHDBJD, joechenrh
/retest |
/retest |
What problem does this PR solve?
Issue Number: close #66702
Problem Summary:
`TestRegionJobBaseWorker/if_the_region_has_no_leader,_rescan_the_region` is flaky in nextgen CI. In the test helper `prepareAndExecute`, `jobInCh <- job` happened before `jobWg.Add(1)`, so the worker-side `Done` could race ahead and trigger `sync: negative WaitGroup counter`.
What changed and how does it work?
Move `jobWg.Add(1)` before `jobInCh <- job`.
Check List
Tests
Side effects
Documentation
Release note