Fix Google Batch exit code when spot claim is successfully retried #6926

Merged
bentsherman merged 2 commits into master from
6779-nextflow-prioritizes-50001-over-exitcode-even-when-the-batch-job-is-retried-and-succeeds
Mar 17, 2026

Conversation

@jorgee
Contributor

@jorgee jorgee commented Mar 16, 2026

close #6779
This pull request addresses the handling of exit codes for Google Batch spot instance retries and improves test coverage for these scenarios. The main change ensures that tasks retried on spot instances correctly update their exit status by reading the .exitcode file, and the associated tests are updated to cover this logic.

Improvements to spot instance exit code handling:

  • GoogleBatchTaskHandler.groovy: Added logic to check if a task has an exit status >= 50000 (indicative of a spot instance retry) and, if so, update the exit status by reading the .exitcode file.
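The rule described above can be sketched roughly as follows. This is an illustrative Python sketch, not the Groovy code from the PR: the names `resolve_exit_status` and `read_exit_file` are hypothetical, and keeping the reported status when `.exitcode` is missing or unreadable is an assumption rather than something the PR description states.

```python
from pathlib import Path
from typing import Optional

# Threshold taken from the PR description: statuses >= 50000 indicate a spot retry.
SPOT_RETRY_THRESHOLD = 50000


def read_exit_file(work_dir: Path) -> Optional[int]:
    """Return the integer stored in <work_dir>/.exitcode, or None if it is missing or unparsable."""
    try:
        return int((work_dir / ".exitcode").read_text().strip())
    except (OSError, ValueError):
        return None


def resolve_exit_status(reported_status: int, work_dir: Path) -> int:
    """Prefer the value in .exitcode when the reported status is in the spot-retry range."""
    if reported_status < SPOT_RETRY_THRESHOLD:
        # Ordinary exit status: nothing to re-read.
        return reported_status
    from_file = read_exit_file(work_dir)
    # Assumption (not from the PR): fall back to the reported status if the file is unavailable.
    return reported_status if from_file is None else from_file
```

With a work directory whose `.exitcode` contains `0`, `resolve_exit_status(50001, work_dir)` returns `0`, which matches the case in the linked issue where the retried job eventually succeeds.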

Enhancements to test coverage:

  • GoogleBatchTaskHandlerTest.groovy: Updated mock job status handling to use the JOB_STATE variable, allowing more flexible test scenarios.
  • GoogleBatchTaskHandlerTest.groovy: Modified test cases to include scenarios where the exit code is re-read for retried spot instances, and expanded the test matrix to cover both failed and succeeded job states, as well as the number of times readExitFile() is called.
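A table-driven check in the spirit of that test matrix might look like the sketch below. It is written in Python with `unittest.mock`, not in the project's Groovy test framework. The rows, the function `final_exit_status`, and the specific status values are hypothetical. The job state appears here only as a row label, since the PR description does not say how the real handler uses it.

```python
from unittest.mock import Mock

# Threshold taken from the PR description: statuses >= 50000 indicate a spot retry.
SPOT_RETRY_THRESHOLD = 50000


def final_exit_status(reported_status, read_exit_file):
    """Call read_exit_file() only when the reported status is in the spot-retry range."""
    if reported_status >= SPOT_RETRY_THRESHOLD:
        return read_exit_file()
    return reported_status


# Hypothetical rows: (job state label, reported status, value in .exitcode,
#                     expected final status, expected number of reads)
CASES = [
    ("SUCCEEDED", 50001, 0, 0, 1),
    ("FAILED", 50001, 137, 137, 1),
    ("SUCCEEDED", 0, 0, 0, 0),
    ("FAILED", 1, 1, 1, 0),
]


def run_matrix():
    """Run every row and report whether the status and the read count matched."""
    results = []
    for state, reported, file_value, expected, expected_reads in CASES:
        reader = Mock(return_value=file_value)  # stands in for readExitFile()
        status = final_exit_status(reported, reader)
        results.append((state, status == expected, reader.call_count == expected_reads))
    return results
```

Counting calls on the mock checks that the exit file is re-read once for spot-retry statuses and not at all otherwise.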

…pts > 0)

Signed-off-by: jorgee <jorge.ejarque@seqera.io>
@netlify

netlify Bot commented Mar 16, 2026

Deploy Preview for nextflow-docs-staging canceled.

Name Link
🔨 Latest commit a0cb6e3
🔍 Latest deploy log https://app.netlify.com/projects/nextflow-docs-staging/deploys/69b95e4d7c4fbf000891bb61

@thalassemia

I like this fix more than mine (#6848). I'll close my PR, but I can confirm that this does currently cause issues when using google.batch.maxSpotAttempts as documented in that PR.

@jorgee
Contributor Author

jorgee commented Mar 17, 2026

> I like this fix more than mine (#6848). I'll close my PR, but I can confirm that this does currently cause issues when using google.batch.maxSpotAttempts as documented in that PR.

I didn't see your PR. What kind of issues are you referring to?

@thalassemia

thalassemia commented Mar 17, 2026

I described what I was seeing in my PR description. Reading the issue you linked, it's exactly what is described there.

To clarify, I mean that Nextflow in its current form causes issues. This PR would fix it. Sorry for the confusion!

…ode-even-when-the-batch-job-is-retried-and-succeeds
@bentsherman bentsherman changed the title from "Fix exit code management when spot claim with succeeded autoretry" to "Fix Google Batch exit code when spot claim is successfully retried" Mar 17, 2026
@bentsherman bentsherman merged commit 76927c2 into master Mar 17, 2026
23 of 24 checks passed
@bentsherman bentsherman deleted the 6779-nextflow-prioritizes-50001-over-exitcode-even-when-the-batch-job-is-retried-and-succeeds branch March 17, 2026 14:07

Development

Successfully merging this pull request may close these issues.

Nextflow prioritizes 50001 over .exitcode even when the Batch job is retried and succeeds.