fix(nf-tower): ensure final log checkpoint runs after thread interrupt#7097
fix(nf-tower): ensure final log checkpoint runs after thread interrupt#7097rnaidu-seqera wants to merge 2 commits intonextflow-io:masterfrom
Conversation
✅ Deploy Preview for nextflow-docs-staging canceled.
|
Signed-off-by: Rashmi Naidu <rashmi.naidu@seqera.io>
Signed-off-by: Rashmi Naidu <rashmi.naidu@seqera.io>
ea5c083 to
5fa9b4e
Compare
| if( Thread.currentThread().isInterrupted() ) | ||
| break | ||
| final interrupted = await(interval) | ||
| Thread.interrupted() // clear flag so NIO writes in saveFiles() succeed |
There was a problem hiding this comment.
Reading the description of the PR I have not clear what was causing the issue and why you set Thread.interrupted() in all the cases. Is it not causing issues in the normal execution?
There was a problem hiding this comment.
I think the main mistake is to use interrupt to signal the termination that was introduced in previous iteration.
interrupt should be use to force the thread interruption (that's not the case).
I'd suggest going in the change history, revert that implementation and proper exiting the thread
There was a problem hiding this comment.
I think it was added in #6787. I see it is in stable 25.10.x. But this branch doesn't include #6939. So, maybe the reported problem is caused by the race condition and interrupt is happening during upload. Because, if the thread is interrupted, it should break without calling saveFiles.
@rnaidu-seqera, Is the issue happening in 25.10.4?
Summary
LogsCheckpoint.await()was re-setting the thread interrupt flag after catchingInterruptedException, causing subsequent NIO writes insaveFiles()tothrow
ClosedByInterruptExceptionand leave emptynf-*.txt,nf-*.log, andtimeline-*.htmlfiles in the GCS work directoryawait()to return a boolean instead of re-setting the flag, and callingThread.interrupted()inrun()to clear the flag beforesaveFiles()executes — ensuring the final upload always succeeds regardless of interrupt sourcesynchronized(lock)guard soonFlowComplete()cannot interrupt the thread mid-uploadTest plan
LogsCheckpointTesttests pass (interval configuration via default, env var, and config file)'should perform final saveFiles when interrupted mid-sleep'passes — verifieshandler.saveFiles()is called exactly once whenonFlowComplete()interrupts the thread mid-sleep, and that the thread interrupt flag is clear at the time of the callcloudcache.enabled = true:nf-*.txtfile uploaded to GCS with full content after workflow completion (previously 0 bytes)