fix(db): dispose engine connection pool at process exit#256
Open
JanSobus wants to merge 1 commit intoreanahub:masterfrom
Open
fix(db): dispose engine connection pool at process exit#256JanSobus wants to merge 1 commit intoreanahub:masterfrom
JanSobus wants to merge 1 commit intoreanahub:masterfrom
Conversation
3f82ae2 to
8bcbd14
Compare
Register engine.dispose() as an atexit handler so that all pooled database connections are cleanly closed before any process using reana-db exits. Without this, the connection pool is garbage-collected without sending PostgreSQL Terminate messages, causing the server to log "unexpected EOF on client connection with an open transaction" at the end of every CronJob run. This fixes the noise produced by the reana-retention-rules-apply, reana-resource-quota-update, reana-system-status, and reana-interactive-session-cleanup CronJob containers, all of which share the module-level engine defined here. Closes reanahub/reana#943
8bcbd14 to
1229a02
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What this PR does
Registers
engine.dispose()as anatexithandler inreana_db/database.pyso that all pooled database connections are cleanly closed — with a proper TCP FIN — before any process usingreana-dbexits.Root cause
The module-level
engineindatabase.pyholds a SQLAlchemy connection pool. When a short-lived process exits (e.g. a CronJob container), this pool is garbage-collected without callingengine.dispose(). Python's GC does not send PostgreSQLTerminatemessages, so the server sees an abrupt disconnect and logs:For Flask-based commands this is compounded by
teardown_appcontextcalling onlysession.remove()(which returns connections to the pool) without disposing the pool itself.Affected CronJob containers
All four CronJob containers that use
reana-dbare fixed by this single change, since they all share the same module-levelengine:reana-system-statusflask reana-admin status-reportreana-retention-rules-applyflask reana-admin retention-rules-applyreana-resource-quota-updatereana-db quota resource-usage-updatereana-interactive-session-cleanupflask reana-admin interactive-session-cleanupNote:
reana-interactive-session-cleanupwas not mentioned in the original issue but is affected by the same root cause.The fix
atexitis Python's standard library — no new package dependency. The handler fires once when the process is about to exit, regardless of how the process was started (Flask CLI, plain Click CLI, or long-running server). For long-running services, this is equally correct:engine.dispose()on graceful shutdown cleanly closes all idle pooled connections.Tests
A unit test (
tests/test_database.py) verifies thatengine.disposeis registered as an atexit handler by reloading the module with a mockedatexit.registerand asserting it was called.Verifying the fix end-to-end
The unit test confirms the registration, but the most direct proof is observing PostgreSQL's connection logs before and after. To reproduce on any deployment with access to PostgreSQL logs:
1. Enable connection logging:
2. Run a CronJob command without the fix:
3. Check PostgreSQL logs — you'll see:
4. Apply this fix and run the same command again — you'll see instead:
A clean
disconnectionentry confirms thatengine.dispose()sent the properTerminatemessage before the process exited.Closes reanahub/reana#943