SQLAlchemy cronjobs do not cleanly close their connections #943

@kratsg

The SQLAlchemy sessions in the CronJob pods (reana-retention-rules-apply, reana-resource-quota-update, reana-system-status) are not cleanly closed when the container process exits. PostgreSQL logs this as:

LOG: unexpected EOF on client connection with an open transaction

The timestamps we see in our server logs match the cron job schedules (CEST = UTC+2):

  • 02:01 CEST (00:01 UTC) → reana-system-status (0 0 * * *)
  • 04:00 CEST (02:00 UTC) → reana-retention-rules-apply (0 2 * * *)
  • 02:00 CEST (00:00 UTC April 19) → reana-system-status again

It seems Flask-SQLAlchemy's teardown_appcontext handler calls db.session.remove(), which returns the connection to the pool but does not close the underlying database connection. When the process exits, the pool is destroyed without sending the PostgreSQL Terminate message, so the server sees an unexpected EOF. This is benign from a data-integrity standpoint (PostgreSQL rolls back any open transaction), but it generates log noise and leaves zombie connections for minutes until TCP timeout.
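The pooling behaviour described above can be reproduced in isolation. The following is a minimal sketch, not REANA code: it assumes plain SQLAlchemy with an in-memory SQLite database standing in for PostgreSQL, and the names `engine`, `Session` and `pooled_after_remove` are hypothetical. It shows that removing a scoped session hands the connection back to the pool, where it stays open until the engine is disposed.

```python
from sqlalchemy import create_engine, text
from sqlalchemy.orm import scoped_session, sessionmaker
from sqlalchemy.pool import QueuePool

# Assumption: SQLite stands in for PostgreSQL; QueuePool is requested
# explicitly so the pool behaves like a typical server-side pool.
engine = create_engine("sqlite://", poolclass=QueuePool)
Session = scoped_session(sessionmaker(bind=engine))


def pooled_after_remove():
    """Run one query, remove the session, and report how many
    connections are sitting idle (but still open) in the pool."""
    Session.execute(text("SELECT 1"))
    # remove() closes the session and releases its connection to the
    # pool; it does not close the underlying DBAPI connection.
    Session.remove()
    return engine.pool.checkedin()
```

If the count returned here is non-zero and the process then exits without calling `engine.dispose()`, the pooled connection is never closed explicitly.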

Recommended upstream fix (in order of preference):

  1. Call db.engine.dispose() at the end of each CLI command. This explicitly closes all pooled connections before the process exits.
  2. Or configure NullPool for the Flask app when running in CLI mode (not as a server), so that no connection pooling occurs and each connection is closed immediately after use: from sqlalchemy.pool import NullPool; app.config['SQLALCHEMY_ENGINE_OPTIONS'] = {'poolclass': NullPool}
  3. Or register an atexit handler for CLI commands that calls db.engine.dispose().
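The three options above could look roughly like the sketch below. It is written against plain SQLAlchemy rather than the actual REANA CLI code, and the helper names (`run_job`, `make_pooled_engine`, `make_cli_engine`, `dispose_at_exit`) are hypothetical; in the Flask app the engine would presumably be db.engine, as in the list above.

```python
import atexit

from sqlalchemy import create_engine, text
from sqlalchemy.pool import NullPool, QueuePool


def make_pooled_engine(url):
    """Engine with a regular connection pool, as a server would use."""
    return create_engine(url, poolclass=QueuePool)


def run_job(engine):
    """Option 1: run the job body, then always dispose of the pool.

    The SELECT is a placeholder for the real cron-job work.
    """
    try:
        with engine.connect() as conn:
            return conn.execute(text("SELECT 1")).scalar()
    finally:
        # Closes every connection currently checked in to the pool.
        engine.dispose()


def make_cli_engine(url):
    """Option 2: no pooling, so each connection is closed on release."""
    return create_engine(url, poolclass=NullPool)


def dispose_at_exit(engine):
    """Option 3: dispose of the pool when the interpreter exits normally."""
    atexit.register(engine.dispose)
    return engine
```

Note that atexit handlers run only on normal interpreter shutdown, so option 3 would not help if the container process is killed by a signal that Python does not handle.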
