Summary
reana-job-controller creates Kubernetes Jobs for each workflow step but never sets ttlSecondsAfterFinished on them. Completed and failed Jobs accumulate indefinitely, with no automatic cleanup.
Observed behaviour
On our deployment, after normal operation we observed:
- 246 completed (succeeded) Jobs sitting in the namespace
- 125 failed Jobs sitting in the namespace
- 970 total Jobs in the reana namespace
These counts grow without bound as workflows run.
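The counts above can be reproduced with the kubernetes Python client (a minimal sketch; assumes kubeconfig or in-cluster credentials with access to the reana namespace):

from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() inside the cluster
batch = client.BatchV1Api()

succeeded = failed = 0
for job in batch.list_namespaced_job("reana").items:
    if (job.status.succeeded or 0) > 0:
        succeeded += 1
    elif (job.status.failed or 0) > 0:
        failed += 1

print(f"succeeded={succeeded} failed={failed}")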
Root cause
In reana_job_controller/kubernetes_job_manager.py, the V1Job spec is constructed without setting ttl_seconds_after_finished. The built-in Kubernetes TTL-after-finished controller would garbage-collect finished Jobs automatically if this field were set, but since it is not, nothing ever removes them.
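For reference, this is roughly where the field lives in the Python client's Job objects (a minimal sketch with placeholder names, not the actual reana-job-controller construction code):

from kubernetes import client

# Minimal Job sketch; the real spec carries many more fields (env, volumes, resources, ...)
job = client.V1Job(
    api_version="batch/v1",
    kind="Job",
    metadata=client.V1ObjectMeta(name="example-step"),
    spec=client.V1JobSpec(
        ttl_seconds_after_finished=604800,  # TTL controller deletes the Job 7 days after it finishes
        template=client.V1PodTemplateSpec(
            spec=client.V1PodSpec(
                restart_policy="Never",
                containers=[client.V1Container(name="step", image="busybox")],
            )
        ),
    ),
)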
Impact
- etcd bloat: Every Job object (plus its pod) is stored in etcd. At scale this creates memory and API server pressure.
- Namespace clutter: kubectl get jobs -n reana becomes unreadable with hundreds of stale entries.
- Pod accumulation: Completed Job pods also persist, consuming entries in the kubelet's pod cache and in kubectl get pods output.
Suggested fix
Set ttl_seconds_after_finished when constructing the Job spec in kubernetes_job_manager.py, configurable via an environment variable (consistent with the existing REANA_KUBERNETES_JOBS_TIMEOUT_LIMIT pattern):
REANA_KUBERNETES_JOBS_TTL_SECONDS_AFTER_FINISHED = int(
    os.getenv("REANA_KUBERNETES_JOBS_TTL_SECONDS_AFTER_FINISHED", 604800)  # 7 days default
)

# when building V1JobSpec:
ttl_seconds_after_finished=REANA_KUBERNETES_JOBS_TTL_SECONDS_AFTER_FINISHED,
A 7-day default gives operators enough time to inspect failed jobs for debugging while still reclaiming resources automatically.
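Because the Job owns its pods, the TTL controller's cascading deletion of the Job also removes the step pods, which covers the pod accumulation noted above. A quick way to confirm that newly created Jobs carry the field (sketch, same client assumptions as above):

from kubernetes import client, config

config.load_kube_config()
batch = client.BatchV1Api()
for job in batch.list_namespaced_job("reana").items:
    print(job.metadata.name, job.spec.ttl_seconds_after_finished)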
Workaround
Until this is fixed upstream, operators can run the following cleanup commands periodically, e.g. from a CronJob:
# Delete completed jobs
kubectl delete jobs -n reana --field-selector=status.successful=1 --ignore-not-found=true
# Delete failed jobs
kubectl get jobs -n reana -o json | jq -r '
.items[] |
select((.status.active // 0) == 0) |
select((.status.succeeded // 0) == 0) |
select((.status.failed // 0) > 0) |
.metadata.name
' | xargs -r kubectl delete job -n reana --ignore-not-found=true
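A rough Python equivalent of the two commands above, for operators who prefer a single script (sketch: deletes any Job in the reana namespace that is no longer active and has succeeded or failed, using background cascading deletion so the Job's pods are removed as well):

from kubernetes import client, config

config.load_kube_config()
batch = client.BatchV1Api()

for job in batch.list_namespaced_job("reana").items:
    status = job.status
    active = status.active or 0
    finished = (status.succeeded or 0) > 0 or (status.failed or 0) > 0
    if finished and active == 0:
        batch.delete_namespaced_job(
            name=job.metadata.name,
            namespace="reana",
            propagation_policy="Background",  # also delete the Job's pods
        )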
Environment
- REANA version: 0.9.5-alpha.1 (reana-server), 0.9.4-alpha.1 (reana-job-controller)
- Kubernetes: v1.33.10