Description
Currently, the backup jobs created by the operator do not specify a backoffLimit, so they fall back to the Kubernetes default of 6 retries. When a backup fails, this produces multiple failed pods (e.g., devworkspace-backup-xxxxx) that clutter the namespace and consume resources unnecessarily.
We need the ability to configure .spec.backoffLimit for these backup jobs, ideally through backupConfig in the DevWorkspaceOperatorConfig (DWOC), so that users can control the retry behavior.
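As a rough illustration of the request, the following Go sketch shows one way a configured limit could be copied into a Job spec. It is not the operator's actual code: BackupConfig and JobSpec are simplified local stand-ins (the real Job spec lives in k8s.io/api/batch/v1, where BackoffLimit is a *int32), and the BackoffLimit field on the backup config is a hypothetical name for the option being proposed.

```go
package main

import "fmt"

// BackupConfig is a simplified stand-in for the DWOC backupConfig section.
// The BackoffLimit field is hypothetical; it is the option this issue asks for.
type BackupConfig struct {
	BackoffLimit *int32
}

// JobSpec mirrors only the one field of the Kubernetes batch/v1 JobSpec
// that matters here. In the real API this field is also a *int32.
type JobSpec struct {
	BackoffLimit *int32
}

// applyBackoffLimit copies the configured limit into the Job spec.
// When nothing is configured, the field is left nil so that the
// Kubernetes default of 6 retries still applies.
func applyBackoffLimit(cfg *BackupConfig, spec *JobSpec) {
	if cfg == nil || cfg.BackoffLimit == nil {
		return
	}
	// Copy the value so the Job spec does not alias the config object.
	limit := *cfg.BackoffLimit
	spec.BackoffLimit = &limit
}

func main() {
	limit := int32(1)
	spec := JobSpec{}
	applyBackoffLimit(&BackupConfig{BackoffLimit: &limit}, &spec)
	fmt.Println(*spec.BackoffLimit) // prints 1
}
```

Using a pointer for the hypothetical config field keeps "not set" distinguishable from an explicit 0, which would mean no retries at all.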
Current failing backup pod behavior:
NAME                              READY   STATUS   RESTARTS   AGE
devworkspace-backup-wwmkr-2fl56   0/1     Error    0          69s
devworkspace-backup-wwmkr-86g6g   0/1     Error    0          2m32s
devworkspace-backup-wwmkr-v6d4p   0/1     Error    0          3m39s
devworkspace-backup-wwmkr-vqxxh   0/1     Error    0          3m53s
devworkspace-backup-wwmkr-znz7k   0/1     Error    0          3m16s
Acceptance Criteria
- The DevWorkspaceOperatorConfig (DWOC) can define the backoffLimit for backup jobs.
- backupcronjob_controller.go injects this configured value into the Job's .spec.backoffLimit.
- A low backoffLimit (e.g., 1 or 2) successfully limits the number of pods created upon backup failure.
Additional Context