feat: migrate validation to ValidatingAdmissionPolicy (K8s 1.30+) #315

@MaxRink

Summary

Migrate complex validation logic to Kubernetes ValidatingAdmissionPolicy (GA in K8s 1.30) for improved performance and reliability.

Background

Currently, breakglass uses validating webhooks for admission validation. ValidatingAdmissionPolicy, which reached GA in Kubernetes 1.30, offers:

  • No webhook latency: validation runs inside the API server process
  • Higher availability: no external webhook dependency
  • Built-in audit logging: with the Audit validation action, failed validations are recorded in the audit event
  • CEL-based: Consistent with CRD validation rules

Proposed Migration

Phase 1: Supplement Webhooks (Non-breaking)

Deploy ValidatingAdmissionPolicy alongside existing webhooks:

apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicy
metadata:
  name: breakglass-session-limits
spec:
  failurePolicy: Fail
  matchConstraints:
    resourceRules:
      - apiGroups: ["breakglass.t-caas.telekom.com"]
        apiVersions: ["v1alpha1"]
        operations: ["CREATE", "UPDATE"]
        resources: ["breakglasssessions"]
  validations:
    - expression: "object.spec.requestedDuration.duration <= duration('8h')"
      message: "Session duration cannot exceed 8 hours"
      reason: Invalid
    - expression: "object.spec.reason.size() >= 10"
      message: "Reason must be at least 10 characters"
      reason: Invalid
---
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicyBinding
metadata:
  name: breakglass-session-limits-binding
spec:
  policyName: breakglass-session-limits
  validationActions:
    - Deny
  matchResources:
    namespaceSelector:
      matchLabels:
        breakglass.t-caas.telekom.com/enabled: "true"
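
The acceptance criteria ask for a Warn-mode rollout before switching to Deny. A minimal sketch of such a binding, assuming the same policy name and namespace label as the binding above; only validationActions differs:

```yaml
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicyBinding
metadata:
  name: breakglass-session-limits-binding
spec:
  policyName: breakglass-session-limits
  # Warn returns violations to the client as API warnings.
  # Audit records them in the audit event for the request.
  # Neither action rejects the request, so the webhook stays authoritative.
  validationActions:
    - Warn
    - Audit
  matchResources:
    namespaceSelector:
      matchLabels:
        breakglass.t-caas.telekom.com/enabled: "true"
```

Once the warnings and audit annotations show no unexpected failures, validationActions can be changed to Deny. Deny and Warn cannot be combined in the same binding.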

Phase 2: Complex Validations

apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicy
metadata:
  name: breakglass-escalation-limits
spec:
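  matchConstraints:
    resourceRules:
      - apiGroups: ["breakglass.t-caas.telekom.com"]
        # apiVersions and operations are required on every rule
        apiVersions: ["v1alpha1"]
        operations: ["CREATE", "UPDATE"]
        resources: ["breakglassescalations"]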
  paramKind:
    apiVersion: breakglass.t-caas.telekom.com/v1alpha1
    kind: ClusterConfig
  validations:
    # Max session duration based on cluster config
    - expression: "object.spec.maxValidFor.duration <= params.spec.maxSessionDuration.duration"
      message: "maxValidFor exceeds cluster maximum"
    # Session limits per escalation
    - expression: "object.status.activeSessions < params.spec.maxConcurrentSessions"
      message: "Maximum concurrent sessions reached"
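
A policy with paramKind only takes effect through a binding that supplies a paramRef. A minimal sketch of that binding follows; the binding name and the ClusterConfig object name `default` are hypothetical, and it assumes ClusterConfig is cluster-scoped:

```yaml
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicyBinding
metadata:
  name: breakglass-escalation-limits-binding  # hypothetical name
spec:
  policyName: breakglass-escalation-limits
  validationActions:
    - Deny
  paramRef:
    # Hypothetical name of the ClusterConfig object exposed as `params`.
    name: default
    # Reject requests if the referenced ClusterConfig does not exist.
    parameterNotFoundAction: Deny
```

If ClusterConfig turns out to be namespaced, paramRef.namespace can be set explicitly; when left unset, the namespace of the object under admission is used.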

Current Codebase Context

  • Existing validating webhooks in config/webhook/ and internal/ handle admission
  • No ValidatingAdmissionPolicy resources exist yet
  • No CEL validations on CRDs yet (see related issue #313, "feat: add CEL validation expressions to CRDs")
  • The Helm chart at charts/k8s-breakglass/ would need conditional VAP deployment
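
Conditional deployment in the chart could be sketched as a Helm template like the one below. The values key `validatingAdmissionPolicy.enabled` is a hypothetical name and would need a default in values.yaml; the capability check uses Helm's built-in `.Capabilities.APIVersions.Has`:

```yaml
{{- /* Render only when the (hypothetical) feature flag is set and the cluster serves the GA API. */ -}}
{{- if and .Values.validatingAdmissionPolicy.enabled (.Capabilities.APIVersions.Has "admissionregistration.k8s.io/v1/ValidatingAdmissionPolicy") }}
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicy
metadata:
  name: breakglass-session-limits
spec:
  failurePolicy: Fail
  matchConstraints:
    resourceRules:
      - apiGroups: ["breakglass.t-caas.telekom.com"]
        apiVersions: ["v1alpha1"]
        operations: ["CREATE", "UPDATE"]
        resources: ["breakglasssessions"]
  validations:
    - expression: "object.spec.reason.size() >= 10"
      message: "Reason must be at least 10 characters"
      reason: Invalid
{{- end }}
```

When rendering without cluster access, for example with `helm template`, the available API versions can be supplied through the `--api-versions` flag.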

Migration Path

Phase | Action | Kubernetes Version
1 | Add VAP alongside webhooks (audit mode) | 1.30+
2 | Move basic validations to VAP (deny mode) | 1.30+
3 | Move complex validations with paramKind | 1.30+
4 | Deprecate webhook for VAP-covered cases | 1.32+

Acceptance Criteria

Phase 1 (Minimum Viable)

  • Create ValidatingAdmissionPolicy for basic session validations (duration limits, reason length)
  • Create ValidatingAdmissionPolicyBinding with namespace selector
  • Add VAP resources to Helm chart (conditional on K8s version or feature flag)
  • Deploy the binding with validationActions set to Warn (optionally with Audit) first, then switch to Deny
  • Test alongside existing webhook (no functional change)

Phase 2 (Full Migration)

  • Add parameterized policies using paramKind (ClusterConfig-based limits)
  • Add escalation-specific validations
  • Document migration path for operators
  • E2E tests for VAP-based validation

Phase 3 (Cleanup)

  • Remove webhook validations that are fully covered by VAP
  • Update documentation to reflect VAP as primary validation mechanism

Complexity

Large: multi-phase migration involving Helm chart changes, new K8s resources, coexistence testing, and documentation.

Dependencies

References
