Demystifying OPA: Using Open Policy Agent for Policy-as-Code in Kubernetes

Introduction

Kubernetes provides powerful orchestration capabilities, but with great power comes great responsibility. Without proper guardrails, developers can inadvertently deploy insecure configurations—containers running as root, missing resource limits, or exposed sensitive ports.

Open Policy Agent (OPA) is a general-purpose policy engine that enables you to enforce security and compliance policies across your cloud-native stack. In this post, we’ll explore how OPA works, why it’s essential for Kubernetes security, and walk through a practical example of enforcing resource limits using Rego, OPA’s declarative policy language.


What is Open Policy Agent (OPA)?

Open Policy Agent is an open-source, graduated CNCF project that decouples policy decision-making from policy enforcement. OPA provides a unified framework for defining and evaluating policies across different systems—Kubernetes, Terraform, microservices, CI/CD pipelines, and more.

Key Concepts

  • Policy: A set of rules written in Rego that define what is allowed or denied
  • Query: A request sent to OPA to evaluate whether an action complies with a policy
  • Decision: OPA’s response indicating whether the action is permitted, denied, or requires modification
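
To make these terms concrete, here is a minimal, standalone Rego sketch (the package name and the registry prefix are illustrative, not tied to any Kubernetes integration). The rules are the policy, asking OPA to evaluate data.example.allow for a given input is the query, and the resulting true/false value is the decision.

package example

# Illustrative policy: only allow images pulled from an internal registry.
# "registry.internal/" is a made-up prefix for this example.
default allow = false

allow {
  startswith(input.image, "registry.internal/")
}

Querying data.example.allow with {"image": "registry.internal/app:1.0"} as input yields true; any other image prefix falls through to the default and yields false.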

Why OPA for Kubernetes?

Kubernetes admission controllers intercept API requests before resources are persisted. OPA integrates with Kubernetes through admission webhooks (validating by default, plus Gatekeeper’s optional mutation webhooks), allowing you to:

  • Reject non-compliant deployments before they reach the cluster
  • Mutate resources to apply default security settings automatically (a sketch follows this list)
  • Audit existing resources for policy violations
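
The mutation capability is handled by Gatekeeper’s dedicated mutation CRDs (Assign, AssignMetadata, ModifySet) rather than by Rego. As a rough sketch modeled on the upstream examples (verify the API version and field names against your Gatekeeper release), an Assign resource can default privileged to false on incoming Pods:

apiVersion: mutations.gatekeeper.sh/v1
kind: Assign
metadata:
  name: default-non-privileged   # illustrative name
spec:
  applyTo:
    - groups: [""]
      versions: ["v1"]
      kinds: ["Pod"]
  match:
    scope: Namespaced
    kinds:
      - apiGroups: ["*"]
        kinds: ["Pod"]
  location: "spec.containers[name:*].securityContext.privileged"
  parameters:
    assign:
      value: false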

OPA vs. Other Policy Engines

| Feature | OPA | Kyverno | PodSecurityPolicy (Deprecated) |
|---|---|---|---|
| Language | Rego (declarative) | YAML-based policies | YAML-based |
| Scope | Multi-system (K8s, Terraform, etc.) | Kubernetes-specific | Kubernetes-specific |
| Mutation Support | Yes (via Gatekeeper) | Yes | No |
| Learning Curve | Moderate | Low | Low |
| Community | Large (CNCF Graduated) | Growing | Removed in K8s 1.25 |

While Kyverno offers an easier learning curve for Kubernetes-only use cases, OPA’s flexibility makes it ideal for organizations standardizing policy enforcement across multiple platforms.


Architecture: OPA Gatekeeper in Kubernetes

OPA Gatekeeper is a Kubernetes-native integration of OPA that simplifies policy management using Custom Resource Definitions (CRDs).

Key Components

  1. ConstraintTemplates: Define reusable policy logic in Rego
  2. Constraints: Instantiate a ConstraintTemplate with specific parameters
  3. Config: Defines which Kubernetes resources to sync to OPA’s cache
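
The first two are demonstrated step by step later in this post. The Config resource is not, so here is a minimal sketch (assuming the v1alpha1 API used by current Gatekeeper releases) that syncs Namespace objects into OPA’s cache so policies can reference cluster state beyond the object under review:

apiVersion: config.gatekeeper.sh/v1alpha1
kind: Config
metadata:
  name: config                  # Gatekeeper expects exactly this name
  namespace: gatekeeper-system
spec:
  sync:
    syncOnly:
      - group: ""
        version: "v1"
        kind: "Namespace"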

How It Works

When a user attempts to create a Deployment, the API server forwards the request to Gatekeeper, which evaluates it against all active Constraints. If the resource violates a policy, the request is rejected with a detailed error message.
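
Inside the Rego of a ConstraintTemplate, that request shows up as the input document: Gatekeeper places the Kubernetes AdmissionRequest under input.review and the Constraint’s parameters under input.parameters. A commented sketch of the fields used throughout this post (the package name is illustrative and the structure is abridged):

package input_shape_reference

# input.review.operation  -> "CREATE", "UPDATE", ...
# input.review.object     -> the full resource being admitted (e.g., a Deployment)
# input.parameters        -> values supplied by the matching Constraint

# Helper collecting every container in the Deployment under review.
deployment_containers[c] {
  c := input.review.object.spec.template.spec.containers[_]
}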


Installing OPA Gatekeeper

Prerequisites

  • A running Kubernetes cluster (v1.16+)
  • kubectl configured to access the cluster

Installation

kubectl apply -f https://raw.githubusercontent.com/open-policy-agent/gatekeeper/release-3.14/deploy/gatekeeper.yaml


Verify installation:

kubectl get pods -n gatekeeper-system


You should see pods like gatekeeper-controller-manager running.


Writing Your First Rego Policy

Let’s create a policy that ensures all Kubernetes Deployments have CPU and memory resource limits defined. This prevents resource exhaustion and improves cluster stability.

Step 1: Create a ConstraintTemplate

The ConstraintTemplate defines the Rego logic.

apiVersion: templates.gatekeeper.sh/v1
kind: ConstraintTemplate
metadata:
  name: k8srequiredresourcelimits
  annotations:
    description: "Requires containers to have CPU and memory limits defined"
spec:
  crd:
    spec:
      names:
        kind: K8sRequiredResourceLimits
      validation:
        openAPIV3Schema:
          type: object
          properties:
            exemptImages:
              type: array
              items:
                type: string
  targets:
    - target: admission.k8s.gatekeeper.sh
      rego: |
        package k8srequiredresourcelimits

        # Violation occurs if a non-exempt container lacks CPU limits
        violation[{"msg": msg}] {
          container := input.review.object.spec.template.spec.containers[_]
          not exempt_image(container.image)
          not container.resources.limits.cpu
          msg := sprintf("Container '%s' must define CPU limits", [container.name])
        }

        # Violation occurs if a non-exempt container lacks memory limits
        violation[{"msg": msg}] {
          container := input.review.object.spec.template.spec.containers[_]
          not exempt_image(container.image)
          not container.resources.limits.memory
          msg := sprintf("Container '%s' must define memory limits", [container.name])
        }

        # Violation occurs if a non-exempt container defines no limits at all
        violation[{"msg": msg}] {
          container := input.review.object.spec.template.spec.containers[_]
          not exempt_image(container.image)
          not container.resources.limits
          msg := sprintf("Container '%s' must define resource limits", [container.name])
        }

        # Exemptions for specific images (e.g., monitoring agents)
        exempt_image(image) {
          exempt := input.parameters.exemptImages[_]
          startswith(image, exempt)
        }

Rego Explanation

  • violation[{"msg": msg}]: Defines a rule that produces a violation message if the condition is met
  • input.review.object: Contains the Kubernetes resource being evaluated
  • container := input.review.object.spec.template.spec.containers[_]: Iterates over all containers in the Deployment
  • not container.resources.limits.cpu: Checks if CPU limits are missing
  • exempt_image(image): Allows certain images (e.g., Prometheus exporters) to bypass the policy
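
Before applying the template to a cluster, the Rego can be exercised locally with OPA’s built-in test runner. Assuming the policy body above is saved to a file such as limits.rego (a hypothetical name), a companion test file in the same package can be run with opa test .:

package k8srequiredresourcelimits

# A container with no limits at all should trigger at least one violation.
test_missing_limits_flagged {
  count(violation) > 0 with input as {
    "review": {"object": {"spec": {"template": {"spec": {"containers": [
      {"name": "web", "image": "nginx:latest"}
    ]}}}}},
    "parameters": {"exemptImages": []}
  }
}

# An exempt image should produce no violations even without limits.
test_exempt_image_passes {
  count(violation) == 0 with input as {
    "review": {"object": {"spec": {"template": {"spec": {"containers": [
      {"name": "exporter", "image": "prometheus/node-exporter:v1.7.0"}
    ]}}}}},
    "parameters": {"exemptImages": ["prometheus/node-exporter"]}
  }
}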

Step 2: Apply the ConstraintTemplate

kubectl apply -f constraint-template.yaml


Step 3: Create a Constraint

The Constraint instantiates the ConstraintTemplate and defines enforcement scope.

apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sRequiredResourceLimits
metadata:
  name: require-deployment-limits
spec:
  match:
    kinds:
      - apiGroups: ["apps"]
        kinds: ["Deployment"]
    namespaces:
      - "production"
      - "staging"
  parameters:
    exemptImages:
      - "prometheus/node-exporter"
      - "fluent/fluent-bit"

This constraint:

  • Applies only to Deployments in production and staging namespaces
  • Exempts monitoring containers from the policy

kubectl apply -f constraint.yaml



Testing the Policy

Deploying a Non-Compliant Resource

apiVersion: apps/v1
kind: Deployment
metadata:
  name: bad-deployment
  namespace: production
spec:
  replicas: 2
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
        - name: web
          image: nginx:latest
          # Missing resources.limits

Attempting to deploy this will result in:

kubectl apply -f bad-deployment.yaml

Error from server ([require-deployment-limits] Container 'web' must define CPU limits
[require-deployment-limits] Container 'web' must define memory limits): error when creating "bad-deployment.yaml": admission webhook "validation.gatekeeper.sh" denied the request

Deploying a Compliant Resource

apiVersion: apps/v1
kind: Deployment
metadata:
  name: good-deployment
  namespace: production
spec:
  replicas: 2
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
        - name: web
          image: nginx:latest
          resources:
            limits:
              cpu: "500m"
              memory: "512Mi"
            requests:
              cpu: "250m"
              memory: "256Mi"

This deployment succeeds because resource limits are defined.


Auditing Existing Resources

Gatekeeper doesn’t just block new deployments—it can also audit existing resources for policy violations.

Viewing Audit Results

Gatekeeper’s audit controller runs periodically by default, so no extra configuration is needed. The violations it finds are recorded in each Constraint’s status field:

kubectl get K8sRequiredResourceLimits require-deployment-limits -o yaml


Example output:

status:
  auditTimestamp: "2025-10-15T10:30:00Z"
  violations:
    - enforcementAction: deny
      kind: Deployment
      message: "Container 'web' must define CPU limits"
      name: legacy-app
      namespace: production

This allows security teams to identify non-compliant resources without disrupting operations.


Advanced OPA Use Cases

1. Enforcing Image Registry Restrictions

Prevent deployment of images from untrusted registries:

package k8sallowedrepos

violation[{"msg": msg}] {
  container := input.review.object.spec.template.spec.containers[_]
  not startswith(container.image, "gcr.io/my-company/")
  msg := sprintf("Image '%s' is not from an approved registry", [container.image])
}

2. Requiring Labels for Cost Allocation

Enforce that all Deployments have a cost-center label:

package k8srequiredlabels

violation[{"msg": msg}] {
  not input.review.object.metadata.labels["cost-center"]
  msg := "All Deployments must have a 'cost-center' label"
}

3. Preventing Privileged Containers

Block containers running with privileged: true:

package k8spsprivileged

violation[{"msg": msg}] {
  container := input.review.object.spec.template.spec.containers[_]
  container.securityContext.privileged == true
  msg := sprintf("Container '%s' cannot run in privileged mode", [container.name])
}


Best Practices for OPA in Production

  1. Start with Audit Mode: Use enforcementAction: dryrun to observe violations without blocking deployments (see the sketch after this list)
  2. Version Control Policies: Store ConstraintTemplates and Constraints in Git for review and rollback
  3. Test Policies in Non-Production: Validate new policies in dev/staging before applying to production
  4. Monitor Policy Performance: OPA Gatekeeper exposes Prometheus metrics for policy evaluation latency
  5. Provide Clear Error Messages: Developers should understand why their deployment was rejected and how to fix it
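
To put the first practice into action, a Constraint can be rolled out in audit-only mode simply by setting spec.enforcementAction; here is a sketch of the earlier require-deployment-limits Constraint in dry-run form:

apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sRequiredResourceLimits
metadata:
  name: require-deployment-limits
spec:
  enforcementAction: dryrun   # record violations in status instead of denying requests
  match:
    kinds:
      - apiGroups: ["apps"]
        kinds: ["Deployment"]

Once the audit results come back clean, switching enforcementAction back to deny (the default) turns the same policy into a hard gate.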

Conclusion

Open Policy Agent transforms Kubernetes security from reactive to proactive. By codifying security requirements as Rego policies, you ensure that compliance is automated, auditable, and enforced consistently across all clusters.

Whether you’re preventing resource exhaustion, enforcing image registry restrictions, or requiring security labels, OPA Gatekeeper provides a flexible, scalable solution for Policy-as-Code in Kubernetes.

Security policies shouldn’t be documents gathering dust—they should be code that actively protects your infrastructure.

