learnclaude .dev
← Kubernetes Operators, Incident Response & AI

Lesson 2 of 3

What is a Kubernetes operator?

doc

The pattern in one sentence

A Kubernetes operator is a controller that manages a custom resource — code that watches the cluster, compares observed state to desired state, and reconciles the difference.

The two parts

  1. Custom Resource Definition (CRD) — the schema of a new object type (e.g. kind: InferenceDeployment). This extends the Kubernetes API.
  2. Controller — a long-running process that watches that object type and does whatever work is required to make the cluster match it.

The CRD is the noun; the controller is the verb.

Reconciliation

The controller's core loop:

for each change (create / update / delete):
  desired = spec of the custom resource
  observed = actual state of the cluster
  if observed != desired:
    take action to converge

This loop is idempotent — running it twice with the same inputs produces the same result. No surprises if it runs on startup, on a retry, or every 30 seconds.

Why it matters for this course

Anywhere there's operational knowledge that a human would otherwise carry in their head — when this alert fires, scale that deployment; when this metric drifts, roll this model; when this error appears, restart that pod — an operator can encode it. AI adds a second layer: the reconciler doesn't just execute a fixed playbook, it can decide based on context. That's what the rest of the course builds toward.

View source documentation →