Lesson 3 of 28

Module 1 · Skill — Codify `kind-cluster-bootstrap`

Why this becomes a skill

You just stood up a Kubernetes cluster and deployed a service to it. You'll do exactly this — probably with a different image and different port mappings — dozens of times in a platform engineering career. The procedure is worth capturing so a future-you (or Claude, in a fresh repo with no history of this session) can repeat it in a minute instead of an afternoon.

A skill is valuable because you understand the procedure. You just read every YAML and asked questions until each field made sense. That is what the skill encodes. If Claude typed it and you didn't read it, the skill is worthless — it would just propagate your confusion.

Codify

Open the Claude Code session you used in the last lesson and send:

"Codify the session we just had as a skill at .claude/skills/kind-cluster-bootstrap/SKILL.md. Include (1) a one-line description, (2) a When to use this skill trigger, (3) the step sequence: create a kind cluster config with a host-port mapping, create a cluster, create a namespace, write a Deployment + Service manifest for a stateless HTTP service, apply, wait Ready, curl the healthz. Parameterise: cluster name, namespace, image, container port, host port, replica count. Keep it tight — under 120 lines."

Now open .claude/skills/kind-cluster-bootstrap/SKILL.md and read it end-to-end. If any step is opaque (e.g., "why is wait_for_ready necessary?"), that step isn't ready. Ask Claude to expand or re-word until every line tells you what it does and why. A skill you can't read is a skill that will break in unpredictable ways when you run it later.
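
As a reference point while you read, the first step in the prompt's sequence (a kind cluster config with a host-port mapping) could be sketched as a small shell helper. This is a minimal sketch, not necessarily what Claude will generate: the function name `write_kind_config`, the output path, and the example ports are assumptions for illustration. The `extraPortMappings` fields follow kind's `v1alpha4` cluster config.

```shell
#!/usr/bin/env bash
# Hypothetical helper (name and arguments are illustrative assumptions):
# write a single-node kind config that forwards a host port to a port on
# the kind node, which is where a NodePort Service would listen.
write_kind_config() {
  local host_port="$1" node_port="$2" out_file="$3"
  cat > "$out_file" <<EOF
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
  - role: control-plane
    extraPortMappings:
      - containerPort: ${node_port}
        hostPort: ${host_port}
        protocol: TCP
EOF
}
```

For example, `write_kind_config 8080 30080 kind-config.yaml` followed by `kind create cluster --name smoketest --config kind-config.yaml` would map `localhost:8080` to port 30080 on the node. If you cannot say why `containerPort` here is the node's port rather than the pod's, that is exactly the kind of opaque step to ask Claude about.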

Refine

Send Claude these three specific refinements (not vague "make it better"):

"Add a preflight block to the skill that checks docker ps, kubectl version --client, and kind version before doing anything, and fails with a clear message if any are missing."
"Remember the failure we hit in the Break it on purpose probe — where the Service targetPort and the container containerPort drifted and traffic silently went nowhere. Add a post-apply sanity check that diffs the Deployment's containerPort against the Service's targetPort and fails loudly if they don't match."
"Name the three most likely failure modes in a Known failures and how to diagnose section: (a) cluster-create fails because Docker isn't running, (b) pods are Ready but curl fails because of port-mapping drift, (c) kind create cluster fails because a cluster with the same name already exists. For each, say what the user should check."

Read the resulting SKILL.md again. If Claude's rewrite lost something important, ask it to put it back — don't accept drift.
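
When you review the result, it helps to have a rough picture of what the preflight and the port-drift check could look like. The sketch below is one possible shape, under stated assumptions: the function names are hypothetical, the Deployment and Service are assumed to share a name, and only the first container port and first Service port are compared. A named (non-numeric) `targetPort` would be reported as drift by this simple comparison.

```shell
#!/usr/bin/env bash
# Hypothetical preflight: verify each tool exists and responds before any
# cluster work starts. Messages go to stderr; non-zero return means "stop".
preflight() {
  local tool missing=0
  for tool in docker kubectl kind; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "preflight: '$tool' not found on PATH; install it and retry" >&2
      missing=1
    fi
  done
  if [ "$missing" -ne 0 ]; then
    return 1
  fi
  if ! docker ps >/dev/null 2>&1; then
    echo "preflight: cannot reach the Docker daemon (is Docker running?)" >&2
    return 1
  fi
  if ! kubectl version --client >/dev/null 2>&1; then
    echo "preflight: 'kubectl version --client' failed" >&2
    return 1
  fi
  if ! kind version >/dev/null 2>&1; then
    echo "preflight: 'kind version' failed" >&2
    return 1
  fi
}

# Compare two port values and fail loudly on mismatch or missing data.
ports_match() {
  local container_port="$1" target_port="$2"
  if [ -z "$container_port" ] || [ -z "$target_port" ]; then
    echo "port check: could not read containerPort or targetPort" >&2
    return 1
  fi
  if [ "$container_port" != "$target_port" ]; then
    echo "port drift: Deployment containerPort=$container_port, Service targetPort=$target_port" >&2
    return 1
  fi
}

# Assumption: Deployment and Service share the same name; first port only.
check_port_drift() {
  local namespace="$1" name="$2" container_port target_port
  container_port="$(kubectl -n "$namespace" get deployment "$name" \
    -o jsonpath='{.spec.template.spec.containers[0].ports[0].containerPort}')"
  target_port="$(kubectl -n "$namespace" get service "$name" \
    -o jsonpath='{.spec.ports[0].targetPort}')"
  ports_match "$container_port" "$target_port"
}
```

A call such as `preflight && check_port_drift smoke podinfo` returns non-zero with a message on stderr for any failure, which gives the skill something concrete to report.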

Validate in a fresh context — happy path AND adversarial

This is the highest-value step. A skill that only works because it's in the session that built it is not a skill.

4a — Happy path

Open a new Claude Code session with no prior conversation history.
Pass it the skill: "Read .claude/skills/kind-cluster-bootstrap/SKILL.md and use it to deploy nginx (image nginx:1.27-alpine, container port 80) to a fresh cluster named 'smoketest' in namespace 'smoke', host port 8080."
Watch. A working skill produces: a cluster, a namespace, a Deployment + Service, a curl that returns nginx's welcome page — without you intervening.
Tear it down: kind delete cluster --name smoketest.

If the skill fails, the fix goes back in section 3 (Refine), not here. Re-validate afterwards.
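
To judge what the skill writes during the happy path, it may help to compare it against a minimal Deployment + Service pair. The sketch below renders one from parameters. The function name `render_manifest`, the `app` label convention, and the use of a NodePort Service are assumptions for illustration; your skill may structure its manifest differently.

```shell
#!/usr/bin/env bash
# Hypothetical renderer: print a Deployment and a NodePort Service to stdout.
# The Service's targetPort is generated from the same variable as the
# container's containerPort, so the two cannot drift apart.
render_manifest() {
  local name="$1" namespace="$2" image="$3"
  local container_port="$4" node_port="$5" replicas="$6"
  cat <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ${name}
  namespace: ${namespace}
spec:
  replicas: ${replicas}
  selector:
    matchLabels:
      app: ${name}
  template:
    metadata:
      labels:
        app: ${name}
    spec:
      containers:
        - name: ${name}
          image: ${image}
          ports:
            - containerPort: ${container_port}
---
apiVersion: v1
kind: Service
metadata:
  name: ${name}
  namespace: ${namespace}
spec:
  type: NodePort
  selector:
    app: ${name}
  ports:
    - port: ${container_port}
      targetPort: ${container_port}
      nodePort: ${node_port}
EOF
}
```

For the prompt above, `render_manifest nginx smoke nginx:1.27-alpine 80 30080 1 | kubectl apply -f -` would be one way to apply it, assuming the kind config forwards host port 8080 to node port 30080.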

4b — Adversarial

In the same fresh session (or another new one), feed the skill deliberately wrong inputs:

Port mismatch: ask the skill to deploy podinfo with container port: 9898 but in a cluster whose kind-config forwards to nodePort: 31080 instead of 30080. Does the skill notice the host-port mapping won't work before applying, or does it silently apply and leave you with broken traffic?
Name collision: ask the skill to create a cluster named smoketest while one already exists with that name. Does it fail loudly with a clear message, or does it run partial commands and leave junk?
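Missing tool: temporarily hide your kind binary, recording its path first so you can put it back (KIND_BIN="$(command -v kind)"; mv "$KIND_BIN" /tmp/kind-hidden). Ask the skill to create a cluster. Does the preflight catch this and tell you what's missing, or does it run half the commands before failing confusingly? Restore the binary afterwards from the same shell (mv /tmp/kind-hidden "$KIND_BIN").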

Silent wrongness on any of these is a bug. Fix it in section 3 (Refine) and re-run this section. A skill that only works under perfect conditions is a toy.
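
For the name-collision case, "fail loudly" could be as small as the sketch below. The function names are hypothetical. It relies on `kind get clusters` printing one cluster name per line, and uses an exact whole-line match so that `smoke` does not match `smoketest`.

```shell
#!/usr/bin/env bash
# Return 0 if a kind cluster with exactly this name exists.
cluster_exists() {
  local name="$1"
  # -F: literal string, -x: whole-line match, -q: quiet
  kind get clusters 2>/dev/null | grep -Fxq -- "$name"
}

# Hypothetical guard: refuse to continue if the name is already taken.
ensure_cluster_absent() {
  local name="$1"
  if cluster_exists "$name"; then
    echo "error: kind cluster '$name' already exists;" \
         "run 'kind delete cluster --name $name' or choose another name" >&2
    return 1
  fi
}
```

Running `ensure_cluster_absent smoketest` before `kind create cluster` turns a mid-run failure into a clear message before anything is created.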

Promote deterministic commands to `scripts/`

Look over the SKILL.md. Some steps are judgement ("check whether the image is in a private registry and if so..."). Others are deterministic and should run the same way every time: the kind create cluster --config=... invocation, the kubectl wait / kubectl get pods polling, the curl-loop healthcheck.

Send Claude:

"Extract the deterministic commands from this skill into .claude/skills/kind-cluster-bootstrap/scripts/bootstrap.sh (accepts arguments: cluster name, namespace, image, container port, host port, replicas). Update SKILL.md to invoke the script for those steps, leaving prose only where judgement is needed (preflight interpretation, choosing parameters, interpreting a failed healthcheck)."

Read the generated bootstrap.sh. Make sure it starts with set -euo pipefail (exit on the first failed command, unset variable, or failed pipeline stage) and that every argument expansion is double-quoted. Prose for judgement, bash for determinism.
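
As a yardstick for that review, here is a minimal sketch of the shape such a script might take. It is not the script Claude will produce: the function names, the 120-second timeout, and the retry counts are assumptions, and the manifest-rendering and apply steps are omitted. What to look for is the strict mode on line 2, the argument validation, and the quoting of every expansion.

```shell
#!/usr/bin/env bash
set -euo pipefail

usage() {
  echo "usage: bootstrap.sh <cluster> <namespace> <image> <container-port> <host-port> <replicas>" >&2
}

is_uint() { [[ "$1" =~ ^[0-9]+$ ]]; }

# Validate and store the six positional arguments named in the prompt.
# IMAGE, CONTAINER_PORT and REPLICAS feed the manifest step (not shown).
parse_args() {
  if [ "$#" -ne 6 ]; then
    usage
    return 2
  fi
  CLUSTER="$1"; NAMESPACE="$2"; IMAGE="$3"
  CONTAINER_PORT="$4"; HOST_PORT="$5"; REPLICAS="$6"
  local n
  for n in "$CONTAINER_PORT" "$HOST_PORT" "$REPLICAS"; do
    if ! is_uint "$n"; then
      echo "error: '$n' is not a non-negative integer" >&2
      return 2
    fi
  done
}

# $1 is the path to a kind config file (path is the caller's choice).
create_cluster() {
  kind create cluster --name "$CLUSTER" --config "$1"
}

# Block until every Deployment in the namespace reports Available.
wait_ready() {
  kubectl --namespace "$NAMESPACE" wait --for=condition=Available \
    deployment --all --timeout=120s
}

# Curl-loop healthcheck: retry until the URL answers with success.
probe_healthz() {
  local url="$1" attempts="${2:-30}" i
  for ((i = 1; i <= attempts; i++)); do
    if curl --fail --silent --max-time 2 "$url" >/dev/null; then
      return 0
    fi
    sleep 2
  done
  echo "error: $url did not return success after $attempts attempts" >&2
  return 1
}
```

A real script would end by calling these in order, for example `parse_args "$@"`, then `create_cluster`, the apply step, `wait_ready`, and `probe_healthz "http://localhost:${HOST_PORT}/healthz"`. Because of `set -e`, the first failing step stops the run instead of letting later commands act on a half-built cluster.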

Know the boundary

At the top of .claude/skills/kind-cluster-bootstrap/SKILL.md, ensure two sentences exist (edit them by hand if Claude's draft is vague):

This skill handles: stateless HTTP services, single-cluster local development with kind, parameterised image/ports/replicas, preflight for docker, kubectl, and kind, and post-apply sanity checks for port-mapping drift.
This skill does NOT handle: stateful workloads (no PVC provisioning), Ingress / TLS / external DNS, cloud clusters (GKE/EKS/AKS — see terraform-kind-provision for provisioning patterns, module 7 for IAM), RBAC / ServiceAccount creation beyond the default SA (see rbac-iam-scaffold in module 7), or any service that needs to talk to the Kubernetes API or a cloud SDK (IAM required).

A SKILL.md without a handles / does NOT handle pair is not done. The boundary statement is what keeps a fresh session from confidently mis-applying the skill to a case it wasn't built for.

You're done when

A fresh Claude Code session, given only this skill, (a) deploys nginx to a fresh cluster from one prompt, and (b) fails loudly and informatively when you feed it the adversarial inputs from section 4b. If either half fails, the skill isn't ready — fix and re-validate.