Lesson 13 of 28
Module 4 · Concepts — State, providers, and why Terraform wins
What Terraform actually is
Terraform is a declarative provisioner. You describe the resources you want ("one VPC, three subnets, a GKE cluster with these settings") and Terraform works out which API calls will create, update, or delete resources to reach that state. It is one of the most widely used infrastructure-as-code tools, with providers for every major cloud.
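To make "declarative" concrete, here is a minimal sketch of the "one VPC, three subnets" part of that description for GCP. The resource names, region, and CIDR range are hypothetical, and the sketch assumes a google provider is already configured.

```hcl
# Hypothetical names and ranges; assumes a configured google provider.
resource "google_compute_network" "main" {
  name                    = "demo-vpc"
  auto_create_subnetworks = false # we declare the subnets ourselves
}

resource "google_compute_subnetwork" "main" {
  count   = 3 # the desired number of subnets, not a loop to run
  name    = "demo-subnet-${count.index}"
  region  = "europe-west1"
  network = google_compute_network.main.id

  # Carves 10.0.0.0/24, 10.0.1.0/24, 10.0.2.0/24 out of the /16.
  ip_cidr_range = cidrsubnet("10.0.0.0/16", 8, count.index)
}
```

Nothing here says how to create the resources or in what order. Terraform infers that the subnets depend on the network from the reference to google_compute_network.main.id, and changing count from 3 to 4 should produce a plan that adds a single subnet rather than rebuilding everything.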
Three ideas carry the whole tool:
- Providers: plugins that know how to talk to a specific API: AWS, GCP, Azure, Cloudflare, GitHub, Kubernetes, even kind. A Terraform config is a composition of providers.
- State: a JSON file (terraform.tfstate) that records which resources Terraform thinks it's managing. Every plan and apply diffs your config against state. By default Terraform refreshes state from the provider APIs first, but it only looks at resources already recorded there. This is the source of most Terraform weirdness you'll hit in production, and desynced state is a common cause of outages.
- Plan and apply: plan shows the diff between current state and desired state; apply executes it. Never skip reviewing the plan. Production outages have been caused by apply-without-plan (for example, apply -auto-approve with nobody reading the diff).
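As an illustration of "a composition of providers", the sketch below declares two unrelated providers in one config. The project ID, region, and version constraints are assumptions chosen for the example, not values from this course.

```hcl
terraform {
  required_providers {
    google = {
      source  = "hashicorp/google"
      version = "~> 5.0" # assumed constraint; pin whatever you have tested
    }
    cloudflare = {
      source  = "cloudflare/cloudflare"
      version = "~> 4.0" # assumed constraint
    }
  }
}

provider "google" {
  project = "my-project-id" # hypothetical project
  region  = "europe-west1"
}

# With no arguments, this provider reads its token from the
# CLOUDFLARE_API_TOKEN environment variable.
provider "cloudflare" {}
```

Running terraform init downloads both plugins. Resources from either provider can then reference each other in the same config, and all of them are tracked in the same state file.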
You'll also encounter OpenTofu, a drop-in community fork created after HashiCorp changed Terraform's license. For everything in this module, the terraform and tofu commands are interchangeable.
The HCL vocabulary
Terraform's config language is HCL (HashiCorp Configuration Language). Four block types cover most of what you'll read and write:
- provider: configure a plugin (which region, which credentials).
- resource: declare a thing to create: resource "google_container_cluster" "main" { ... }.
- variable: parameterise your config (region, cluster name, size).
- output: surface values from state (cluster endpoint, kubeconfig).
You'll use all four in the task.
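A minimal sketch that puts the four block types together, built around the google_container_cluster example above. The variable names, cluster name, and default region are hypothetical, and a real cluster would need more settings than this.

```hcl
# variable: inputs that parameterise the config
variable "project_id" {
  type        = string
  description = "GCP project to deploy into"
}

variable "region" {
  type    = string
  default = "europe-west1"
}

# provider: which plugin to use and how to configure it
provider "google" {
  project = var.project_id
  region  = var.region
}

# resource: the thing to create
resource "google_container_cluster" "main" {
  name               = "demo-cluster"
  location           = var.region
  initial_node_count = 1
}

# output: a value read back from state after apply
output "cluster_endpoint" {
  value = google_container_cluster.main.endpoint
}
```

Values flow in one direction: variables feed the provider and resource blocks through var.<name>, and outputs read resource attributes through <type>.<name>.<attribute>.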
Why people use Terraform over the cloud console
- Reproducibility: infrastructure defined in config can be destroyed and rebuilt identically. Clickops in the console can't.
- Review: infrastructure changes go through PRs. Someone sees the plan diff before it lands.
- Drift detection: terraform plan tells you if someone changed things out-of-band.
- Multi-cloud: the same mental model (providers, resources, state) works everywhere.
The cost: you have to write it. For one-off throwaway work, the console is faster. For anything you'll run in production, it's worth the upfront investment.
The state backend question
On your laptop, state lives in a local terraform.tfstate file. Do not do this in production: state can contain secrets in plain text (credentials, DB passwords), and concurrent runs against shared state must be serialized so they cannot corrupt it. Move state to a remote backend with locking: an S3 bucket + DynamoDB table for AWS, a GCS bucket for GCP, or Terraform Cloud (now HCP Terraform). The config looks like:
terraform {
  backend "gcs" {
    bucket = "my-tf-state"
    prefix = "prod/gke"
  }
}
For this module you'll use local state because you're provisioning a local kind cluster. For production work, always remote.
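For comparison, here is a sketch of the AWS variant mentioned above (S3 bucket plus DynamoDB table). The bucket, key, region, and table names are hypothetical, and both the bucket and the table must already exist before terraform init can use them.

```hcl
terraform {
  backend "s3" {
    bucket         = "my-tf-state"                # hypothetical bucket
    key            = "prod/eks/terraform.tfstate" # path of the state object
    region         = "us-east-1"
    dynamodb_table = "tf-state-lock" # hypothetical table, used for locking
    encrypt        = true            # server-side encryption of the state object
  }
}
```

The GCS backend handles locking itself, which is why the GCS example needs no second resource. Recent Terraform releases also offer S3-native locking as an alternative to the DynamoDB table, so check the backend docs for the version you run.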
What you'll build in the task
A Terraform config that:
- Provisions a kind cluster
- Generates a kubeconfig you can use with kubectl
- Outputs the cluster's API endpoint
Then you'll extend the module 2 Helm chart (or module 3 pipeline) to deploy against that cluster.
The bonus section at the end swaps the kind provider for google to provision a real GKE Autopilot cluster — same Terraform skills, real cloud bill. Keep your wallet in mind if you try it.
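As a rough preview of the kind config described above, here is a minimal sketch. It assumes the community tehcyx/kind provider and its kind_cluster resource. The task may use a different provider or attribute names, so treat every identifier below as an assumption to check against the provider docs.

```hcl
terraform {
  required_providers {
    kind = {
      source = "tehcyx/kind" # assumption: one community kind provider
    }
  }
}

provider "kind" {}

resource "kind_cluster" "main" {
  name           = "module4" # hypothetical cluster name
  wait_for_ready = true      # block until the control plane is up
}

# The cluster's API endpoint.
output "api_endpoint" {
  value = kind_cluster.main.endpoint
}

# The kubeconfig contains client credentials, so mark it sensitive
# to keep it out of normal plan and apply output.
output "kubeconfig" {
  value     = kind_cluster.main.kubeconfig
  sensitive = true
}
```

State stays local here, which matches the choice made for this module. Remember that the kubeconfig still ends up in terraform.tfstate in plain text, which is exactly the reason production state belongs in a remote backend.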
Relevant links
- Terraform language docs: HCL syntax reference.
- Kubernetes provider (terraform-provider-kubernetes): for deploying manifests from Terraform (useful but often overkill; pipelines usually keep deploy in Helm).
- kind provider: the Terraform provider for kind you'll use in the task.