Lesson 13 of 28

Module 4 · Concepts — State, providers, and why Terraform wins

What Terraform actually is

Terraform is a declarative provisioning tool. You describe the resources you want ("one VPC, three subnets, a GKE cluster with these settings"), and Terraform works out which API calls to make, creating, updating, or deleting resources until real infrastructure matches that description. It is the de facto standard for infrastructure-as-code, with providers for every major cloud.
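
To make "declarative" concrete, here is a minimal sketch of the VPC-and-subnets part of that description using the google provider. The project ID, region, resource names, and CIDR range are placeholders chosen for illustration, not values from this course.

```hcl
# Hypothetical project and region; substitute your own.
provider "google" {
  project = "my-project"
  region  = "europe-west1"
}

# One VPC. Subnets are declared explicitly below, so auto-creation is off.
resource "google_compute_network" "main" {
  name                    = "demo-vpc"
  auto_create_subnetworks = false
}

# Three subnets from one block: count stamps out indices 0, 1 and 2.
# cidrsubnet carves /24 ranges out of the /16:
# 10.0.0.0/24, 10.0.1.0/24, 10.0.2.0/24.
resource "google_compute_subnetwork" "main" {
  count         = 3
  name          = "demo-subnet-${count.index}"
  region        = "europe-west1"
  network       = google_compute_network.main.id
  ip_cidr_range = cidrsubnet("10.0.0.0/16", 8, count.index)
}
```

Nothing here says "create the network first, then the subnets". Terraform infers that order from the reference to google_compute_network.main.id, and lowering count to 2 later would make the next plan propose deleting the third subnet.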

Three ideas carry the whole tool:

Providers — plugins that know how to talk to a specific API: AWS, GCP, Azure, Cloudflare, GitHub, Kubernetes, even kind. A Terraform config is a composition of providers.
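State — a JSON file (terraform.tfstate) that records which real resources Terraform believes it manages and maps each one to a block in your config. By default, plan and apply first refresh state from the provider APIs, then diff your config against that refreshed state. Terraform only sees what state tracks, so when state and reality diverge (a resource created by hand, a stale or lost state file), plans become misleading. This is the source of most Terraform weirdness you'll hit in production, and desynced state is a common cause of outages.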
Plan and apply — plan shows the diff between current state and desired state; apply executes the plan. Never skip the plan step. Production outages have been caused by apply-without-plan.
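
As a rough illustration of "a composition of providers", a config declares the plugins it needs in a required_providers block, and terraform init downloads them. The two providers and the version constraints below are examples picked for this sketch, not requirements of this module.

```hcl
terraform {
  required_providers {
    # Plugin that talks to the Google Cloud APIs.
    google = {
      source  = "hashicorp/google"
      version = "~> 5.0" # illustrative constraint
    }
    # Plugin that talks to a Kubernetes API server.
    kubernetes = {
      source  = "hashicorp/kubernetes"
      version = "~> 2.0" # illustrative constraint
    }
  }
}
```

Each resource type is prefixed with its provider's name (google_..., kubernetes_...), which is how Terraform decides which plugin handles a given resource block.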

You'll also encounter OpenTofu, a community-maintained, drop-in fork created after HashiCorp moved Terraform from an open-source license to the Business Source License in 2023. For everything in this module, the terraform and tofu commands are interchangeable.

The HCL vocabulary

Terraform's config language is HCL (HashiCorp Configuration Language). Four block types cover most of what you'll read and write:

provider — configure a plugin (which region, which credentials).
resource — declare a thing to create: resource "google_container_cluster" "main" { ... }.
variable — parameterise your config (region, cluster name, size).
output — surface values from state (cluster endpoint, kubeconfig).

You'll use all four in the task.
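
A minimal sketch showing the four block types side by side, built around the google_container_cluster example above. The variable names, defaults, and the choice of an Autopilot cluster are assumptions for illustration; the task itself targets kind rather than GKE.

```hcl
# variable: inputs that parameterise the config. Defaults are placeholders.
variable "project_id" {
  type        = string
  description = "GCP project to deploy into"
}

variable "region" {
  type    = string
  default = "europe-west1"
}

variable "cluster_name" {
  type    = string
  default = "demo"
}

# provider: configure the plugin (which project, which region).
# With no credentials set here, the provider falls back to
# application default credentials from the environment.
provider "google" {
  project = var.project_id
  region  = var.region
}

# resource: the thing to create, addressed as google_container_cluster.main.
resource "google_container_cluster" "main" {
  name             = var.cluster_name
  location         = var.region
  enable_autopilot = true
}

# output: surface a value from state once apply has finished.
output "cluster_endpoint" {
  value = google_container_cluster.main.endpoint
}
```

Note how the blocks reference each other: var.region feeds both the provider and the resource, and the output reads an attribute that only exists after the cluster has been created.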

Why people use Terraform over the cloud console

Reproducibility — infrastructure defined in config can be destroyed and rebuilt the same way every time. Clickops in the console can't.
Review — infrastructure changes go through PRs. Someone sees the plan diff before it lands.
Drift detection — terraform plan tells you if someone changed things out-of-band.
Multi-cloud — the same mental model (providers, resources, state) works everywhere.

The cost: you have to write it. For one-off throwaway work, the console is faster. For anything you'll run in production, it's worth the upfront investment.

The state backend question

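On your laptop, state lives in a local terraform.tfstate file. Do not do this in production: state often contains secrets in plaintext (credentials, DB passwords), and concurrent runs against the same state must be serialised so that two applies can't corrupt it. Move state to a remote backend that supports locking: an S3 bucket with a DynamoDB lock table on AWS (recent Terraform releases can also lock natively in S3), a GCS bucket on GCP, or Terraform Cloud (now called HCP Terraform). The config looks like: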

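terraform {
  backend "gcs" {
    bucket = "my-tf-state" # an existing GCS bucket; Terraform does not create it
    prefix = "prod/gke"    # path inside the bucket under which state objects are stored
  }
}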
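
For comparison, the AWS pairing mentioned above might look like the sketch below. It is an alternative to the GCS block, not an addition (a config has a single backend), and the bucket, key, region, and table names are hypothetical.

```hcl
terraform {
  backend "s3" {
    bucket         = "my-tf-state"                # existing S3 bucket (hypothetical name)
    key            = "prod/eks/terraform.tfstate" # object path for this state file
    region         = "us-east-1"
    dynamodb_table = "tf-state-locks"             # existing table used for state locking
    encrypt        = true                         # server-side encryption of the state object
  }
}
```

Terraform does not allow variable references inside a backend block, so values like these are written literally or supplied at terraform init time with -backend-config.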

For this module you'll use local state because you're provisioning a local kind cluster. For production work, always remote.

What you'll build in the task

A Terraform config that:

Provisions a kind cluster
Generates a kubeconfig you can use with kubectl
Outputs the cluster's API endpoint
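
As a hedged preview of the shape such a config could take, the sketch below assumes the community tehcyx/kind provider. The lesson does not say which kind provider the task uses, so treat the source address and the attribute names (endpoint, kubeconfig) as assumptions to check against that provider's documentation. The cluster name is a placeholder.

```hcl
terraform {
  required_providers {
    kind = {
      source = "tehcyx/kind" # assumption: community kind provider
    }
  }
}

provider "kind" {}

variable "cluster_name" {
  type    = string
  default = "tf-lesson" # placeholder name
}

# Creates a local kind cluster and waits until its control plane is ready.
resource "kind_cluster" "this" {
  name           = var.cluster_name
  wait_for_ready = true
}

# API server address of the new cluster.
output "endpoint" {
  value = kind_cluster.this.endpoint
}

# Kubeconfig contents. Marked sensitive so plan and apply
# do not print the embedded client credentials.
output "kubeconfig" {
  value     = kind_cluster.this.kubeconfig
  sensitive = true
}
```

After an apply, running terraform output -raw kubeconfig > kubeconfig.yaml and then kubectl --kubeconfig kubeconfig.yaml get nodes would be one way to confirm the cluster is reachable.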

Then you'll extend the Module 2 Helm chart (or the Module 3 pipeline) to deploy against that cluster.

The bonus section at the end swaps the kind provider for google to provision a real GKE Autopilot cluster — same Terraform skills, real cloud bill. Keep your wallet in mind if you try it.

Relevant links

Terraform language docs — HCL syntax reference.
Kubernetes provider (terraform-provider-kubernetes) — for deploying manifests from Terraform (useful but often overkill; pipelines usually keep deploy in Helm).
kind provider — the Terraform provider for kind you'll use in the task.
