# Hetzner Kubernetes Cluster

Production-ready Kubernetes cluster on Hetzner Cloud using Terraform and Ansible.

## Architecture

| Component | Details |
|---|---|
| Control Plane | 3x CX23 (HA) |
| Workers | 4x CX33 |
| Total Cost | €28.93/mo |
| K8s | k3s (latest, HA) |
| Addons | Hetzner CCM + CSI + Prometheus + Grafana + Loki |
| Access | SSH/API restricted to Tailnet |
| Bootstrap | Terraform + Ansible |

### Cluster Resources

- 22 vCPU total (6 control plane + 16 workers)
- 44 GB RAM total (12 control plane + 32 workers)
- 440 GB SSD storage
- 140 TB bandwidth allocation
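The totals above can be recomputed from the per-node specs implied by them (CX23: 2 vCPU / 4 GB / 40 GB, CX33: 4 vCPU / 8 GB / 80 GB, 20 TB traffic per node — inferred from the stated totals, not quoted from Hetzner):

```shell
# Sanity-check the cluster totals from per-node specs.
cp=3; workers=4
echo "vCPU:       $(( cp * 2  + workers * 4 ))"   # 22
echo "RAM (GB):   $(( cp * 4  + workers * 8 ))"   # 44
echo "SSD (GB):   $(( cp * 40 + workers * 80 ))"  # 440
echo "Traffic TB: $(( (cp + workers) * 20 ))"     # 140
```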

## Prerequisites

### 1. Hetzner Cloud API Token

1. Go to the Hetzner Cloud Console
2. Select your project (or create a new one)
3. Navigate to Security → API Tokens
4. Click Generate API Token
5. Set description: `k8s-cluster-terraform`
6. Select permissions: Read & Write
7. Click Generate API Token
8. Copy the token immediately - it won't be shown again!
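If you'd rather not write the token into a file, Terraform also picks up variables from `TF_VAR_`-prefixed environment variables. A sketch, assuming the variable is named `hcloud_token` as in this repo's terraform.tfvars:

```shell
# Terraform reads TF_VAR_<name> automatically, so the token
# never has to live in terraform.tfvars.
export TF_VAR_hcloud_token="your-hetzner-api-token"
```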

### 2. Backblaze B2 Bucket (for Terraform State)

1. Go to Backblaze B2
2. Click Create a Bucket
3. Set bucket name: `k8s-terraform-state` (must be globally unique)
4. Choose Private access
5. Click Create Bucket
6. Create an application key:
   - Go to App Keys → Add a New Application Key
   - Name: `terraform-state`
   - Allow access to: `k8s-terraform-state` bucket only
   - Type: Read and Write
   - Copy the keyID (access key) and applicationKey (secret key)
7. Note your bucket's S3 endpoint (e.g., https://s3.eu-central-003.backblazeb2.com)

### 3. SSH Key Pair

```bash
ssh-keygen -t ed25519 -C "k8s@hetzner" -f ~/.ssh/hetzner_k8s
```

### 4. Local Tools

Install the following on your workstation:

- terraform
- ansible
- kubectl
- python3

## Setup

### 1. Clone Repository

```bash
git clone <your-gitea-repo>/HetznerTerra.git
cd HetznerTerra
```

### 2. Configure Variables

```bash
cp terraform.tfvars.example terraform.tfvars
```

Edit terraform.tfvars:

```hcl
hcloud_token = "your-hetzner-api-token"

ssh_public_key  = "~/.ssh/hetzner_k8s.pub"
ssh_private_key = "~/.ssh/hetzner_k8s"

s3_access_key = "your-backblaze-key-id"
s3_secret_key = "your-backblaze-application-key"
s3_endpoint   = "https://s3.eu-central-003.backblazeb2.com"
s3_bucket     = "k8s-terraform-state"

tailscale_auth_key = "tskey-auth-..."
tailscale_tailnet  = "yourtailnet.ts.net"

restrict_api_ssh_to_tailnet = true
tailnet_cidr                = "100.64.0.0/10"
enable_nodeport_public      = false

allowed_ssh_ips = []
allowed_api_ips = []
```
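A variables.tf entry can reject bad values at plan time. A minimal sketch, assuming the variable names above (not necessarily this repo's exact definitions):

```hcl
variable "tailnet_cidr" {
  type    = string
  default = "100.64.0.0/10"

  validation {
    condition     = can(cidrhost(var.tailnet_cidr, 0))
    error_message = "tailnet_cidr must be a valid CIDR block."
  }
}

variable "hcloud_token" {
  type      = string
  sensitive = true # keeps the token out of plan output
}
```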

### 3. Initialize Terraform

```bash
cd terraform

# Create backend config file (or use CLI args)
cat > backend.hcl << EOF
endpoint                    = "https://s3.eu-central-003.backblazeb2.com"
bucket                      = "k8s-terraform-state"
access_key                  = "your-backblaze-key-id"
secret_key                  = "your-backblaze-application-key"
skip_requesting_account_id  = true
EOF

terraform init -backend-config=backend.hcl
```

### 4. Plan and Apply

```bash
terraform plan -var-file=../terraform.tfvars
terraform apply -var-file=../terraform.tfvars
```

### 5. Generate Ansible Inventory

```bash
cd ../ansible
python3 generate_inventory.py
```
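The script itself is not shown here, but the usual approach is to read `terraform output -json` and render an INI inventory. A hypothetical sketch — the output names `control_plane_ips` and `worker_ips` are assumptions, not this repo's actual outputs:

```python
import json

def build_inventory(tf_output: dict) -> str:
    """Render a minimal Ansible INI inventory from `terraform output -json`."""
    cp = tf_output["control_plane_ips"]["value"]
    workers = tf_output["worker_ips"]["value"]
    lines = ["[control_plane]"]
    lines += [f"k8s-cluster-cp-{i+1} ansible_host={ip}" for i, ip in enumerate(cp)]
    lines += ["", "[workers]"]
    lines += [f"k8s-cluster-worker-{i+1} ansible_host={ip}" for i, ip in enumerate(workers)]
    return "\n".join(lines) + "\n"

# Feed it the serialized outputs, e.g.:
#   raw = subprocess.check_output(["terraform", "-chdir=../terraform", "output", "-json"])
#   print(build_inventory(json.loads(raw)))
```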

### 6. Bootstrap Cluster

```bash
ansible-playbook site.yml
```

### 7. Get Kubeconfig

```bash
export KUBECONFIG=$(pwd)/outputs/kubeconfig
kubectl get nodes
```

The kubeconfig endpoint is rewritten to the primary control plane's tailnet hostname (k8s-cluster-cp-1.<your-tailnet>).
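The rewrite amounts to replacing the server address in the fetched kubeconfig. A standalone demonstration of the idea — the file path and tailnet name are illustrative, not what the playbook actually uses:

```shell
# Stand-in kubeconfig pointing at a private IP.
cat > /tmp/kubeconfig-demo <<'EOF'
clusters:
- cluster:
    server: https://10.0.1.1:6443
EOF

# Point the client at the control plane's tailnet hostname instead.
sed -i 's|server: https://.*:6443|server: https://k8s-cluster-cp-1.yourtailnet.ts.net:6443|' /tmp/kubeconfig-demo
grep server: /tmp/kubeconfig-demo
```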

## Gitea CI/CD

This repository includes Gitea workflows for:

- terraform-plan: runs on PRs, shows planned changes
- terraform-apply: runs on the main branch after merge
- ansible-deploy: runs after terraform apply

### Required Gitea Secrets

Set these in your Gitea repository settings (Settings → Actions → Secrets):

| Secret | Description |
|---|---|
| HCLOUD_TOKEN | Hetzner Cloud API token |
| S3_ACCESS_KEY | Backblaze B2 keyID |
| S3_SECRET_KEY | Backblaze B2 applicationKey |
| S3_ENDPOINT | Backblaze S3 endpoint (e.g., https://s3.eu-central-003.backblazeb2.com) |
| S3_BUCKET | S3 bucket name (e.g., k8s-terraform-state) |
| TAILSCALE_AUTH_KEY | Tailscale auth key for node bootstrap |
| TAILSCALE_TAILNET | Tailnet domain (e.g., yourtailnet.ts.net) |
| TAILSCALE_OAUTH_CLIENT_ID | Tailscale OAuth client ID for the Kubernetes Operator |
| TAILSCALE_OAUTH_CLIENT_SECRET | Tailscale OAuth client secret for the Kubernetes Operator |
| GRAFANA_ADMIN_PASSWORD | Optional admin password for Grafana (auto-generated if unset) |
| RUNNER_ALLOWED_CIDRS | Optional CIDR list for CI runner access, if you choose to pass it via tfvars/secrets |
| SSH_PUBLIC_KEY | SSH public key content |
| SSH_PRIVATE_KEY | SSH private key content |
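Inside the workflows these secrets are typically surfaced as environment variables. A fragment of what a job might look like (not this repo's exact terraform.yml):

```yaml
# Sketch of a Gitea Actions job consuming the secrets above.
jobs:
  terraform:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Terraform apply
        env:
          TF_VAR_hcloud_token: ${{ secrets.HCLOUD_TOKEN }}
          AWS_ACCESS_KEY_ID: ${{ secrets.S3_ACCESS_KEY }}
          AWS_SECRET_ACCESS_KEY: ${{ secrets.S3_SECRET_KEY }}
        run: terraform apply -auto-approve
```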

## Observability Stack

The Ansible playbook deploys a lightweight observability stack in the observability namespace:

- kube-prometheus-stack (Prometheus + Grafana)
- loki
- promtail

Services are kept internal by default, with optional declarative Tailscale exposure when the Tailscale Kubernetes Operator is healthy.

### Access Grafana and Prometheus

Preferred (when the Tailscale Operator is healthy):

- Grafana: http://grafana (or http://grafana.<your-tailnet>)
- Prometheus: http://prometheus (or http://prometheus.<your-tailnet>)

Fallback: port-forward from a tailnet-connected machine:

```bash
export KUBECONFIG=$(pwd)/outputs/kubeconfig

kubectl -n observability port-forward svc/kube-prometheus-stack-grafana 3000:80
kubectl -n observability port-forward svc/kube-prometheus-stack-prometheus 9090:9090
```

Then open http://localhost:3000 (Grafana) and http://localhost:9090 (Prometheus).

Grafana user: admin
Grafana password: value of the GRAFANA_ADMIN_PASSWORD secret (or the generated value shown in the Ansible output)

### Verify Tailscale Exposure

```bash
export KUBECONFIG=$(pwd)/outputs/kubeconfig

kubectl -n tailscale-system get pods
kubectl -n observability get svc kube-prometheus-stack-grafana kube-prometheus-stack-prometheus
kubectl -n observability describe svc kube-prometheus-stack-grafana | grep TailscaleProxyReady
kubectl -n observability describe svc kube-prometheus-stack-prometheus | grep TailscaleProxyReady
```

If TailscaleProxyReady=False, check the operator logs:

```bash
kubectl -n tailscale-system logs deployment/operator --tail=100
```

A common cause is an OAuth client missing the required tag/scope permissions.

## File Structure

```
.
├── terraform/
│   ├── main.tf
│   ├── variables.tf
│   ├── network.tf
│   ├── firewall.tf
│   ├── ssh.tf
│   ├── servers.tf
│   ├── outputs.tf
│   └── backend.tf
├── ansible/
│   ├── inventory.tmpl
│   ├── generate_inventory.py
│   ├── site.yml
│   ├── roles/
│   │   ├── common/
│   │   ├── k3s-server/
│   │   ├── k3s-agent/
│   │   ├── ccm/
│   │   ├── csi/
│   │   ├── tailscale-operator/
│   │   └── observability/
│   └── ansible.cfg
├── .gitea/
│   └── workflows/
│       ├── terraform.yml
│       └── ansible.yml
├── outputs/
├── terraform.tfvars.example
└── README.md
```

## Firewall Rules

| Port | Source | Purpose |
|---|---|---|
| 22 | Tailnet CIDR | SSH |
| 6443 | Tailnet CIDR + internal | Kubernetes API |
| 41641/udp | Any | Tailscale WireGuard |
| 9345 | 10.0.0.0/16 | k3s supervisor (HA join) |
| 2379 | 10.0.0.0/16 | etcd client |
| 2380 | 10.0.0.0/16 | etcd peer |
| 8472 | 10.0.0.0/16 | Flannel VXLAN |
| 10250 | 10.0.0.0/16 | Kubelet |
| 30000-32767 | - | Optional NodePorts (disabled by default) |
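In firewall.tf, each row above maps to a rule block on an `hcloud_firewall` resource. A representative sketch of two of the rows (resource and variable names are illustrative, not necessarily this repo's):

```hcl
resource "hcloud_firewall" "cluster" {
  name = "k8s-cluster"

  # SSH reachable only from the tailnet
  rule {
    direction  = "in"
    protocol   = "tcp"
    port       = "22"
    source_ips = [var.tailnet_cidr]
  }

  # etcd peer traffic stays inside the private network
  rule {
    direction  = "in"
    protocol   = "tcp"
    port       = "2380"
    source_ips = ["10.0.0.0/16"]
  }
}
```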

## Operations

### Scale Workers

Edit terraform.tfvars:

```hcl
worker_count = 5
```

Then:

```bash
terraform apply
ansible-playbook site.yml
```

### Upgrade k3s

```bash
ansible-playbook site.yml -t upgrade
```

### Destroy Cluster

```bash
terraform destroy
```

## Troubleshooting

### Check k3s Logs

```bash
ssh root@<control-plane-ip> journalctl -u k3s -f
```

### Reset k3s

```bash
ansible-playbook site.yml -t reset
```

## Cost Breakdown

| Resource | Quantity | Unit Price | Monthly |
|---|---|---|---|
| CX23 (control plane) | 3 | €2.99 | €8.97 |
| CX33 (workers) | 4 | €4.99 | €19.96 |
| Backblaze B2 | ~1 GB | Free (first 10 GB) | €0.00 |
| Total | | | €28.93/mo |

## Security Notes

- The control plane is HA (3 nodes; it can survive one node failure)
- Consider adding a Hetzner load balancer in front of the API server
- Rotate API tokens regularly
- Use Kubernetes network policies
- Enable audit logging for production

## License

MIT