# Hetzner Kubernetes Cluster

Production-ready Kubernetes cluster on Hetzner Cloud using Terraform and Ansible.

## Architecture

| Component | Details |
|-----------|---------|
| **Control Plane** | 3x CX23 (HA) |
| **Workers** | 4x CX33 |
| **Total Cost** | €28.93/mo |
| **K8s** | k3s (latest, HA) |
| **Addons** | Hetzner CCM + CSI + Prometheus + Grafana + Loki |
| **Access** | SSH/API restricted to Tailnet |
| **Bootstrap** | Terraform + Ansible |

### Cluster Resources

- 22 vCPU total (6 CP + 16 workers)
- 44 GB RAM total (12 CP + 32 workers)
- 440 GB SSD storage
- 140 TB bandwidth allocation

## Prerequisites

### 1. Hetzner Cloud API Token

1. Go to the [Hetzner Cloud Console](https://console.hetzner.com/)
2. Select your project (or create a new one)
3. Navigate to **Security** → **API Tokens**
4. Click **Generate API Token**
5. Set description: `k8s-cluster-terraform`
6. Select permissions: **Read & Write**
7. Click **Generate API Token**
8. **Copy the token immediately** - it won't be shown again!

### 2. Backblaze B2 Bucket (for Terraform State)

1. Go to [Backblaze B2](https://secure.backblaze.com/b2_buckets.htm)
2. Click **Create a Bucket**
3. Set bucket name: `k8s-terraform-state` (must be globally unique)
4. Choose **Private** access
5. Click **Create Bucket**
6. Create an application key:
   - Go to **App Keys** → **Add a New Application Key**
   - Name: `terraform-state`
   - Allow access to: `k8s-terraform-state` bucket only
   - Type: **Read and Write**
   - Copy the **keyID** (access key) and **applicationKey** (secret key)
7. Note your bucket's S3 endpoint (e.g., `https://s3.eu-central-003.backblazeb2.com`)

### 3. SSH Key Pair

```bash
ssh-keygen -t ed25519 -C "k8s@hetzner" -f ~/.ssh/hetzner_k8s
```

### 4. Local Tools

- [Terraform](https://terraform.io/downloads) >= 1.0
- [Ansible](https://docs.ansible.com/ansible/latest/installation_guide/intro_installation.html) >= 2.9
- Python 3 with `jinja2` and `pyyaml`

## Setup

### 1. Clone Repository

```bash
git clone /HetznerTerra.git
cd HetznerTerra
```

### 2. Configure Variables

```bash
cp terraform.tfvars.example terraform.tfvars
```

Edit `terraform.tfvars`:

```hcl
hcloud_token                = "your-hetzner-api-token"
ssh_public_key              = "~/.ssh/hetzner_k8s.pub"
ssh_private_key             = "~/.ssh/hetzner_k8s"
s3_access_key               = "your-backblaze-key-id"
s3_secret_key               = "your-backblaze-application-key"
s3_endpoint                 = "https://s3.eu-central-003.backblazeb2.com"
s3_bucket                   = "k8s-terraform-state"
tailscale_auth_key          = "tskey-auth-..."
tailscale_tailnet           = "yourtailnet.ts.net"
restrict_api_ssh_to_tailnet = true
tailnet_cidr                = "100.64.0.0/10"
enable_nodeport_public      = false
allowed_ssh_ips             = []
allowed_api_ips             = []
```

### 3. Initialize Terraform

```bash
cd terraform

# Create backend config file (or use CLI args)
cat > backend.hcl << EOF
endpoint                   = "https://s3.eu-central-003.backblazeb2.com"
bucket                     = "k8s-terraform-state"
access_key                 = "your-backblaze-key-id"
secret_key                 = "your-backblaze-application-key"
skip_requesting_account_id = true
EOF

terraform init -backend-config=backend.hcl
```

### 4. Plan and Apply

```bash
terraform plan -var-file=../terraform.tfvars
terraform apply -var-file=../terraform.tfvars
```

### 5. Generate Ansible Inventory

```bash
cd ../ansible
python3 generate_inventory.py
```

### 6. Bootstrap Cluster

```bash
ansible-playbook site.yml
```

### 7. Get Kubeconfig

```bash
export KUBECONFIG=$(pwd)/outputs/kubeconfig
kubectl get nodes
```

The kubeconfig endpoint is rewritten to the primary control-plane tailnet hostname (`k8s-cluster-cp-1.`).
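Step 5's `generate_inventory.py` maps Terraform outputs to an Ansible inventory. As a rough sketch of that idea, assuming hypothetical output names `control_plane_ips` and `worker_ips` and an INI-style inventory (the real script's output names and layout may differ):

```python
"""Sketch of inventory generation from Terraform outputs.

Assumptions: the Terraform outputs are named control_plane_ips and
worker_ips, and the playbook groups are control_plane / workers;
the repo's actual generate_inventory.py may differ.
"""

def build_inventory(control_plane_ips, worker_ips):
    """Render an INI-style Ansible inventory from two lists of node IPs."""
    lines = ["[control_plane]"]
    lines += [f"cp-{i} ansible_host={ip}" for i, ip in enumerate(control_plane_ips, 1)]
    lines += ["", "[workers]"]
    lines += [f"worker-{i} ansible_host={ip}" for i, ip in enumerate(worker_ips, 1)]
    lines += ["", "[k3s_cluster:children]", "control_plane", "workers"]
    return "\n".join(lines) + "\n"

# Demo with hard-coded private IPs; the real script would read
# `terraform output -json` instead of literals.
print(build_inventory(["10.0.1.1", "10.0.1.2", "10.0.1.3"],
                      ["10.0.2.1", "10.0.2.2", "10.0.2.3", "10.0.2.4"]))
```

The `[k3s_cluster:children]` group lets `site.yml` target all nodes at once while the server/agent roles still target their own groups.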
## Gitea CI/CD

This repository includes Gitea workflows for:

- **terraform-plan**: runs on PRs, shows planned changes
- **terraform-apply**: runs on the main branch after merge
- **ansible-deploy**: runs after terraform apply

### Required Gitea Secrets

Set these in your Gitea repository settings (**Settings** → **Secrets** → **Actions**):

| Secret | Description |
|--------|-------------|
| `HCLOUD_TOKEN` | Hetzner Cloud API token |
| `S3_ACCESS_KEY` | Backblaze B2 keyID |
| `S3_SECRET_KEY` | Backblaze B2 applicationKey |
| `S3_ENDPOINT` | Backblaze S3 endpoint (e.g., `https://s3.eu-central-003.backblazeb2.com`) |
| `S3_BUCKET` | S3 bucket name (e.g., `k8s-terraform-state`) |
| `TAILSCALE_AUTH_KEY` | Tailscale auth key for node bootstrap |
| `TAILSCALE_TAILNET` | Tailnet domain (e.g., `yourtailnet.ts.net`) |
| `TAILSCALE_OAUTH_CLIENT_ID` | Tailscale OAuth client ID for the Kubernetes Operator |
| `TAILSCALE_OAUTH_CLIENT_SECRET` | Tailscale OAuth client secret for the Kubernetes Operator |
| `GRAFANA_ADMIN_PASSWORD` | Optional admin password for Grafana (auto-generated if unset) |
| `RUNNER_ALLOWED_CIDRS` | Optional CIDR list for CI runner access, if you choose to pass it via tfvars/secrets |
| `SSH_PUBLIC_KEY` | SSH public key content |
| `SSH_PRIVATE_KEY` | SSH private key content |

## Observability Stack

The Ansible playbook deploys a lightweight observability stack in the `observability` namespace:

- `kube-prometheus-stack` (Prometheus + Grafana)
- `loki`
- `promtail`

Services are kept internal by default, with optional declarative Tailscale exposure when the Tailscale Kubernetes Operator is healthy.
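The workflows above fail in confusing ways when a secret from the table is absent. A small pre-flight check (illustrative only, not part of this repo; it assumes Gitea Actions passes secrets to the step as environment variables, and treats the Tailscale OAuth pair and the other optional entries as non-mandatory) can surface the problem up front:

```python
"""Pre-flight check: report required CI secrets that are unset or empty.

Illustrative sketch, not part of the repo's workflows. Which secrets
are truly mandatory is an assumption based on the table above.
"""
import os

# Mandatory secrets; GRAFANA_ADMIN_PASSWORD, RUNNER_ALLOWED_CIDRS, and
# the Tailscale OAuth pair are optional and deliberately left out.
REQUIRED = [
    "HCLOUD_TOKEN", "S3_ACCESS_KEY", "S3_SECRET_KEY", "S3_ENDPOINT",
    "S3_BUCKET", "TAILSCALE_AUTH_KEY", "TAILSCALE_TAILNET",
    "SSH_PUBLIC_KEY", "SSH_PRIVATE_KEY",
]

def missing_secrets(env=None):
    """Return the names of required secrets that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED if not env.get(name)]

# Demo against a stub environment; a workflow step would call
# missing_secrets() with no argument and exit non-zero if non-empty.
demo_env = {name: "dummy" for name in REQUIRED}
del demo_env["HCLOUD_TOKEN"]
print(missing_secrets(demo_env))  # -> ['HCLOUD_TOKEN']
```

Running this as an early workflow step turns a mid-apply authentication failure into a one-line diagnosis.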
### Access Grafana and Prometheus

Preferred (when the Tailscale Operator is healthy):

- Grafana: `http://grafana` (or `http://grafana.`)
- Prometheus: `http://prometheus` (or `http://prometheus.`)

Fallback: port-forward from a tailnet-connected machine:

```bash
export KUBECONFIG=$(pwd)/outputs/kubeconfig
kubectl -n observability port-forward svc/kube-prometheus-stack-grafana 3000:80
kubectl -n observability port-forward svc/kube-prometheus-stack-prometheus 9090:9090
```

Then open:

- Grafana: http://127.0.0.1:3000
- Prometheus: http://127.0.0.1:9090

Grafana user: `admin`
Grafana password: value of the `GRAFANA_ADMIN_PASSWORD` secret (or the generated value shown in the Ansible output)

### Verify Tailscale Exposure

```bash
export KUBECONFIG=$(pwd)/outputs/kubeconfig
kubectl -n tailscale-system get pods
kubectl -n observability get svc kube-prometheus-stack-grafana kube-prometheus-stack-prometheus
kubectl -n observability describe svc kube-prometheus-stack-grafana | grep TailscaleProxyReady
kubectl -n observability describe svc kube-prometheus-stack-prometheus | grep TailscaleProxyReady
```

If `TailscaleProxyReady=False`, check:

```bash
kubectl -n tailscale-system logs deployment/operator --tail=100
```

A common cause is an OAuth client missing tag/scope permissions.

## File Structure

```
.
├── terraform/
│   ├── main.tf
│   ├── variables.tf
│   ├── network.tf
│   ├── firewall.tf
│   ├── ssh.tf
│   ├── servers.tf
│   ├── outputs.tf
│   └── backend.tf
├── ansible/
│   ├── inventory.tmpl
│   ├── generate_inventory.py
│   ├── site.yml
│   ├── roles/
│   │   ├── common/
│   │   ├── k3s-server/
│   │   ├── k3s-agent/
│   │   ├── ccm/
│   │   ├── csi/
│   │   ├── tailscale-operator/
│   │   └── observability/
│   └── ansible.cfg
├── .gitea/
│   └── workflows/
│       ├── terraform.yml
│       └── ansible.yml
├── outputs/
├── terraform.tfvars.example
└── README.md
```

## Firewall Rules

| Port | Source | Purpose |
|------|--------|---------|
| 22 | Tailnet CIDR | SSH |
| 6443 | Tailnet CIDR + internal | Kubernetes API |
| 41641/udp | Any | Tailscale WireGuard |
| 9345 | 10.0.0.0/16 | k3s supervisor (HA join) |
| 2379 | 10.0.0.0/16 | etcd client |
| 2380 | 10.0.0.0/16 | etcd peer |
| 8472 | 10.0.0.0/16 | Flannel VXLAN |
| 10250 | 10.0.0.0/16 | Kubelet |
| 30000-32767 | Optional | NodePorts (disabled by default) |

## Operations

### Scale Workers

Edit `terraform.tfvars`:

```hcl
worker_count = 5
```

Then:

```bash
terraform apply
ansible-playbook site.yml
```

### Upgrade k3s

```bash
ansible-playbook site.yml -t upgrade
```

### Destroy Cluster

```bash
terraform destroy
```

## Troubleshooting

### Check k3s Logs

```bash
ssh root@
journalctl -u k3s -f
```

### Reset k3s

```bash
ansible-playbook site.yml -t reset
```

## Cost Breakdown

| Resource | Quantity | Unit Price | Monthly |
|----------|----------|------------|---------|
| CX23 (control plane) | 3 | €2.99 | €8.97 |
| CX33 (workers) | 4 | €4.99 | €19.96 |
| Backblaze B2 | ~1 GB | Free (first 10 GB) | €0.00 |
| **Total** | | | **€28.93/mo** |

## Security Notes

- The control plane is HA (3 nodes; it can survive the loss of 1)
- Consider adding a Hetzner load balancer in front of the API server
- Rotate API tokens regularly
- Use Kubernetes network policies
- Enable audit logging for production

## License

MIT