# Proxmox Kubernetes Cluster

Production-ready private Kubernetes cluster on Proxmox using Terraform, Ansible, and Flux.

## Architecture

| Component | Details |
|-----------|---------|
| **Control Plane** | 3x Proxmox VMs (2 vCPU / 4 GiB / 32 GiB) |
| **Workers** | 5x Proxmox VMs (4 vCPU / 8 GiB / 64 GiB) |
| **K8s** | k3s (latest, HA) |
| **Addons** | NFS provisioner + Prometheus + Grafana + Loki + Rancher |
| **Access** | SSH/API and private services restricted to Tailnet |
| **Bootstrap** | Terraform + Ansible + Flux |

## Prerequisites

### 1. Proxmox API Token

Create an API token for the Proxmox VE user used by Terraform. The repo expects the `bpg/proxmox` provider with:

- endpoint: `https://100.105.0.115:8006/`
- node: `flex`
- clone source: template `9000` (`ubuntu-2404-k8s-template`)
- auth: API token

### 2. Backblaze B2 Bucket (for Terraform State)

1. Go to [Backblaze B2](https://secure.backblaze.com/b2_buckets.htm)
2. Click **Create a Bucket**
3. Set bucket name: `k8s-terraform-state` (must be globally unique)
4. Choose **Private** access
5. Click **Create Bucket**
6. Create an application key:
   - Go to **App Keys** → **Add a New Application Key**
   - Name: `terraform-state`
   - Allow access to: `k8s-terraform-state` bucket only
   - Type: **Read and Write**
   - Copy the **keyID** (access key) and **applicationKey** (secret key)
7. Note your bucket's S3 endpoint (e.g., `https://s3.eu-central-003.backblazeb2.com`)

### 3. SSH Key Pair

```bash
ssh-keygen -t ed25519 -C "k8s@proxmox" -f ~/.ssh/infra
```

### 4. Local Tools

- [Terraform](https://terraform.io/downloads) >= 1.0
- [Ansible](https://docs.ansible.com/ansible/latest/installation_guide/intro_installation.html) >= 2.9
- Python 3 with `jinja2` and `pyyaml`

## Setup

### 1. Clone Repository

```bash
git clone /HetznerTerra.git
cd HetznerTerra
```

### 2. Configure Variables

```bash
cp terraform.tfvars.example terraform.tfvars
```

Edit `terraform.tfvars`:

```hcl
proxmox_endpoint         = "https://100.105.0.115:8006/"
proxmox_api_token_id     = "terraform-prov@pve!k8s-cluster"
proxmox_api_token_secret = "your-proxmox-token-secret"
ssh_public_key           = "~/.ssh/infra.pub"
ssh_private_key          = "~/.ssh/infra"
s3_access_key            = "your-backblaze-key-id"
s3_secret_key            = "your-backblaze-application-key"
s3_endpoint              = "https://s3.eu-central-003.backblazeb2.com"
s3_bucket                = "k8s-terraform-state"
tailscale_auth_key       = "tskey-auth-..."
tailscale_tailnet        = "yourtailnet.ts.net"
kube_api_vip             = "10.27.27.40"
```

### 3. Initialize Terraform

```bash
cd terraform

# Create backend config file (or use CLI args)
cat > backend.hcl << EOF
endpoint                   = "https://s3.eu-central-003.backblazeb2.com"
bucket                     = "k8s-terraform-state"
access_key                 = "your-backblaze-key-id"
secret_key                 = "your-backblaze-application-key"
skip_requesting_account_id = true
EOF

terraform init -backend-config=backend.hcl
```

### 4. Plan and Apply

```bash
terraform plan -var-file=../terraform.tfvars
terraform apply -var-file=../terraform.tfvars
```

### 5. Generate Ansible Inventory

```bash
cd ../ansible
python3 generate_inventory.py
```

### 6. Bootstrap Cluster

```bash
ansible-playbook site.yml
```

### 7. Get Kubeconfig

```bash
export KUBECONFIG=$(pwd)/outputs/kubeconfig
kubectl get nodes
```

Use `scripts/refresh-kubeconfig.sh` to refresh the kubeconfig against the primary control-plane public IP after rebuilds.
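For reference, `generate_inventory.py` renders `inventory.tmpl` into an Ansible inventory from the Terraform outputs. The sketch below is hypothetical — the actual group names, host names, and IPs come from `inventory.tmpl` and your Terraform state; only `ansible_user=ubuntu` and the `~/.ssh/infra` key are taken from this README:

```ini
; Hypothetical generated inventory — real groups/hosts come from inventory.tmpl
[k3s_server]
cp-1 ansible_host=10.27.27.11
cp-2 ansible_host=10.27.27.12
cp-3 ansible_host=10.27.27.13

[k3s_agent]
worker-1 ansible_host=10.27.27.21
worker-2 ansible_host=10.27.27.22

[all:vars]
ansible_user=ubuntu
ansible_ssh_private_key_file=~/.ssh/infra
```

If the generated inventory looks wrong, regenerate it after `terraform apply` rather than editing it by hand — it is derived state.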
## Gitea CI/CD

This repository includes Gitea workflows for:

- **deploy**: End-to-end Terraform + Ansible + Flux bootstrap + restore + health checks
- **destroy**: Cluster teardown with backup-aware cleanup
- **dashboards**: Fast workflow that updates Grafana datasources/dashboards only

### Required Gitea Secrets

Set these in your Gitea repository settings (**Settings** → **Secrets** → **Actions**):

| Secret | Description |
|--------|-------------|
| `PROXMOX_ENDPOINT` | Proxmox API endpoint (for example `https://100.105.0.115:8006/`) |
| `PROXMOX_API_TOKEN_ID` | Proxmox API token ID |
| `PROXMOX_API_TOKEN_SECRET` | Proxmox API token secret |
| `S3_ACCESS_KEY` | Backblaze B2 keyID |
| `S3_SECRET_KEY` | Backblaze B2 applicationKey |
| `S3_ENDPOINT` | Backblaze S3 endpoint (e.g., `https://s3.eu-central-003.backblazeb2.com`) |
| `S3_BUCKET` | S3 bucket name (e.g., `k8s-terraform-state`) |
| `TAILSCALE_AUTH_KEY` | Tailscale auth key for node bootstrap |
| `TAILSCALE_TAILNET` | Tailnet domain (e.g., `yourtailnet.ts.net`) |
| `TAILSCALE_OAUTH_CLIENT_ID` | Tailscale OAuth client ID for the Kubernetes Operator |
| `TAILSCALE_OAUTH_CLIENT_SECRET` | Tailscale OAuth client secret for the Kubernetes Operator |
| `DOPPLER_HETZNERTERRA_SERVICE_TOKEN` | Doppler service token for `hetznerterra` runtime secrets |
| `GRAFANA_ADMIN_PASSWORD` | Optional admin password for Grafana (auto-generated if unset) |
| `SSH_PUBLIC_KEY` | SSH public key content |
| `SSH_PRIVATE_KEY` | SSH private key content |

## GitOps (Flux)

This repo uses Flux for continuous reconciliation after the Terraform + Ansible bootstrap.

### Stable private-only baseline

The current default target is the HA private baseline:

- `3` control plane nodes
- `5` worker nodes
- private Proxmox network only
- Tailscale for operator and service access
- Flux-managed platform addons with `apps` suspended by default

Detailed phase gates and success criteria live in `STABLE_BASELINE.md`.
This is the default until rebuilds are consistently green. High availability, public ingress, and app-layer expansion come later.

### Runtime secrets

Runtime cluster secrets are moving to Doppler + External Secrets Operator.

- Doppler project: `hetznerterra`
- Initial auth: service token via `DOPPLER_HETZNERTERRA_SERVICE_TOKEN`
- First synced secrets:
  - `GRAFANA_ADMIN_PASSWORD`

Terraform/bootstrap secrets remain in Gitea Actions secrets and are not managed by Doppler.

### Repository layout

- `clusters/prod/`: cluster entrypoint and Flux reconciliation objects
- `clusters/prod/flux-system/`: `GitRepository` source and top-level `Kustomization` graph
- `infrastructure/`: infrastructure addon reconciliation graph
- `infrastructure/addons/*`: per-addon manifests for Flux-managed cluster addons
- `apps/`: application workload layer (currently scaffolded)

### Reconciliation graph

- `infrastructure` (top-level)
  - `addon-nfs-storage`
  - `addon-tailscale-operator`
  - `addon-observability`
  - `addon-observability-content` depends on `addon-observability`
- `apps` depends on `infrastructure`

### Bootstrap notes

1. Install the Flux controllers in `flux-system`.
2. Create the Flux deploy key/secret named `flux-system` in the `flux-system` namespace.
3. Apply `clusters/prod/flux-system/` once to establish the source + reconciliation graph.
4. Bootstrap-only Ansible creates prerequisite secrets; Flux manages the addon lifecycle after bootstrap.

### Current addon status

- Core infrastructure addons are Flux-managed from `infrastructure/addons/`.
- Active Flux addons for the current baseline: `addon-nfs-storage`, `addon-cert-manager`, `addon-external-secrets`, `addon-tailscale-operator`, `addon-tailscale-proxyclass`, `addon-observability`, `addon-observability-content`, `addon-rancher`, `addon-rancher-config`, `addon-rancher-backup`, `addon-rancher-backup-config`.
- `apps` remains suspended until workload rollout is explicitly enabled.
- Ansible is limited to cluster bootstrap, prerequisite secret creation, pre-proxy Tailscale cleanup, and kubeconfig finalization.
- Weave GitOps / Flux UI is no longer deployed; use Rancher or the `flux` CLI for Flux operations.

### Rancher access

- Rancher is private-only and exposed through Tailscale at `https://rancher.silverside-gopher.ts.net/`.
- Rancher and the Kubernetes API stay private; kube-vip provides the API VIP on the LAN.
- Rancher stores state in embedded etcd; no external database is used.

### Stable baseline acceptance

A rebuild is considered successful only when all of the following pass without manual intervention:

- Terraform create succeeds for the default `3` control planes and `5` workers.
- Ansible bootstrap succeeds end-to-end.
- All nodes become `Ready`.
- Flux core reconciliation is healthy.
- External Secrets Operator is ready.
- The Tailscale operator is ready.
- Tailnet smoke checks pass for Rancher, Grafana, and Prometheus.
- Terraform destroy succeeds cleanly, or succeeds after workflow retries.

## Observability Stack

Flux deploys a lightweight observability stack in the `observability` namespace:

- `kube-prometheus-stack` (Prometheus + Grafana)
- `loki`
- `promtail`

Grafana content is managed as code via ConfigMaps in `infrastructure/addons/observability-content/`.

Grafana and Prometheus are exposed through dedicated Tailscale LoadBalancer services when the Tailscale Kubernetes Operator is healthy.
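The Tailscale Kubernetes Operator exposes a Service on the tailnet when the Service uses `loadBalancerClass: tailscale`. The following is a hypothetical sketch of what the `grafana-tailscale` Service could look like — the selector, ports, and hostname annotation are assumptions for illustration, not copied from this repo's manifests:

```yaml
# Hypothetical sketch of a Tailscale-exposed Grafana Service.
# tailscale.com/hostname sets the tailnet DNS name
# (here: grafana -> grafana.silverside-gopher.ts.net).
apiVersion: v1
kind: Service
metadata:
  name: grafana-tailscale
  namespace: observability
  annotations:
    tailscale.com/hostname: grafana
spec:
  type: LoadBalancer
  loadBalancerClass: tailscale        # picked up by the Tailscale Kubernetes Operator
  selector:
    app.kubernetes.io/name: grafana   # assumed kube-prometheus-stack Grafana labels
  ports:
    - name: http
      port: 80
      targetPort: 3000
```

The operator runs a proxy pod in `tailscale-system` for each such Service; the `TailscaleProxyReady` condition checked in the verification steps below reflects that proxy's status.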
### Access Grafana and Prometheus

Preferred private access:

- Grafana: `http://grafana.silverside-gopher.ts.net/`
- Prometheus: `http://prometheus.silverside-gopher.ts.net:9090/`

Fallback: port-forward from a tailnet-connected machine:

```bash
export KUBECONFIG=$(pwd)/outputs/kubeconfig
kubectl -n observability port-forward svc/kube-prometheus-stack-grafana 3000:80
kubectl -n observability port-forward svc/kube-prometheus-stack-prometheus 9090:9090
```

Then open:

- Grafana: http://127.0.0.1:3000
- Prometheus: http://127.0.0.1:9090

Grafana user: `admin`
Grafana password: value of the `GRAFANA_ADMIN_PASSWORD` secret (or the generated value shown in the Ansible output)

### Verify Tailscale exposure

```bash
export KUBECONFIG=$(pwd)/outputs/kubeconfig
kubectl -n tailscale-system get pods
kubectl -n cattle-system get svc rancher-tailscale
kubectl -n observability get svc grafana-tailscale prometheus-tailscale
kubectl -n cattle-system describe svc rancher-tailscale | grep TailscaleProxyReady
kubectl -n observability describe svc grafana-tailscale | grep TailscaleProxyReady
kubectl -n observability describe svc prometheus-tailscale | grep TailscaleProxyReady
```

If `TailscaleProxyReady=False`, check the operator logs:

```bash
kubectl -n tailscale-system logs deployment/operator --tail=100
```

A common cause is an OAuth client missing tag/scope permissions.

### Fast dashboard iteration workflow

Use the `Deploy Grafana Content` workflow when changing dashboard/datasource templates. It avoids full cluster provisioning and only applies the Grafana content resources:

- `ansible/roles/observability-content/templates/grafana-datasources.yaml.j2`
- `ansible/roles/observability-content/templates/grafana-dashboard-k8s-overview.yaml.j2`
- `ansible/dashboards.yml`

## File Structure

```
.
├── terraform/
│   ├── main.tf
│   ├── variables.tf
│   ├── servers.tf
│   ├── outputs.tf
│   └── backend.tf
├── ansible/
│   ├── inventory.tmpl
│   ├── generate_inventory.py
│   ├── site.yml
│   ├── roles/
│   │   ├── common/
│   │   ├── k3s-server/
│   │   ├── k3s-agent/
│   │   ├── addon-secrets-bootstrap/
│   │   ├── observability-content/
│   │   └── observability/
│   └── ansible.cfg
├── .gitea/
│   └── workflows/
│       ├── terraform.yml
│       ├── ansible.yml
│       └── dashboards.yml
├── outputs/
├── terraform.tfvars.example
└── README.md
```

## Firewall Rules

This repo no longer manages cloud firewalls. Access control is expected to be handled on your LAN infrastructure and through Tailscale.

Important cluster-local ports still in use:

| Port | Source | Purpose |
|------|--------|---------|
| 22 | Admin hosts / CI | SSH |
| 6443 | 10.27.27.0/24 + VIP | Kubernetes API |
| 9345 | 10.27.27.0/24 | k3s Supervisor |
| 2379 | 10.27.27.0/24 | etcd Client |
| 2380 | 10.27.27.0/24 | etcd Peer |
| 8472/udp | 10.27.27.0/24 | Flannel VXLAN |
| 10250 | 10.27.27.0/24 | Kubelet |

## Operations

### Scale Workers

Edit `terraform.tfvars`:

```hcl
worker_count = 5
```

Then:

```bash
terraform apply
ansible-playbook site.yml
```

### Upgrade k3s

```bash
ansible-playbook site.yml -t upgrade
```

### Destroy Cluster

```bash
terraform destroy
```

## Troubleshooting

### Check k3s Logs

```bash
ssh ubuntu@<node-ip>
sudo journalctl -u k3s -f
```

### Reset k3s

```bash
ansible-playbook site.yml -t reset
```

## Security Notes

- The control plane is HA (3 nodes; can survive 1 failure)
- Kubernetes API HA is provided by kube-vip on `10.27.27.40`
- Rotate API tokens regularly
- Use network policies in Kubernetes
- Enable audit logging for production

## License

MIT