# Proxmox Kubernetes Cluster
Private HA K3s cluster on Proxmox, provisioned by Terraform, bootstrapped by Ansible, and reconciled by Flux.
## Architecture
| Component | Current Baseline |
|---|---|
| Control plane | 3 Proxmox VMs, VMIDs 200-202, IPs 10.27.27.30-32, 2 vCPU / 4 GiB / 32 GiB |
| Workers | 5 Proxmox VMs, VMIDs 210-214, IPs 10.27.27.41-45, 4 vCPU / 8 GiB / 64 GiB |
| Kubernetes | K3s v1.34.6+k3s1, HA embedded etcd, kube-vip API VIP 10.27.27.40 |
| Proxmox | Node flex, template VMID 9000, datastore Flash, bridge vmbr0 |
| Storage | Raw-manifest nfs-subdir-external-provisioner, 10.27.27.239:/TheFlash/k8s-nfs, default StorageClass flash-nfs |
| GitOps | Flux source platform on branch main; apps Kustomization is intentionally suspended |
| Private access | Tailscale operator exposes Rancher, Grafana, and Prometheus; no public ingress baseline |
| Runtime secrets | Doppler service token bootstraps External Secrets Operator |
K3s is pinned because Rancher chart 2.13.3 requires Kubernetes `<1.35.0-0`.
## Prerequisites

- Terraform >= 1.0.
- Ansible with the Python `jinja2` and `pyyaml` packages.
- `kubectl` for local verification.
- A Proxmox API token for the `bpg/proxmox` provider.
- An S3-compatible bucket for Terraform state, currently Backblaze B2.
- An SSH key pair available to Terraform and Ansible, defaulting to `~/.ssh/infra` and `~/.ssh/infra.pub` (a generation sketch follows this list).
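If the default key pair does not exist yet, it can be created to match the paths above. The key type is a choice here, not a repo requirement:

```bash
# Generate the key pair Terraform and Ansible expect by default.
# ed25519 is an assumption; any type Proxmox cloud-init accepts works.
ssh-keygen -t ed25519 -f ~/.ssh/infra -C "k8s-cluster-infra"
```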
Expected Proxmox inputs:
| Setting | Value |
|---|---|
| Endpoint | https://100.105.0.115:8006/ |
| Node | flex |
| Clone source | Template VMID 9000 (ubuntu-2404-k8s-template) |
| Storage | Flash |
## Local Setup

Create local variables from the example:

```bash
cp terraform.tfvars.example terraform.tfvars
```
Important defaults in `terraform.tfvars.example`:

```hcl
proxmox_endpoint         = "https://100.105.0.115:8006/"
proxmox_api_token_id     = "terraform-prov@pve!k8s-cluster"
proxmox_api_token_secret = "your-proxmox-api-token-secret"
ssh_public_key           = "~/.ssh/infra.pub"
ssh_private_key          = "~/.ssh/infra"
s3_access_key            = "your-backblaze-key-id"
s3_secret_key            = "your-backblaze-application-key"
s3_endpoint              = "https://s3.eu-central-003.backblazeb2.com"
s3_bucket                = "k8s-terraform-state"
tailscale_tailnet        = "yourtailnet.ts.net"
kube_api_vip             = "10.27.27.40"
```
Initialize Terraform with backend credentials:

```bash
terraform -chdir=terraform init \
  -backend-config="endpoint=<s3-endpoint>" \
  -backend-config="bucket=<s3-bucket>" \
  -backend-config="region=auto" \
  -backend-config="access_key=<s3-access-key>" \
  -backend-config="secret_key=<s3-secret-key>" \
  -backend-config="skip_requesting_account_id=true"
```
## Common Commands

Terraform:

```bash
terraform -chdir=terraform fmt -recursive
terraform -chdir=terraform validate
terraform -chdir=terraform plan -var-file=../terraform.tfvars
terraform -chdir=terraform apply -var-file=../terraform.tfvars
```
Ansible setup:

```bash
ansible-galaxy collection install -r ansible/requirements.yml
cd ansible
python3 generate_inventory.py
ansible-playbook site.yml --syntax-check
```
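A connectivity check after generating the inventory can save a failed full run; this assumes the generated inventory is wired up through the `ansible.cfg` in `ansible/`:

```bash
# Ping every node over SSH before running the full playbook.
ansible all -m ping
```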
Manual Ansible bootstrap uses the same extra vars as the deploy workflow:

```bash
cd ansible
ansible-playbook site.yml \
  -e "tailscale_auth_key=$TAILSCALE_AUTH_KEY" \
  -e "tailscale_tailnet=$TAILSCALE_TAILNET" \
  -e "tailscale_oauth_client_id=$TAILSCALE_OAUTH_CLIENT_ID" \
  -e "tailscale_oauth_client_secret=$TAILSCALE_OAUTH_CLIENT_SECRET" \
  -e "doppler_hetznerterra_service_token=$DOPPLER_HETZNERTERRA_SERVICE_TOKEN" \
  -e "tailscale_api_key=${TAILSCALE_API_KEY:-}" \
  -e "grafana_admin_password=${GRAFANA_ADMIN_PASSWORD:-}" \
  -e "cluster_name=k8s-cluster"
```
Flux/Kustomize verification:

```bash
kubectl kustomize infrastructure/addons/<addon>
kubectl kustomize infrastructure/addons
kubectl kustomize clusters/prod/flux-system
```
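To catch schema problems before Flux does, the rendered output can be piped through a server-side dry run; this needs a reachable cluster (and installed CRDs for custom resources) but applies nothing:

```bash
# Validate rendered manifests against the live API server without applying.
kubectl kustomize infrastructure/addons | kubectl apply --dry-run=server -f -
```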
Refresh kubeconfig after rebuilds:

```bash
scripts/refresh-kubeconfig.sh 10.27.27.30
export KUBECONFIG=$(pwd)/outputs/kubeconfig
kubectl get nodes
```
Run the tailnet smoke check from cp1:

```bash
ssh ubuntu@10.27.27.30 'bash -s' < scripts/smoke-check-tailnet-services.sh
```
## Gitea CI/CD

The supported full rebuild path is the Gitea deploy workflow.

| Workflow | Trigger | Purpose |
|---|---|---|
| `.gitea/workflows/deploy.yml` | PR to main, push to main, manual dispatch | PRs run Terraform plan; pushes run Terraform apply, Ansible bootstrap, Flux bootstrap, addon gates, health checks, and tailnet smoke checks |
| `.gitea/workflows/destroy.yml` | Manual dispatch with `confirm: destroy` | Terraform destroy with retries; no Rancher backup gate |
| `.gitea/workflows/dashboards.yml` | Grafana content changes or manual dispatch | Fast Grafana datasource/dashboard update through `ansible/dashboards.yml` |
Deploy and destroy share `concurrency.group: prod-cluster`, so they never run at the same time.
Deploy sequence on push to main:

- Terraform fmt/init/validate/plan/apply.
- Cleanup/retry around known transient Proxmox clone and disk-update failures.
- Generate the Ansible inventory from Terraform outputs.
- Prepare critical image archives with `skopeo` on the runner.
- Run `ansible/site.yml` to bootstrap nodes, K3s, kube-vip, prerequisite secrets, and kubeconfig.
- Apply Flux CRDs/controllers and the `clusters/prod/flux-system` graph.
- Gate cert-manager, External Secrets, Tailscale, NFS, Rancher, and observability.
- Run post-deploy health checks and Tailscale service smoke checks.
Required Gitea secrets:

| Secret | Description |
|---|---|
| `PROXMOX_ENDPOINT` | Proxmox API endpoint, for example `https://100.105.0.115:8006/` |
| `PROXMOX_API_TOKEN_ID` | Proxmox API token ID |
| `PROXMOX_API_TOKEN_SECRET` | Proxmox API token secret |
| `S3_ACCESS_KEY` | S3/Backblaze access key for Terraform state |
| `S3_SECRET_KEY` | S3/Backblaze secret key for Terraform state |
| `S3_ENDPOINT` | S3 endpoint, for example `https://s3.eu-central-003.backblazeb2.com` |
| `S3_BUCKET` | Terraform state bucket, for example `k8s-terraform-state` |
| `TAILSCALE_AUTH_KEY` | Tailscale auth key for node bootstrap |
| `TAILSCALE_TAILNET` | Tailnet domain, for example `silverside-gopher.ts.net` |
| `TAILSCALE_OAUTH_CLIENT_ID` | Tailscale OAuth client ID for the Kubernetes operator |
| `TAILSCALE_OAUTH_CLIENT_SECRET` | Tailscale OAuth client secret for the Kubernetes operator |
| `TAILSCALE_API_KEY` | Optional API key used to delete stale offline reserved devices before service proxies exist |
| `DOPPLER_HETZNERTERRA_SERVICE_TOKEN` | Doppler service token for runtime cluster secrets |
| `GRAFANA_ADMIN_PASSWORD` | Optional Grafana admin password |
| `SSH_PUBLIC_KEY` | SSH public key content |
| `SSH_PRIVATE_KEY` | SSH private key content |
## GitOps Graph

Flux entrypoint:

```text
clusters/prod/flux-system/
├── gotk-components.yaml
├── gitrepository-platform.yaml
├── kustomization-infrastructure.yaml
└── kustomization-apps.yaml   # suspend: true
```
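With the `flux` CLI on a tailnet-connected machine (assumed available; the workflows do not require it), the graph and the deliberate apps suspension shown above can be inspected directly:

```bash
# Show reconciliation status; apps should report Suspended by design.
flux -n flux-system get kustomizations

# Resuming apps is an explicit operator action, not part of deploy.
flux -n flux-system resume kustomization apps
```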
Active infrastructure addons from `infrastructure/addons/kustomization.yaml`:

- `addon-nfs-storage`
- `addon-external-secrets`
- `addon-cert-manager`
- `addon-tailscale-operator`
- `addon-tailscale-proxyclass`
- `traefik` HelmRelease manifests applied directly by the top-level infrastructure Kustomization
- `addon-observability`
- `addon-observability-content`
- `addon-rancher`
- `addon-rancher-config`
Chart/source strategy:

- Vendored charts are intentional: `cert-manager`, `traefik`, `kube-prometheus-stack`, `tailscale-operator`, and `rancher` live under `infrastructure/charts/`.
- External Secrets, Loki, and Promtail use Flux `OCIRepository` sources (a runtime check follows this list).
- NFS storage is raw Kubernetes manifests, not a Helm chart.
- Rancher backup/restore is not part of the current live graph.
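The split is visible at runtime; listing the Flux sources separates the OCI-backed components from the Git-sourced vendored charts (`flux` CLI assumed):

```bash
# Expect External Secrets, Loki, and Promtail under OCI sources,
# while vendored charts reconcile from the Git source.
flux -n flux-system get sources oci
flux -n flux-system get sources git
```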
Doppler bootstrap details:

- `ansible/roles/doppler-bootstrap` creates the `external-secrets` namespace and the Doppler token secret only.
- The deploy workflow creates `ClusterSecretStore/doppler-hetznerterra` after ESO CRDs and webhook endpoints exist (a verification sketch follows this list).
- The checked-in `infrastructure/addons/external-secrets/clustersecretstore-doppler-hetznerterra.yaml` is not included by the addon kustomization.
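A quick way to confirm the bootstrap chain completed, using the resource names above:

```bash
# The store is cluster-scoped; STATUS should read Valid once ESO can
# reach Doppler with the bootstrapped token secret.
kubectl get clustersecretstore doppler-hetznerterra
kubectl -n external-secrets get pods,secrets
```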
## Access URLs
| Service | URL |
|---|---|
| Rancher | https://rancher.silverside-gopher.ts.net/ |
| Grafana | http://grafana.silverside-gopher.ts.net/ |
| Prometheus | http://prometheus.silverside-gopher.ts.net:9090/ |
Fallback port-forward from a tailnet-connected machine:

```bash
export KUBECONFIG=$(pwd)/outputs/kubeconfig
kubectl -n observability port-forward svc/kube-prometheus-stack-grafana 3000:80
kubectl -n observability port-forward svc/kube-prometheus-stack-prometheus 9090:9090
```
The Grafana user is `admin`; the password comes from the `GRAFANA_ADMIN_PASSWORD` Doppler secret or the workflow-provided fallback.
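If the effective password is unknown locally, it can be read back from the Grafana admin secret. The secret name below is assumed from the chart's `<release>-grafana` naming convention, matching the service names above:

```bash
# Secret name is an assumption based on kube-prometheus-stack defaults.
kubectl -n observability get secret kube-prometheus-stack-grafana \
  -o jsonpath='{.data.admin-password}' | base64 -d; echo
```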
## Operations

Scale workers by updating the `terraform.tfvars` counts, IP lists, and VMID lists together. If node names or VMIDs change, also update the hard-coded retry cleanup target map in `.gitea/workflows/deploy.yml`.
Upgrade K3s by changing the role defaults in `ansible/roles/k3s-server/defaults/main.yml` and `ansible/roles/k3s-agent/defaults/main.yml`. Check Rancher chart compatibility before moving to a Kubernetes minor outside `<1.35.0-0`.
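Before bumping the pin, it is worth re-checking the vendored chart's constraint; a sketch, assuming the chart keeps it in the standard Helm `kubeVersion` field:

```bash
# Compare the vendored Rancher chart's supported Kubernetes range
# against the version currently pinned in the role defaults.
grep -n 'kubeVersion' infrastructure/charts/rancher/Chart.yaml
grep -n 'k3s1' ansible/roles/k3s-server/defaults/main.yml \
  ansible/roles/k3s-agent/defaults/main.yml
```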
Destroy through the Gitea Destroy workflow with `confirm: destroy`, or locally with:

```bash
terraform -chdir=terraform destroy -var-file=../terraform.tfvars
```
## Troubleshooting

Check K3s from cp1:

```bash
ssh ubuntu@10.27.27.30 'sudo k3s kubectl get nodes -o wide'
ssh ubuntu@10.27.27.30 'sudo journalctl -u k3s -n 120 --no-pager'
```
Check Flux and Rancher:

```bash
kubectl -n flux-system get gitrepositories,kustomizations,helmreleases,ocirepositories
kubectl -n flux-system describe helmrelease rancher
kubectl -n cattle-system get pods,deploy -o wide
```
Check Tailscale services:

```bash
kubectl -n tailscale-system get pods
kubectl -n cattle-system get svc rancher-tailscale
kubectl -n observability get svc grafana-tailscale prometheus-tailscale
kubectl -n cattle-system describe svc rancher-tailscale | grep TailscaleProxyReady
kubectl -n observability describe svc grafana-tailscale | grep TailscaleProxyReady
kubectl -n observability describe svc prometheus-tailscale | grep TailscaleProxyReady
```
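If a proxy never reports ready, the operator's logs are the next stop. The deployment name below is the upstream operator chart default, not confirmed for the vendored copy:

```bash
# 'deploy/operator' is an assumption from the upstream chart; adjust
# if the vendored tailscale-operator chart renames the Deployment.
kubectl -n tailscale-system logs deploy/operator --tail=100
```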
If local `kubectl` falls back to `localhost:8080`, refresh `outputs/kubeconfig` with `scripts/refresh-kubeconfig.sh 10.27.27.30`.
## Security Notes

- Never commit `terraform.tfvars`, kubeconfigs, private keys, `outputs/`, or real secret values (an ignore-rule check follows this list).
- Terraform/bootstrap/CI secrets stay in Gitea Actions secrets.
- Runtime cluster secrets are sourced from Doppler through External Secrets.
- This repo does not manage Proxmox/LAN firewalls or public ingress.
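One way to verify the ignore rules actually cover the sensitive paths (assumes they are listed in `.gitignore`):

```bash
# Prints the matching ignore rule per path; no output for a path means
# it is NOT ignored and would be committed.
git check-ignore -v terraform.tfvars outputs/
```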
## License
MIT