# Kubeadm Cluster Layout (NixOS)

This folder defines role-based NixOS configs for a kubeadm cluster.
## Topology

- Control planes: `cp-1`, `cp-2`, `cp-3`
- Workers: `wk-1`, `wk-2`, `wk-3`
## What this provides

- Shared Kubernetes/node prerequisites in `modules/k8s-common.nix`
- Shared cluster defaults in `modules/k8s-cluster-settings.nix`
- Role-specific settings for control planes and workers
- Generated per-node host configs from `flake.nix` (no duplicated host files)
- Bootstrap helper commands on each node: `th-kubeadm-init`, `th-kubeadm-join-control-plane`, `th-kubeadm-join-worker`, `th-kubeadm-status`
- A Python bootstrap controller for orchestration: `bootstrap/controller.py`
## Layered architecture

- `terraform/`: VM lifecycle only
- `nixos/kubeadm/modules/`: declarative node OS config only
- `nixos/kubeadm/bootstrap/controller.py`: imperative cluster reconciliation state machine
## Hardware config files

The flake automatically imports `hosts/hardware/<host>.nix` if present.
Copy each node's generated hardware config into this folder:

```sh
sudo nixos-generate-config
sudo cp /etc/nixos/hardware-configuration.nix ./hosts/hardware/cp-1.nix
```

Repeat for each node (`cp-2`, `cp-3`, `wk-1`, `wk-2`, `wk-3`).
## Deploy approach

Start from one node at a time while experimenting:

```sh
sudo nixos-rebuild switch --flake .#cp-1
```

For remote target-host workflows, use your preferred deploy wrapper later
(`nixos-rebuild --target-host ...`, deploy-rs, or colmena).
## Bootstrap runbook (kubeadm + kube-vip + Flannel)

- Apply the Nix config on all nodes (`cp-*`, then `wk-*`).
- On `cp-1`, run:

  ```sh
  sudo th-kubeadm-init
  ```

  This infers the control-plane VIP as `<node-subnet>.250` on `eth0`, creates the
  kube-vip static pod manifest, and runs `kubeadm init`.
- Install Flannel from `cp-1`:

  ```sh
  kubectl apply -f https://raw.githubusercontent.com/flannel-io/flannel/v0.25.5/Documentation/kube-flannel.yml
  ```

- Generate join commands on `cp-1`:

  ```sh
  sudo kubeadm token create --print-join-command
  sudo kubeadm init phase upload-certs --upload-certs
  ```

- Join `cp-2` and `cp-3`:

  ```sh
  sudo th-kubeadm-join-control-plane '<kubeadm join ... --control-plane --certificate-key ...>'
  ```

- Join workers:

  ```sh
  sudo th-kubeadm-join-worker '<kubeadm join ...>'
  ```

- Validate from a control plane:

  ```sh
  kubectl get nodes -o wide
  kubectl -n kube-system get pods -o wide
  ```
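The VIP inference step above can be sketched in Python (a hypothetical helper for illustration only; the real logic lives in the Nix-provided `th-kubeadm-init` script): the node's `eth0` IPv4 address has its last octet replaced with the configured suffix, 250 by default.

```python
import ipaddress


def infer_control_plane_vip(node_ip: str, suffix: int = 250) -> str:
    """Replace the host octet of the node's IPv4 address with the VIP suffix.

    Assumes a /24-style node subnet: a node at 192.168.1.21 maps to a
    VIP of 192.168.1.250. Hypothetical helper, for illustration only.
    """
    addr = ipaddress.IPv4Address(node_ip)  # validates the input address
    octets = str(addr).split(".")
    octets[-1] = str(suffix)
    return ".".join(octets)


print(infer_control_plane_vip("192.168.1.21"))  # -> 192.168.1.250
```

The `suffix` parameter corresponds to `controlPlaneVipSuffix` in `modules/k8s-cluster-settings.nix`, which is where you change it if `.250` is taken on your subnet.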
## Fresh bootstrap flow (recommended)

- Copy and edit the inventory:

  ```sh
  cp ./scripts/inventory.example.env ./scripts/inventory.env
  $EDITOR ./scripts/inventory.env
  ```

- Rebuild all nodes and bootstrap a fresh cluster:

  ```sh
  ./scripts/rebuild-and-bootstrap.sh
  ```

Optional tuning env vars:

```sh
FAST_MODE=1 WORKER_PARALLELISM=3 REBUILD_TIMEOUT=45m REBUILD_RETRIES=2 ./scripts/rebuild-and-bootstrap.sh
```

- `FAST_MODE=1` skips pre-rebuild remote GC cleanup to reduce wall-clock time.
- Set `FAST_MODE=0` for a slower but more aggressive space cleanup pass.
## Bootstrap controller state

The controller stores checkpoints in two places:

- Remote (source of truth): `/var/lib/terrahome/bootstrap-state.json` on `cp-1`
- Local copy (workflow/debug artifact): `nixos/kubeadm/bootstrap/bootstrap-state-last.json`

This makes retries resumable and keeps failure context visible from CI.
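A minimal sketch of this checkpointing pattern (the `completed_phases` field name is illustrative; see `bootstrap/controller.py` for the real state format): each finished phase is persisted immediately, so a rerun can skip work that already succeeded.

```python
import json
from pathlib import Path

# Remote source of truth on cp-1, per the docs above.
STATE_PATH = Path("/var/lib/terrahome/bootstrap-state.json")


def load_state(path: Path = STATE_PATH) -> dict:
    """Return the saved checkpoint state, or a fresh one if none exists."""
    if path.exists():
        return json.loads(path.read_text())
    return {"completed_phases": []}


def checkpoint(state: dict, phase: str, path: Path = STATE_PATH) -> None:
    """Mark a phase done and persist immediately so a retry can resume here."""
    if phase not in state["completed_phases"]:
        state["completed_phases"].append(phase)
    path.write_text(json.dumps(state, indent=2))


def run_phase(state: dict, phase: str, action, path: Path = STATE_PATH) -> None:
    """Run `action` unless the checkpoint shows the phase already finished."""
    if phase in state["completed_phases"]:
        return  # resumable: skip work that already succeeded
    action()
    checkpoint(state, phase, path)
```

On failure, the state file written so far is exactly the context the local `bootstrap-state-last.json` copy surfaces in CI.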
If you only want to reset Kubernetes state on existing VMs:

```sh
./scripts/reset-cluster-nodes.sh
```

For a full nuke/recreate lifecycle:

- run Terraform destroy/apply for the VMs first,
- then run `./scripts/rebuild-and-bootstrap.sh` again.
Node lists now come directly from static Terraform outputs, so bootstrap no longer depends on Proxmox guest-agent IP discovery or SSH subnet scanning.
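That handoff can be sketched like this (the output names `control_plane_vm_ipv4` and `worker_vm_ipv4` come from the workflow section below, but the map-of-hostname-to-IP shape of each output is an assumption; check the real Terraform schema):

```python
import json
import subprocess


def terraform_node_ips(tf_dir: str = "terraform") -> dict[str, list[str]]:
    """Read node IPs from `terraform output -json`, replacing guest-agent/SSH discovery."""
    raw = subprocess.run(
        ["terraform", f"-chdir={tf_dir}", "output", "-json"],
        check=True, capture_output=True, text=True,
    ).stdout
    return parse_node_ips(raw)


def parse_node_ips(raw_json: str) -> dict[str, list[str]]:
    """Extract control-plane and worker IP lists from Terraform's JSON output.

    `terraform output -json` wraps each output as {"value": ..., "type": ...};
    the hostname->IP map inside "value" is assumed here for illustration.
    """
    outputs = json.loads(raw_json)
    return {
        "control_planes": list(outputs["control_plane_vm_ipv4"]["value"].values()),
        "workers": list(outputs["worker_vm_ipv4"]["value"].values()),
    }
```

Because the inputs are static Terraform outputs, the node inventory is known before any VM is reachable, which is what lets the fresh-run bootstrap drop the discovery step.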
## Optional Gitea workflow automation

Primary flow:

- Push to `master` triggers `.gitea/workflows/terraform-apply.yml`
- That workflow runs Terraform apply and then a fresh kubeadm bootstrap automatically

Manual dispatch workflows are available:

- `.gitea/workflows/kubeadm-bootstrap.yml`
- `.gitea/workflows/kubeadm-reset.yml`

Required repository secrets:

- Existing Terraform/backend secrets used by current workflows (`B2_*`, `PM_API_TOKEN_SECRET`, `SSH_KEY_PUBLIC`)
- SSH private key: prefer `KUBEADM_SSH_PRIVATE_KEY`, falling back to the existing `SSH_KEY_PRIVATE`

Optional secrets:

- `KUBEADM_SSH_USER` (defaults to `micqdf`)

Node IPs are rendered directly from static Terraform outputs (`control_plane_vm_ipv4`, `worker_vm_ipv4`), so you do not need per-node IP secrets or SSH discovery fallbacks.
## Notes

- Scripts are intentionally manual-triggered (predictable for homelab bring-up).
- If `.250` on the node subnet is already in use, change `controlPlaneVipSuffix` in `modules/k8s-cluster-settings.nix` before bootstrap.