
Kubeadm Cluster Layout (NixOS)

This folder defines role-based NixOS configs for a kubeadm cluster.

Topology

  • Control planes: cp-1, cp-2, cp-3
  • Workers: wk-1, wk-2, wk-3

What this provides

  • Shared Kubernetes/node prerequisites in modules/k8s-common.nix
  • Shared cluster defaults in modules/k8s-cluster-settings.nix
  • Role-specific settings for control planes and workers
  • Generated per-node host configs from flake.nix (no duplicated host files)
  • Bootstrap helper commands on each node:
    • th-kubeadm-init
    • th-kubeadm-join-control-plane
    • th-kubeadm-join-worker
    • th-kubeadm-status
  • A Python bootstrap controller for orchestration:
    • bootstrap/controller.py

Layered architecture

  • terraform/: VM lifecycle only
  • nixos/kubeadm/modules/: declarative node OS config only
  • nixos/kubeadm/bootstrap/controller.py: imperative cluster reconciliation state machine

Hardware config files

The flake automatically imports hosts/hardware/<host>.nix when present. On each node, generate the hardware config, then copy it into hosts/hardware/ in this repo:

sudo nixos-generate-config
sudo cp /etc/nixos/hardware-configuration.nix ./hosts/hardware/cp-1.nix

Repeat for each node (cp-2, cp-3, wk-1, wk-2, wk-3).
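Repeating this by hand for six nodes is tedious, so the copies can be scripted. A minimal sketch, assuming SSH access to every node; the "admin" user and the DRY_RUN switch are illustrative, not part of this repo:

```shell
# Node names come from the topology above; "admin" is a placeholder SSH user.
nodes="cp-1 cp-2 cp-3 wk-1 wk-2 wk-3"
for node in $nodes; do
  cmd="scp admin@$node:/etc/nixos/hardware-configuration.nix ./hosts/hardware/$node.nix"
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$cmd"   # print only; set DRY_RUN=0 to actually copy
  else
    $cmd
  fi
done
```

With the default DRY_RUN=1 the loop only prints the scp commands, which is a safe way to check the paths before running it for real.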

Deploy approach

Rebuild one node at a time while experimenting:

sudo nixos-rebuild switch --flake .#cp-1

For remote target-host workflows, use your preferred deploy wrapper later (nixos-rebuild --target-host ... or deploy-rs/colmena).

Bootstrap runbook (kubeadm + kube-vip + Cilium)

  1. Apply Nix config on all nodes (cp-*, then wk-*).
  2. On cp-1, run:
sudo th-kubeadm-init

This infers the control-plane VIP as <node-subnet>.250 on eth0, creates the kube-vip static pod manifest, and runs kubeadm init.
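The VIP inference above amounts to replacing the last octet of the node's address with the configured suffix. A sketch under that assumption; the real logic lives in the th-kubeadm-init helper, and the variable names here are illustrative:

```shell
# Example: derive the control-plane VIP from a node's eth0 address.
node_ip="10.27.27.11"        # example address on eth0
subnet="${node_ip%.*}"       # drop the last octet -> "10.27.27"
vip="${subnet}.250"          # default controlPlaneVipSuffix is 250
echo "$vip"
```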

  3. Install Cilium from cp-1:
helm repo add cilium https://helm.cilium.io
helm repo update
helm upgrade --install cilium cilium/cilium \
  --namespace kube-system \
  --set kubeProxyReplacement=true
  4. Generate join commands on cp-1:
sudo kubeadm token create --print-join-command
sudo kubeadm init phase upload-certs --upload-certs
  5. Join cp-2 and cp-3:
sudo th-kubeadm-join-control-plane '<kubeadm join ... --control-plane --certificate-key ...>'
  6. Join workers:
sudo th-kubeadm-join-worker '<kubeadm join ...>'
  7. Validate from a control plane:
kubectl get nodes -o wide
kubectl -n kube-system get pods -o wide
  8. Copy and edit the inventory:
cp ./scripts/inventory.example.env ./scripts/inventory.env
$EDITOR ./scripts/inventory.env
  9. Rebuild all nodes and bootstrap/reconcile the cluster:
./scripts/rebuild-and-bootstrap.sh

Optional tuning env vars:

FAST_MODE=1 WORKER_PARALLELISM=3 REBUILD_TIMEOUT=45m REBUILD_RETRIES=2 ./scripts/rebuild-and-bootstrap.sh
  • FAST_MODE=1 skips pre-rebuild remote GC cleanup to reduce wall-clock time.
  • Set FAST_MODE=0 for a slower but more aggressive space cleanup pass.

Bootstrap controller state

The controller stores checkpoints in two places:

  • Remote (source of truth): /var/lib/terrahome/bootstrap-state.json on cp-1
  • Local copy (workflow/debug artifact): nixos/kubeadm/bootstrap/bootstrap-state-last.json

This makes retries resumable and keeps failure context visible from CI.
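When debugging a failed run, it can help to peek at the checkpoint file directly. A hedged sketch; the field names below are assumptions for illustration, and the real schema is whatever bootstrap/controller.py writes:

```shell
# Build a sample checkpoint file, then extract one field with POSIX tools
# (no jq dependency assumed).
state=/tmp/bootstrap-state-example.json
cat > "$state" <<'EOF'
{"phase": "join-workers", "completed": ["init", "cilium"]}
EOF
phase=$(grep -o '"phase": "[^"]*"' "$state" | cut -d'"' -f4)
echo "$phase"
```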

To reset only Kubernetes state on existing VMs:
./scripts/reset-cluster-nodes.sh

For a full nuke/recreate lifecycle:

  • Run Terraform destroy/apply for the VMs first.
  • Then run ./scripts/rebuild-and-bootstrap.sh again.

Node lists are discovered from Terraform outputs, so adding new workers/control planes in Terraform is picked up automatically by the bootstrap/reconcile flow.
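The output names control_plane_vm_ipv4 and worker_vm_ipv4 come from this repo's Terraform config; a sketch of reading one of them, with a hard-coded stand-in for the `terraform -chdir=terraform output -json worker_vm_ipv4` result so the parsing step is visible:

```shell
# Stand-in for the JSON array that `terraform output -json` would emit.
tf_out='["10.27.27.21","10.27.27.22","10.27.27.23"]'
# Minimal parse for a flat JSON array of strings (assumes no commas or
# brackets inside the values, which holds for IPv4 addresses).
workers=$(echo "$tf_out" | tr -d '[]"' | tr ',' ' ')
echo "$workers"
```

In practice a real parser (jq) is preferable; the tr pipeline is only safe because IP addresses cannot contain the stripped characters.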

Optional Gitea workflow automation

Primary flow:

  • Push to master triggers .gitea/workflows/terraform-apply.yml
  • That workflow runs Terraform apply and then the kubeadm rebuild/bootstrap reconciliation automatically

Manual dispatch workflows are available:

  • .gitea/workflows/kubeadm-bootstrap.yml
  • .gitea/workflows/kubeadm-reset.yml

Required repository secrets:

  • Existing Terraform/backend secrets used by current workflows (B2_*, PM_API_TOKEN_SECRET, SSH_KEY_PUBLIC)
  • SSH private key: prefer KUBEADM_SSH_PRIVATE_KEY, fallback to existing SSH_KEY_PRIVATE

Optional secrets:

  • KUBEADM_SSH_USER (defaults to micqdf)
  • KUBEADM_SUBNET_PREFIX (optional, e.g. 10.27.27; used for SSH-based IP discovery fallback)

Node IPs are auto-discovered from Terraform state outputs (control_plane_vm_ipv4, worker_vm_ipv4), so you do not need per-node IP secrets.

Notes

  • Scripts are intentionally manually triggered (predictable for homelab bring-up).
  • If .250 on the node subnet is already in use, change controlPlaneVipSuffix in modules/k8s-cluster-settings.nix before bootstrap.