# Kubeadm Cluster Layout (NixOS)

This folder defines role-based NixOS configs for a kubeadm cluster.

## Topology

- Control planes: `cp-1`, `cp-2`, `cp-3`
- Workers: `wk-1`, `wk-2`, `wk-3`

## What this provides

- Shared Kubernetes/node prerequisites in `modules/k8s-common.nix`
- Shared cluster defaults in `modules/k8s-cluster-settings.nix`
- Role-specific settings for control planes and workers
- Generated per-node host configs from `flake.nix` (no duplicated host files)
- Bootstrap helper commands on each node:
  - `th-kubeadm-init`
  - `th-kubeadm-join-control-plane`
  - `th-kubeadm-join-worker`
  - `th-kubeadm-status`
- A Python bootstrap controller for orchestration:
  - `bootstrap/controller.py`

## Layered architecture

- `terraform/`: VM lifecycle only
- `nixos/kubeadm/modules/`: declarative node OS config only
- `nixos/kubeadm/bootstrap/controller.py`: imperative cluster reconciliation state machine

## Hardware config files

The flake automatically imports `hosts/hardware/<node>.nix` if present. Copy each node's generated hardware config into this folder:

```bash
sudo nixos-generate-config
sudo cp /etc/nixos/hardware-configuration.nix ./hosts/hardware/cp-1.nix
```

Repeat for each node (`cp-2`, `cp-3`, `wk-1`, `wk-2`, `wk-3`).

## Deploy approach

Apply to one node at a time while experimenting:

```bash
sudo nixos-rebuild switch --flake .#cp-1
```

For remote target-host workflows, use your preferred deploy wrapper later (`nixos-rebuild --target-host ...` or deploy-rs/colmena).

## Bootstrap runbook (kubeadm + kube-vip + Flannel)

1. Apply the Nix config on all nodes (`cp-*`, then `wk-*`).
2. On `cp-1`, run:

   ```bash
   sudo th-kubeadm-init
   ```

   This infers the control-plane VIP as `.250` on `eth0`, creates the kube-vip static pod manifest, and runs `kubeadm init`.
3. Install Flannel from `cp-1`:

   ```bash
   kubectl apply -f https://raw.githubusercontent.com/flannel-io/flannel/v0.25.5/Documentation/kube-flannel.yml
   ```
4. Generate join commands on `cp-1`:

   ```bash
   sudo kubeadm token create --print-join-command
   sudo kubeadm init phase upload-certs --upload-certs
   ```
5. Join `cp-2` and `cp-3`:

   ```bash
   sudo th-kubeadm-join-control-plane '<join command from step 4>'
   ```
6. Join workers:

   ```bash
   sudo th-kubeadm-join-worker '<join command from step 4>'
   ```
7. Validate from a control plane:

   ```bash
   kubectl get nodes -o wide
   kubectl -n kube-system get pods -o wide
   ```

## Fresh bootstrap flow (recommended)

1. Copy and edit the inventory:

   ```bash
   cp ./scripts/inventory.example.env ./scripts/inventory.env
   $EDITOR ./scripts/inventory.env
   ```
2. Rebuild all nodes and bootstrap a fresh cluster:

   ```bash
   ./scripts/rebuild-and-bootstrap.sh
   ```

   Optional tuning env vars:

   ```bash
   FAST_MODE=1 WORKER_PARALLELISM=3 REBUILD_TIMEOUT=45m REBUILD_RETRIES=2 ./scripts/rebuild-and-bootstrap.sh
   ```

   - `FAST_MODE=1` skips the pre-rebuild remote GC cleanup to reduce wall-clock time.
   - `FAST_MODE=0` runs a slower but more aggressive space cleanup pass.
3. If you only want to reset Kubernetes state on existing VMs:

   ```bash
   ./scripts/reset-cluster-nodes.sh
   ```

For a full nuke/recreate lifecycle:

- run Terraform destroy/apply for the VMs first,
- then run `./scripts/rebuild-and-bootstrap.sh` again.

Node lists now come directly from static Terraform outputs, so bootstrap no longer depends on Proxmox guest-agent IP discovery or SSH subnet scanning.

### Bootstrap controller state

The controller stores checkpoints in two places:

- Remote (source of truth): `/var/lib/terrahome/bootstrap-state.json` on `cp-1`
- Local copy (workflow/debug artifact): `nixos/kubeadm/bootstrap/bootstrap-state-last.json`

This makes retries resumable and keeps failure context visible from CI.
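The `REBUILD_RETRIES` and `REBUILD_TIMEOUT` knobs above imply a retry loop around each per-node rebuild. A minimal sketch of that pattern follows; the `retry` helper and its exact behavior are illustrative assumptions, not the actual code in `rebuild-and-bootstrap.sh`:

```bash
# Hypothetical retry helper: runs a command up to N times,
# succeeding on the first pass that exits 0.
retry() {
  local attempts="$1"; shift
  local i
  for ((i = 1; i <= attempts; i++)); do
    "$@" && return 0
    echo "attempt ${i}/${attempts} failed, retrying" >&2
  done
  return 1
}

# In the real script this would wrap something like:
#   retry "$REBUILD_RETRIES" timeout "$REBUILD_TIMEOUT" nixos-rebuild switch --flake .#cp-1
retry 2 true && echo "rebuild ok"
```

Combining `timeout` with a bounded retry count keeps a single wedged node from stalling the whole bootstrap indefinitely.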
## Optional Gitea workflow automation

Primary flow:

- A push to `master` triggers `.gitea/workflows/terraform-apply.yml`
- That workflow runs Terraform apply and then a fresh kubeadm bootstrap automatically

Manual dispatch workflows are also available:

- `.gitea/workflows/kubeadm-bootstrap.yml`
- `.gitea/workflows/kubeadm-reset.yml`

Required repository secrets:

- The existing Terraform/backend secrets used by current workflows (`B2_*`, `PM_API_TOKEN_SECRET`, `SSH_KEY_PUBLIC`)
- An SSH private key: `KUBEADM_SSH_PRIVATE_KEY` is preferred, with fallback to the existing `SSH_KEY_PRIVATE`

Optional secrets:

- `KUBEADM_SSH_USER` (defaults to `micqdf`)

Node IPs are rendered directly from static Terraform outputs (`control_plane_vm_ipv4`, `worker_vm_ipv4`), so you do not need per-node IP secrets or SSH discovery fallbacks.

## Notes

- Scripts are intentionally manual-triggered (predictable for homelab bring-up).
- If `.250` on the node subnet is already in use, change `controlPlaneVipSuffix` in `modules/k8s-cluster-settings.nix` before bootstrap.
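The `.250` VIP inference mentioned above amounts to: take the node's IPv4, keep its subnet prefix, and swap in the configured host suffix. A rough sketch of that computation, assuming a /24 subnet; the `vip_for` function is illustrative, not the repo's actual implementation:

```bash
# Illustrative only: derive the kube-vip address from a node IP and a
# host suffix (250 mirrors the default controlPlaneVipSuffix).
# ${node_ip%.*} strips the last octet, keeping the /24 prefix.
vip_for() {
  local node_ip="$1" suffix="${2:-250}"
  echo "${node_ip%.*}.${suffix}"
}

vip_for 192.168.1.11      # → 192.168.1.250
vip_for 10.0.42.7 240     # → 10.0.42.240
```

This is why changing `controlPlaneVipSuffix` is enough to move the VIP: the prefix always comes from the node's own address on `eth0`.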