kubeadm v1beta4 expects nodeRegistration.kubeletExtraArgs as a list of name/value args, not a map. Switch hostname-override to the correct structure so init config unmarshals successfully.
Clear kubelet cert/bootstrap artifacts after reset and force hostname override in kubeadm nodeRegistration so the node consistently registers as cp-1 instead of inheriting stale template identity.
Set hostname before init and inject nodeRegistration.name into kubeadm InitConfiguration so cp-1 registers as the expected node (cp-1) instead of inheriting the template hostname. This fixes upload-config/kubelet failures caused by node lookup for k8s-base-template.
Always re-run primary init when reconcile performs node rebuilds to avoid stale/partial cluster state causing join preflight failures. Also add wantedBy for kubelet so systemctl enable works as expected during join/init flows.
Clear completed bootstrap stage checkpoints whenever nodes are rebuilt so reconcile does not skip required init/cni/join work on fresh hosts. Also pass explicit --node-name for control-plane and worker joins, and ensure kubelet is enabled before join commands run.
Enable kubelet before kubeadm init and stop forcing kubelet out of wantedBy so kubeadm can reliably register the node during upload-config/kubelet. Also clear stale kubelet config files during remote prep to avoid restart-loop leftovers.
The previous replacement changed both mountPath and hostPath, causing kube-vip to lose its expected in-container kubeconfig path and exit. Keep mountPath at /etc/kubernetes/admin.conf, swap only hostPath during bootstrap, and enable kube-vip debug log level.
Run first-control-plane kube-vip manifest without --leaderElection so VIP can bind before API/RBAC are fully available. Also print kube-vip container exit details on failure.
Remove --services from kube-vip static pod manifests for init/join. Service LB mode can crash-loop during kubeadm bootstrap before cluster RBAC is ready, which prevented VIP binding.
Move kubeadm reset ahead of kube-vip manifest generation, use super-admin.conf during bootstrap for kube-vip, and restore admin.conf after init. Also switch nixos-rebuild to --sudo and make QEMU guest agent optional so Terraform plan can skip slow guest-agent refreshes when it is not installed.
- Start kube-vip as a detached container to claim VIP before kubeadm init
- Wait for VIP to be bound before proceeding
- Generate static pod manifest for kube-vip
- Stop bootstrap kube-vip after API server is healthy (static pod takes over)
- Add kube-vip logs output if VIP fails to bind
- Use --skip-phases=wait-control-plane to avoid 4-minute timeout
- Wait for kube-vip to bind VIP before checking API server health
- Add kube-vip logs and VIP status to debug output
- Add authorization.mode: AlwaysAllow to KubeletConfiguration
- Remove stale kubelet config.yaml before unmasking in all kubeadm scripts
- This prevents 'no client provided, cannot use webhook authorization' error
- Use explicit kubeadm config file with KubeletConfiguration
- Disable webhook authentication which was causing 'no client provided' error
- Add ConditionPathExists to kubelet systemd unit
- Create /var/lib/kubelet and /var/lib/kubelet/pki directories via tmpfiles
- Ensure containerd is running before kubeadm init
- Add kubelet logs output on kubeadm init failure for debugging
- Remove ConditionPathExists from kubelet service definition as it
prevents kubelet from starting when managed by kubeadm
- Add systemctl daemon-reload after unmasking in all kubeadm scripts
- Add reset-failed for consistent state cleanup
- Mask kubelet service entirely before nixos-rebuild to prevent systemd
from restarting it during switch
- Unmask kubelet in th-kubeadm-init/join scripts before starting
Add wantedBy = [] to prevent kubelet from being started by multi-user.target
during nixos-rebuild switch. This allows rebuilds to succeed even when the
cluster is in a transitional state. Kubelet will be started by kubeadm
init/join commands instead.
Provision 3 thin control planes and 3 workers with role-specific sizing and VMID ranges (701/711), generate per-node cloud-init snippets with SSH key injection, and add NixOS kubeadm host/module scaffolding for cp-1..3 and wk-1..3.