Fresh destroy/recreate cycles change VM host keys, which was breaking bootstrap after rebuilds. Use a disposable known-hosts policy in the controller SSH options so automation does not fail on expected key rotation.
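A minimal sketch of what a disposable known-hosts policy might look like, assuming the controller builds its `ssh` argument list in Python. The helper name `ssh_base_args` and the timeout default are hypothetical; the `-o` settings are standard OpenSSH client options.

```python
def ssh_base_args(user: str, host: str, connect_timeout: int = 10) -> list[str]:
    """Build ssh arguments that tolerate host-key changes on rebuilt VMs.

    Hypothetical helper: the real controller may assemble options differently.
    """
    return [
        "ssh",
        # Do not prompt or fail when the host key is unknown or has changed.
        "-o", "StrictHostKeyChecking=no",
        # Discard learned keys so stale entries never accumulate.
        "-o", "UserKnownHostsFile=/dev/null",
        # Suppress the "permanently added" warning on every connection.
        "-o", "LogLevel=ERROR",
        "-o", f"ConnectTimeout={connect_timeout}",
        f"{user}@{host}",
    ]
```

This trades host-key verification for rebuild tolerance, which is only reasonable on a trusted network where key rotation is expected.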
The template machine can lose connectivity when rebuilt directly because it has no cloud-init network data during template maintenance. Restore DHCP as the default for the template itself while keeping cloud-init + networkd enabled so cloned VMs can still consume injected network settings.
The gitea.com checkout action mirror is timing out during workflow startup. Use actions/checkout@v4 directly so jobs do not fail before any repository logic runs.
Make SSH bootstrap failures explain the real root cause when fresh clones never accept the injected user/key: the Proxmox source template itself still needs the updated cloud-init-capable NixOS configuration.
Freshly recreated VMs were reachable but did not accept the injected SSH key, which indicates Proxmox cloud-init settings were not being applied. Enable cloud-init and cloud-init network handling in the base template so static IPs, hostname, ciuser, and SSH keys take effect on first boot.
Freshly recreated VMs can take a few minutes before cloud-init users and SSH are available. Retry SSH authentication in the bootstrap controller before failing so rebuild/bootstrap does not abort immediately on new hosts.
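The retry behavior could be sketched roughly as follows. This assumes the controller has some callable that attempts a single SSH login and returns whether it succeeded; `wait_for_ssh`, `probe`, and the default attempt count and delay are illustrative names and values, not taken from the repository.

```python
import time
from typing import Callable


def wait_for_ssh(
    probe: Callable[[], bool],
    attempts: int = 30,
    delay_s: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call probe() until it succeeds and return the attempt number that worked.

    Raises TimeoutError once all attempts are exhausted. The sleep function is
    injectable so the loop can be exercised without real waiting.
    """
    for attempt in range(1, attempts + 1):
        if probe():
            return attempt
        # Only wait if another attempt is still coming.
        if attempt < attempts:
            sleep(delay_s)
    raise TimeoutError(f"SSH not ready after {attempts} attempts")
```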
Use terraform plan -refresh=false for destroy workflows so manual NUKE runs do not spend minutes refreshing Proxmox VM state before building the destroy plan.
Stop depending on GitHub during cluster bring-up by shipping the Flannel manifest in-repo, ensure required host paths exist on NixOS nodes, and wait/retry against a stable API before applying the CNI. This removes the TLS handshake timeout failure mode and makes early network bootstrap deterministic.
Flannel pods were stuck because kubelet could not execute mount for projected service account volumes on NixOS. Add util-linux to the kubelet systemd PATH so mount is available during volume setup.
When kube-flannel daemonset rollout stalls, print pod descriptions and per-container logs for the init containers and main flannel container so the next failure shows the actual cause instead of only Init:0/2.
Terraform variable validation blocks can only reference the variable under validation. Replace count-based checks with fixed-length validations for the current 3 control planes and 3 workers.
Make Terraform the source of truth for node IPs, remove guest-agent/SSH discovery from the normal workflow path, simplify the bootstrap controller to a fresh-run flow, and swap the initial CNI to Flannel so cluster readiness is easier to prove before reintroducing more complex reconcile behavior.
Set Cilium k8sServiceHost/k8sServicePort to the primary control-plane API endpoint to avoid in-cluster service routing dependency during early bootstrap. Also print cilium daemonset/pod/log diagnostics when rollout times out.
Set Cilium kubeProxyReplacement from env (default false for homelab stability) and collect cilium daemonset/pod/log diagnostics when rollout times out during verification.
Before control-plane and worker joins, remove stale kubelet/kubernetes identity files and run kubeadm reset -f. This prevents preflight failures like FileAvailable--etc-kubernetes-kubelet.conf during repeated reconcile attempts.
When Proxmox guest-agent IPs are empty and SSH discovery returns duplicate generic hostnames (e.g. flex), assign remaining missing nodes from unmatched SSH-reachable IPs in deterministic order. Also emit SSH-reachable IP diagnostics on failure.
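One way the deterministic fallback assignment might look, as a sketch only. It assumes IPv4 addresses, node names sorted lexicographically, and spare IPs sorted numerically; the function name `assign_missing_nodes` and its parameters are hypothetical.

```python
import ipaddress


def assign_missing_nodes(
    expected: list[str],
    resolved: dict[str, str],
    reachable_ips: list[str],
) -> dict[str, str]:
    """Fill in nodes without a known IP from unmatched SSH-reachable IPs.

    Both the missing names and the spare IPs are sorted, so the same inputs
    always produce the same mapping.
    """
    used = set(resolved.values())
    spare = sorted(
        (ip for ip in set(reachable_ips) if ip not in used),
        key=ipaddress.ip_address,  # numeric order, so .9 sorts before .10
    )
    missing = sorted(name for name in expected if name not in resolved)
    result = dict(resolved)
    # zip stops at the shorter list, so nodes stay unassigned if IPs run out.
    result.update(zip(missing, spare))
    return result
```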
Update verification stage to block on cilium daemonset rollout and all nodes reaching Ready. This prevents workflows from reporting success while the cluster is still NotReady immediately after join.
Append --ignore-preflight-errors=NumCPU,HTTPProxyCIDR to control-plane join commands and HTTPProxyCIDR to worker joins so kubeadm join does not fail on known single-CPU/proxy CIDR checks in this environment.
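A small sketch of how the per-role flag could be derived, assuming the controller assembles join commands as strings. The helper `join_preflight_flag` and the role label `control-plane` are hypothetical; the check names come from the commit message above.

```python
def join_preflight_flag(role: str) -> str:
    """Return the --ignore-preflight-errors flag for a kubeadm join of the given role."""
    if role == "control-plane":
        ignored = ["NumCPU", "HTTPProxyCIDR"]
    else:
        # Workers only need the proxy CIDR check relaxed.
        ignored = ["HTTPProxyCIDR"]
    return "--ignore-preflight-errors=" + ",".join(ignored)
```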
Inline kubelet bootstrap/kubeconfig flags in ExecStart and gate startup on /etc/kubernetes/*kubelet.conf in addition to config.yaml. This prevents kubelet entering standalone mode with webhook auth enabled when no client config is present.
Exclude the configured VIP suffix from subnet scans and prefer non-VIP IPs when multiple SSH endpoints resolve to the same node. This prevents cp-1 being discovered as .250 and later failing SSH commands against the floating VIP.
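The two rules above can be illustrated with a short sketch. It assumes a /24-style scan over a dotted prefix and treats the VIP as a single known address; `scan_candidates` and `pick_node_ip` are hypothetical names, not the repository's functions.

```python
def scan_candidates(prefix: str, vip_suffix: int, start: int = 1, end: int = 254) -> list[str]:
    """List host addresses to probe, leaving out the floating VIP's suffix."""
    return [f"{prefix}.{i}" for i in range(start, end + 1) if i != vip_suffix]


def pick_node_ip(ips: list[str], vip: str) -> str:
    """Choose one address for a node, preferring any address that is not the VIP."""
    if not ips:
        raise ValueError("no candidate IPs")
    non_vip = sorted(ip for ip in ips if ip != vip)
    # Fall back to the VIP only when it is the sole address seen for the node.
    return non_vip[0] if non_vip else vip
```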
Include KUBELET_KUBECONFIG_ARGS in kubelet ExecStart so kubelet can authenticate with bootstrap-kubelet.conf/kubelet.conf and register node objects during kubeadm init.
When kubeadm init fails at upload-config/kubelet due to a missing node object, explicitly restart kubelet to ensure bootstrap flags are loaded before waiting for node registration. Add a kubelet flag dump and focused registration log output to surface auth/cert errors.
Remove custom KubeletConfiguration from init config so kubeadm uses default kubelet authn/authz settings and bootstrap registration path. This avoids the standalone-style kubelet behavior where the node never appears in the API.
Handle kubeadm init failures where upload-config/kubelet runs before the node object exists. When that specific error occurs, wait for cp-1 registration and run upload-config kubelet phase explicitly instead of aborting immediately.
kubeadm v1beta4 expects nodeRegistration.kubeletExtraArgs as a list of name/value args, not a map. Switch hostname-override to the correct structure so init config unmarshals successfully.
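To illustrate the shape change, here is a sketch that converts a map of extra args into the name/value list form. The helper `kubelet_extra_args` is hypothetical, and the surrounding dictionary only mirrors the relevant part of a `nodeRegistration` section.

```python
def kubelet_extra_args(args: dict[str, str]) -> list[dict[str, str]]:
    """Convert a flag map into the list of name/value entries used by v1beta4."""
    # Sorting keeps the rendered config stable between runs.
    return [{"name": name, "value": value} for name, value in sorted(args.items())]


# Illustrative nodeRegistration fragment for the first control plane.
node_registration = {
    "name": "cp-1",
    "kubeletExtraArgs": kubelet_extra_args({"hostname-override": "cp-1"}),
}
```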
Clear kubelet cert/bootstrap artifacts after reset and force hostname override in kubeadm nodeRegistration so the node consistently registers as cp-1 instead of inheriting stale template identity.
Set hostname before init and inject nodeRegistration.name into kubeadm InitConfiguration so cp-1 registers as the expected node (cp-1) instead of inheriting the template hostname. This fixes upload-config/kubelet failures caused by node lookup for k8s-base-template.
Always re-run primary init when reconcile performs node rebuilds to avoid stale/partial cluster state causing join preflight failures. Also add wantedBy for kubelet so systemctl enable works as expected during join/init flows.
Clear completed bootstrap stage checkpoints whenever nodes are rebuilt so reconcile does not skip required init/cni/join work on fresh hosts. Also pass explicit --node-name for control-plane and worker joins, and ensure kubelet is enabled before join commands run.
Some freshly cloned VMs still report template/generic hostnames during discovery. Probe DMI product serial over SSH and map it to Terraform VMIDs so cp-2/cp-3/wk-2 can be resolved even before hostname reconciliation.
Increase default SSH timeout, reduce scan concurrency, and add a second slower scan pass to avoid transient misses on busy runners. Also print discovered hostnames to improve failure diagnostics when node-name matching fails.
Enable kubelet before kubeadm init and stop forcing kubelet out of wantedBy so kubeadm can reliably register the node during upload-config/kubelet. Also clear stale kubelet config files during remote prep to avoid restart-loop leftovers.
Do not skip node rebuilds unless SKIP_REBUILD=1 is explicitly set. This prevents stale remote helper scripts from being reused across retries after bootstrap logic changes.
Introduce a clean orchestration layer in nixos/kubeadm/bootstrap/controller.py and slim rebuild-and-bootstrap.sh into a thin wrapper. The controller now owns preflight, rebuild, init, CNI install, join, and verify stages with persisted checkpoints on cp-1 plus a local state copy for CI debugging.
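A rough sketch of stage checkpointing in this style, using a local JSON file only. The stage keys, the `completed` field, and the helper names are assumptions for illustration; the actual state layout and the remote copy on cp-1 are not modeled here.

```python
import json
from pathlib import Path

# Assumed stage identifiers, in the order the commit message lists them.
STAGES = ["preflight", "rebuild", "init", "cni", "join", "verify"]


def load_state(path: Path) -> dict:
    """Read the checkpoint file, or start with an empty state if it is absent."""
    if path.exists():
        return json.loads(path.read_text())
    return {"completed": []}


def mark_done(path: Path, stage: str) -> None:
    """Record a finished stage so a retry can skip it."""
    state = load_state(path)
    if stage not in state["completed"]:
        state["completed"].append(stage)
    path.write_text(json.dumps(state, indent=2))


def pending_stages(path: Path) -> list[str]:
    """Return the stages that still need to run, in order."""
    done = set(load_state(path)["completed"])
    return [stage for stage in STAGES if stage not in done]
```

A retry would then iterate over `pending_stages(...)` rather than starting again from preflight.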
The previous replacement changed both mountPath and hostPath, causing kube-vip to lose its expected in-container kubeconfig path and exit. Keep mountPath at /etc/kubernetes/admin.conf, swap only hostPath during bootstrap, and enable kube-vip debug log level.
Run first-control-plane kube-vip manifest without --leaderElection so VIP can bind before API/RBAC are fully available. Also print kube-vip container exit details on failure.
Remove --services from kube-vip static pod manifests for init/join. Service LB mode can crash-loop during kubeadm bootstrap before cluster RBAC is ready, which prevented VIP binding.
Use terraform plan -refresh=false in plan/apply workflows to avoid slow Proxmox state refresh on every push. This keeps CI fast while preserving apply behavior from the generated plan.
Move kubeadm reset ahead of kube-vip manifest generation, use super-admin.conf during bootstrap for kube-vip, and restore admin.conf after init. Also switch nixos-rebuild to --sudo and make QEMU guest agent optional so Terraform plan can skip slow guest-agent refreshes when it is not installed.
- Start kube-vip as a detached container to claim VIP before kubeadm init
- Wait for VIP to be bound before proceeding
- Generate static pod manifest for kube-vip
- Stop bootstrap kube-vip after API server is healthy (static pod takes over)
- Add kube-vip logs output if VIP fails to bind
- Use --skip-phases=wait-control-plane to avoid 4-minute timeout
- Wait for kube-vip to bind VIP before checking API server health
- Add kube-vip logs and VIP status to debug output
- Add authorization.mode: AlwaysAllow to KubeletConfiguration
- Remove stale kubelet config.yaml before unmasking in all kubeadm scripts
- This prevents 'no client provided, cannot use webhook authorization' error
- Use explicit kubeadm config file with KubeletConfiguration
- Disable webhook authentication which was causing 'no client provided' error
- Add ConditionPathExists to kubelet systemd unit
- Create /var/lib/kubelet and /var/lib/kubelet/pki directories via tmpfiles
- Ensure containerd is running before kubeadm init
- Add kubelet logs output on kubeadm init failure for debugging
- Remove ConditionPathExists from kubelet service definition as it
prevents kubelet from starting when managed by kubeadm
- Add systemctl daemon-reload after unmasking in all kubeadm scripts
- Add reset-failed for consistent state cleanup
- Mask kubelet service entirely before nixos-rebuild to prevent systemd
from restarting it during switch
- Unmask kubelet in th-kubeadm-init/join scripts before starting
Add wantedBy = [] to prevent kubelet from being started by multi-user.target
during nixos-rebuild switch. This allows rebuilds to succeed even when the
cluster is in a transitional state. Kubelet will be started by kubeadm
init/join commands instead.
- `FAST_MODE=1` skips pre-rebuild remote GC cleanup to reduce wall-clock time.
- Set `FAST_MODE=0` for a slower but more aggressive space cleanup pass.
### Bootstrap controller state
The controller stores checkpoints in two places:
- Remote (source of truth): `/var/lib/terrahome/bootstrap-state.json` on `cp-1`
- Local copy (workflow/debug artifact): `nixos/kubeadm/bootstrap/bootstrap-state-last.json`
This makes retries resumable and keeps failure context visible from CI.
3. If you only want to reset Kubernetes state on existing VMs:
```bash
@@ -115,15 +137,15 @@ For a full nuke/recreate lifecycle:
- run Terraform destroy/apply for VMs first,
- then run `./scripts/rebuild-and-bootstrap.sh` again.
Node lists are discovered from Terraform outputs, so adding new workers/control
planes in Terraform is picked up automatically by the bootstrap/reconcile flow.
Node lists now come directly from static Terraform outputs, so bootstrap no longer
depends on Proxmox guest-agent IP discovery or SSH subnet scanning.
## Optional Gitea workflow automation
Primary flow:
- Push to `master` triggers `.gitea/workflows/terraform-apply.yml`
- That workflow now does Terraform apply and then runs kubeadm rebuild/bootstrap reconciliation automatically
- That workflow now does Terraform apply and then runs a fresh kubeadm bootstrap automatically
Manual dispatch workflows are available:
@@ -138,8 +160,7 @@ Required repository secrets:
Optional secrets:
- `KUBEADM_SSH_USER` (defaults to `micqdf`)
Node IPs are auto-discovered from Terraform state outputs (`control_plane_vm_ipv4`, `worker_vm_ipv4`), so you do not need per-node IP secrets.
Node IPs are rendered directly from static Terraform outputs (`control_plane_vm_ipv4`, `worker_vm_ipv4`), so you do not need per-node IP secrets or SSH discovery fallbacks.
  description = "Static IPv4 addresses for worker VMs"
  validation {
    condition     = length(var.worker_ips) == 3
    error_message = "worker_ips must contain exactly 3 IPs."
  }
}
variable "bridge" {
  type = string
}
@@ -99,6 +133,12 @@ variable "pm_api_url" {
  type = string
}
variable "qemu_agent_enabled" {
  type        = bool
  default     = false
  description = "Enable QEMU guest agent integration in Proxmox resources"
}
variable "SSH_KEY_PUBLIC" {
  type        = string
  description = "Public SSH key injected via cloud-init"