TerraHome

Author	SHA1	Message	Date
MichaelFisher1997	a70de061b0	fix: wait for Cilium and node readiness before marking bootstrap success [CI: Terraform Plan (push) successful in 18s] Update verification stage to block on cilium daemonset rollout and all nodes reaching Ready. This prevents workflows from reporting success while the cluster is still NotReady immediately after join.	2026-03-04 22:26:43 +00:00
MichaelFisher1997	5ddd00f711	fix: add join preflight ignores for homelab control planes [CI: Terraform Plan (push) successful in 16s] Append --ignore-preflight-errors=NumCPU,HTTPProxyCIDR to control-plane join commands and HTTPProxyCIDR to worker joins so kubeadm join does not fail on known single-CPU/proxy CIDR checks in this environment.	2026-03-04 21:09:27 +00:00
MichaelFisher1997	034869347a	fix: require kubelet kubeconfig before starting service [CI: Terraform Plan (push) successful in 17s] Inline kubelet bootstrap/kubeconfig flags in ExecStart and gate startup on /etc/kubernetes/*kubelet.conf in addition to config.yaml. This prevents kubelet entering standalone mode with webhook auth enabled when no client config is present.	2026-03-04 20:45:47 +00:00
MichaelFisher1997	f0093deedc	fix: avoid assigning control-plane VIP as node SSH address [CI: Terraform Plan (push) successful in 15s] Exclude the configured VIP suffix from subnet scans and prefer non-VIP IPs when multiple SSH endpoints resolve to the same node. This prevents cp-1 being discovered as .250 and later failing SSH commands against the floating VIP.	2026-03-04 19:26:37 +00:00
MichaelFisher1997	6b6ca021c9	fix: add kubelet bootstrap kubeconfig args to systemd unit [CI: Terraform Plan (push) successful in 17s] Include KUBELET_KUBECONFIG_ARGS in kubelet ExecStart so kubelet can authenticate with bootstrap-kubelet.conf/kubelet.conf and register node objects during kubeadm init.	2026-03-04 19:26:07 +00:00
micqdf	90ef0ec33f	Merge branch 'master' into stage [CI: Terraform Plan (push) successful in 17s]	2026-03-04 18:42:22 +00:00
MichaelFisher1997	ba6cf42c04	fix: restart kubelet during CRISocket recovery and add registration diagnostics [CI: Terraform Plan (push) successful in 16s] When kubeadm init fails at upload-config/kubelet due to a missing node object, explicitly restart kubelet to ensure bootstrap flags are loaded before waiting for node registration. Add kubelet flag dump and focused registration log output to surface auth/cert errors.	2026-03-04 18:37:50 +00:00
MichaelFisher1997	3cd0c70727	fix: stop overriding kubelet config in kubeadm init [CI: Terraform Plan (push) successful in 17s] Remove custom KubeletConfiguration from init config so kubeadm uses default kubelet authn/authz settings and bootstrap registration path. This avoids the standalone-style kubelet behavior where the node never appears in the API.	2026-03-04 18:35:34 +00:00
micqdf	3281ebd216	Merge pull request 'fix: recover from kubeadm CRISocket node-registration race' (#111) from stage into master [CI: Terraform Apply (push) failing after 18m6s] Reviewed-on: #111	2026-03-04 03:03:17 +00:00
MichaelFisher1997	d2dd6105a6	fix: recover from kubeadm CRISocket node-registration race [CI: Terraform Plan (push) successful in 17s] Handle kubeadm init failures where upload-config/kubelet runs before the node object exists. When that specific error occurs, wait for cp-1 registration and run upload-config kubelet phase explicitly instead of aborting immediately.	2026-03-04 03:00:34 +00:00
micqdf	981afc509a	Merge pull request 'fix: use kubeadm v1beta4 list format for kubeletExtraArgs' (#110) from stage into master [CI: Terraform Apply (push) failing after 19m48s] Reviewed-on: #110	2026-03-04 02:32:22 +00:00
MichaelFisher1997	b3c975bd73	fix: use kubeadm v1beta4 list format for kubeletExtraArgs [CI: Terraform Plan (push) successful in 17s] kubeadm v1beta4 expects nodeRegistration.kubeletExtraArgs as a list of name/value args, not a map. Switch hostname-override to the correct structure so init config unmarshals successfully.	2026-03-04 02:00:07 +00:00
micqdf	8aab666fad	Merge pull request 'fix: hard reset kubelet identity before kubeadm init' (#109) from stage into master [CI: Terraform Apply (push) failing after 12m25s] Reviewed-on: #109	2026-03-04 01:42:55 +00:00
MichaelFisher1997	308a2fd4b7	fix: hard reset kubelet identity before kubeadm init [CI: Terraform Plan (push) successful in 17s] Clear kubelet cert/bootstrap artifacts after reset and force hostname override in kubeadm nodeRegistration so the node consistently registers as cp-1 instead of inheriting stale template identity.	2026-03-04 01:35:41 +00:00
micqdf	3fd7ed48b1	Merge pull request 'fix: pin kubeadm init node identity to flake hostname' (#108) from stage into master [CI: Terraform Apply (push) failing after 15m22s] Reviewed-on: #108	2026-03-04 01:18:51 +00:00
MichaelFisher1997	0cc0de2aea	fix: pin kubeadm init node identity to flake hostname [CI: Terraform Plan (push) successful in 17s] Set hostname before init and inject nodeRegistration.name into kubeadm InitConfiguration so cp-1 registers as the expected node (cp-1) instead of inheriting the template hostname. This fixes upload-config/kubelet failures caused by node lookup for k8s-base-template.	2026-03-04 01:17:44 +00:00
micqdf	99458ca829	Merge pull request 'fix: force fresh kubeadm init after rebuild and make kubelet enable-able' (#107) from stage into master [CI: Terraform Apply (push) failing after 17m1s] Reviewed-on: #107	2026-03-04 00:56:30 +00:00
MichaelFisher1997	422b7d7f23	fix: force fresh kubeadm init after rebuild and make kubelet enable-able [CI: Terraform Plan (push) successful in 17s] Always re-run primary init when reconcile performs node rebuilds to avoid stale/partial cluster state causing join preflight failures. Also add wantedBy for kubelet so systemctl enable works as expected during join/init flows.	2026-03-04 00:55:20 +00:00
micqdf	adc8a620f4	Merge pull request 'fix: force fresh bootstrap stages after rebuild and stabilize join node identity' (#106) from stage into master [CI: Terraform Apply (push) failing after 20m28s] Reviewed-on: #106	2026-03-04 00:32:06 +00:00
MichaelFisher1997	3ebeb121b4	fix: force fresh bootstrap stages after rebuild and stabilize join node identity [CI: Terraform Plan (push) successful in 17s] Clear completed bootstrap stage checkpoints whenever nodes are rebuilt so reconcile does not skip required init/cni/join work on fresh hosts. Also pass explicit --node-name for control-plane and worker joins, and ensure kubelet is enabled before join commands run.	2026-03-04 00:26:37 +00:00
micqdf	f11aadf79c	Merge pull request 'fix: map SSH-discovered nodes by VMID when hostnames are generic' (#105) from stage into master [CI: Terraform Apply (push) failing after 27m43s] Reviewed-on: #105	2026-03-03 23:37:45 +00:00
MichaelFisher1997	b4265a649e	fix: map SSH-discovered nodes by VMID when hostnames are generic [CI: Terraform Plan (push) successful in 16s] Some freshly cloned VMs still report template/generic hostnames during discovery. Probe DMI product serial over SSH and map it to Terraform VMIDs so cp-2/cp-3/wk-2 can be resolved even before hostname reconciliation.	2026-03-03 22:16:35 +00:00
micqdf	09d2f56967	Merge pull request 'fix: make SSH inventory discovery more reliable on CI' (#104) from stage into master [CI: Terraform Apply (push) failing after 8m46s] Reviewed-on: #104	2026-03-03 21:45:57 +00:00
MichaelFisher1997	9ae8eb6134	fix: make SSH inventory discovery more reliable on CI [CI: Terraform Plan (push) successful in 16s] Increase default SSH timeout, reduce scan concurrency, and add a second slower scan pass to avoid transient misses on busy runners. Also print discovered hostnames to improve failure diagnostics when node-name matching fails.	2026-03-03 21:08:29 +00:00
micqdf	f2b9da8a59	Merge pull request 'fix: run Cilium install with sudo and explicit kubeconfig' (#103) from stage into master [CI: Terraform Apply (push) failing after 3m22s] Reviewed-on: #103	2026-03-03 08:56:49 +00:00
MichaelFisher1997	a66ae788f6	fix: run Cilium install with sudo and explicit kubeconfig [CI: Terraform Plan (push) successful in 17s] Use sudo for helm/kubectl on cp-1 and pass /etc/kubernetes/admin.conf so controller can install Cilium without permission errors.	2026-03-03 08:55:22 +00:00
micqdf	5fa96e27d7	Merge pull request 'fix: ensure kubelet is enabled for kubeadm init node registration' (#102) from stage into master [CI: Terraform Apply (push) failing after 10m43s] Reviewed-on: #102	2026-03-03 01:13:47 +00:00
MichaelFisher1997	cbb8358ce6	fix: ensure kubelet is enabled for kubeadm init node registration [CI: Terraform Plan (push) successful in 17s] Enable kubelet before kubeadm init and stop forcing kubelet out of wantedBy so kubeadm can reliably register the node during upload-config/kubelet. Also clear stale kubelet config files during remote prep to avoid restart-loop leftovers.	2026-03-03 01:04:50 +00:00
micqdf	31017b5c3e	Merge pull request 'fix: rebuild nodes by default on reconcile' (#101) from stage into master [CI: Terraform Apply (push) failing after 13m53s] Reviewed-on: #101	2026-03-03 00:46:26 +00:00
MichaelFisher1997	a16112a87a	fix: rebuild nodes by default on reconcile [CI: Terraform Plan (push) successful in 17s] Do not skip node rebuilds unless SKIP_REBUILD=1 is explicitly set. This prevents stale remote helper scripts from being reused across retries after bootstrap logic changes.	2026-03-03 00:34:55 +00:00
micqdf	f53d087c9c	Merge pull request 'fix: use valid kube-vip log flag value' (#100) from stage into master [CI: Terraform Apply (push) failing after 6m29s] Reviewed-on: #100	2026-03-03 00:26:08 +00:00
MichaelFisher1997	51b56e562e	fix: use valid kube-vip log flag value [CI: Terraform Plan (push) successful in 17s] kube-vip expects an unsigned integer for --log. Replace --log -4 with --log 4 so manifest generation no longer fails during bootstrap.	2026-03-03 00:25:25 +00:00
micqdf	0e0643a6fc	Merge pull request 'refactor: add Python bootstrap controller with resumable state' (#99) from stage into master [CI: Terraform Apply (push) failing after 11m46s] Reviewed-on: #99	2026-03-03 00:10:19 +00:00
MichaelFisher1997	6fecfb3ee6	refactor: add Python bootstrap controller with resumable state [CI: Terraform Plan (push) successful in 17s] Introduce a clean orchestration layer in nixos/kubeadm/bootstrap/controller.py and slim rebuild-and-bootstrap.sh into a thin wrapper. The controller now owns preflight, rebuild, init, CNI install, join, and verify stages with persisted checkpoints on cp-1 plus a local state copy for CI debugging.	2026-03-03 00:09:10 +00:00
micqdf	7a0016b003	Merge pull request 'fix: preserve kube-vip mount path and only swap hostPath to super-admin' (#98) from stage into master [CI: Terraform Apply (push) cancelled] Reviewed-on: #98	2026-03-03 00:00:48 +00:00
MichaelFisher1997	355273add5	fix: preserve kube-vip mount path and only swap hostPath to super-admin [CI: Terraform Plan (push) successful in 19s] The previous replacement changed both mountPath and hostPath, causing kube-vip to lose its expected in-container kubeconfig path and exit. Keep mountPath at /etc/kubernetes/admin.conf, swap only hostPath during bootstrap, and enable kube-vip debug log level.	2026-03-02 23:59:41 +00:00
micqdf	e5162c220c	Merge pull request 'fix: bootstrap kube-vip without leader election' (#97) from stage into master [CI: Terraform Apply (push) failing after 17m12s] Reviewed-on: #97	2026-03-02 23:31:52 +00:00
MichaelFisher1997	262e9eb4d7	fix: bootstrap kube-vip without leader election [CI: Terraform Plan (push) successful in 17s] Run first-control-plane kube-vip manifest without --leaderElection so VIP can bind before API/RBAC are fully available. Also print kube-vip container exit details on failure.	2026-03-02 23:28:44 +00:00
micqdf	84513f4bb8	Merge pull request 'fix: run kube-vip in control-plane-only mode during bootstrap' (#96) from stage into master [CI: Terraform Apply (push) failing after 16m50s] Reviewed-on: #96	2026-03-02 22:53:22 +00:00
MichaelFisher1997	c445638d4a	fix: run kube-vip in control-plane-only mode during bootstrap [CI: Terraform Plan (push) successful in 17s] Remove --services from kube-vip static pod manifests for init/join. Service LB mode can crash-loop during kubeadm bootstrap before cluster RBAC is ready, which prevented VIP binding.	2026-03-02 22:52:44 +00:00
micqdf	678b383063	Merge pull request 'stage' (#95) from stage into master [CI: Terraform Apply (push) failing after 17m14s] Reviewed-on: #95	2026-03-02 22:33:27 +00:00
MichaelFisher1997	880bbcceca	ci: speed up Terraform plan by skipping refresh in pipelines [CI: Terraform Plan (push) successful in 16s] Use terraform plan -refresh=false in plan/apply workflows to avoid slow Proxmox state refresh on every push. This keeps CI fast while preserving apply behavior from the generated plan.	2026-03-02 22:32:10 +00:00
MichaelFisher1997	190dc2e095	fix: restore compatibility with older nixos-rebuild sudo flag [CI: Terraform Plan (push) cancelled] Use --use-remote-sudo in rebuild script since the runner's nixos-rebuild does not support --sudo yet.	2026-03-02 22:30:38 +00:00
micqdf	d86b0a32a2	Merge pull request 'fix: stabilize kubeadm bootstrap and reduce Proxmox plan latency' (#94) from stage into master [CI: Terraform Apply (push) failing after 16m3s] Reviewed-on: #94	2026-03-02 22:13:28 +00:00
MichaelFisher1997	a81799a2b5	fix: stabilize kubeadm bootstrap and reduce Proxmox plan latency [CI: Terraform Plan (push) cancelled] Move kubeadm reset ahead of kube-vip manifest generation, use super-admin.conf during bootstrap for kube-vip, and restore admin.conf after init. Also switch nixos-rebuild to --sudo and make QEMU guest agent optional so Terraform plan can skip slow guest-agent refreshes when it is not installed.	2026-03-02 22:09:10 +00:00
micqdf	6c7182b8f5	Merge pull request 'fix: run kube-vip daemon before kubeadm init' (#93) from stage into master [CI: Terraform Apply (push) failing after 24m52s] Reviewed-on: #93	2026-03-02 21:02:11 +00:00
MichaelFisher1997	46c0786e57	fix: run kube-vip daemon before kubeadm init [CI: Terraform Plan (push) successful in 10m8s] Start kube-vip as a detached container to claim VIP before kubeadm init; wait for VIP to be bound before proceeding; generate static pod manifest for kube-vip; stop bootstrap kube-vip after API server is healthy (static pod takes over); add kube-vip logs output if VIP fails to bind.	2026-03-02 20:39:28 +00:00
micqdf	8b15f061bc	Merge pull request 'fix: skip kubeadm wait-control-plane phase, wait for VIP manually' (#92) from stage into master [CI: Terraform Apply (push) failing after 23m51s] Reviewed-on: #92	2026-03-02 19:42:56 +00:00
MichaelFisher1997	1af45ca51e	fix: skip kubeadm wait-control-plane phase, wait for VIP manually [CI: Terraform Plan (push) cancelled] Use --skip-phases=wait-control-plane to avoid 4-minute timeout; wait for kube-vip to bind VIP before checking API server health; add kube-vip logs and VIP status to debug output.	2026-03-02 19:37:06 +00:00
micqdf	c91d28a5dc	Merge pull request 'fix: add image pre-pull and debug output for kubeadm init' (#91) from stage into master [CI: Terraform Apply (push) failing after 26m27s] Reviewed-on: #91	2026-03-02 18:36:46 +00:00


331 Commits