24 Commits

Author SHA1 Message Date
5bfc135350 Merge pull request 'fix: ignore stale SSH host keys for ephemeral homelab VMs' (#130) from stage into master
Some checks failed
Terraform Apply / Terraform Apply (push) Failing after 19m24s
Reviewed-on: #130
2026-03-09 03:45:11 +00:00
63213a4bc3 fix: ignore stale SSH host keys for ephemeral homelab VMs
All checks were successful
Terraform Plan / Terraform Plan (push) Successful in 16s
Fresh destroy/recreate cycles change VM host keys, which broke bootstrap after rebuilds. Use a disposable known-hosts policy in the controller SSH options so automation does not fail on expected key rotation.
2026-03-09 03:16:18 +00:00
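A minimal bash sketch of the disposable known-hosts policy this commit adds to the controller's SSH options; the target host, user, and key path here are placeholders, not values from the repository.

```bash
# Sketch only: host, user, and key path are placeholders.
# Stale or rotated host keys are simply ignored, so rebuilt VMs never block automation.
ssh -o IdentitiesOnly=yes \
    -o StrictHostKeyChecking=no \
    -o UserKnownHostsFile=/dev/null \
    -i ~/.ssh/id_ed25519 \
    micqdf@10.27.27.50 true
```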
e4243c7667 Merge pull request 'fix: keep DHCP enabled by default on template VM' (#129) from stage into master
Some checks failed
Terraform Apply / Terraform Apply (push) Failing after 1h50m42s
Reviewed-on: #129
2026-03-08 22:03:17 +00:00
33bb0ffb17 fix: keep DHCP enabled by default on template VM
All checks were successful
Terraform Plan / Terraform Plan (push) Successful in 14s
The template machine can lose connectivity when rebuilt directly because it has no cloud-init network data during template maintenance. Restore DHCP as the default for the template itself while keeping cloud-init + networkd enabled so cloned VMs can still consume injected network settings.
2026-03-08 20:12:03 +00:00
7434a65590 Merge pull request 'stage' (#128) from stage into master
Some checks failed
Terraform Apply / Terraform Apply (push) Failing after 6m54s
Reviewed-on: #128
2026-03-08 18:06:46 +00:00
cd8e538c51 ci: switch checkout action source away from gitea.com mirror
All checks were successful
Terraform Plan / Terraform Plan (push) Successful in 16s
The gitea.com checkout action mirror is timing out during workflow startup. Use actions/checkout@v4 directly so jobs do not fail before any repository logic runs.
2026-03-08 13:36:21 +00:00
808c290c71 chore: clarify stale template cloud-init failure message
Some checks failed
Terraform Plan / Terraform Plan (push) Failing after 31s
Make SSH bootstrap failures explain the real root cause when fresh clones never accept the injected user/key: the Proxmox source template itself still needs the updated cloud-init-capable NixOS configuration.
2026-03-08 13:16:37 +00:00
15e6471e7e Merge pull request 'fix: enable cloud-init networking in NixOS template' (#127) from stage into master
Some checks failed
Terraform Apply / Terraform Apply (push) Failing after 7m10s
Reviewed-on: #127
2026-03-08 05:33:57 +00:00
79a4c941e5 fix: enable cloud-init networking in NixOS template
All checks were successful
Terraform Plan / Terraform Plan (push) Successful in 16s
Freshly recreated VMs were reachable but did not accept the injected SSH key, which indicated that the Proxmox cloud-init settings were not being applied. Enable cloud-init and cloud-init network handling in the base template so static IPs, hostname, ciuser, and SSH keys take effect on first boot.
2026-03-08 05:16:19 +00:00
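A hedged way to spot-check the behavior this commit targets on a freshly cloned VM; the host and user are placeholders, and it assumes the `cloud-init` CLI is present in the guest.

```bash
# Hypothetical post-clone check: confirm cloud-init ran and the injected
# hostname, static IP, and user actually took effect.
ssh micqdf@10.27.27.50 'cloud-init status --long; hostname; ip -4 addr show eth0'
```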
e9bac70cae Merge pull request 'fix: wait for SSH readiness after VM provisioning' (#126) from stage into master
Some checks failed
Terraform Apply / Terraform Apply (push) Failing after 6m56s
Reviewed-on: #126
2026-03-08 05:04:43 +00:00
4c167f618a fix: wait for SSH readiness after VM provisioning
All checks were successful
Terraform Plan / Terraform Plan (push) Successful in 17s
Freshly recreated VMs can take a few minutes before cloud-init users and SSH are available. Retry SSH authentication in the bootstrap controller before failing so rebuild/bootstrap does not abort immediately on new hosts.
2026-03-08 05:00:39 +00:00
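A bash sketch of the retry behavior this commit adds; the real implementation is the Python controller shown in the diff below, and the retry count and delay mirror its SSH_READY_RETRIES / SSH_READY_DELAY_SEC defaults. The node address is a placeholder.

```bash
# Keep probing SSH until the cloud-init user is available, instead of failing
# on the first attempt against a freshly recreated VM.
ip=10.27.27.50   # placeholder node address
for attempt in $(seq 1 20); do
  if ssh -o ConnectTimeout=5 -o BatchMode=yes "micqdf@${ip}" true 2>/dev/null; then
    echo "SSH ready on ${ip}"
    break
  fi
  echo "SSH not ready on ${ip} yet; retrying in 15s (${attempt}/20)"
  sleep 15
done
```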
97295a7071 Merge pull request 'ci: speed up Terraform destroy plan by skipping refresh' (#125) from stage into master
Some checks failed
Terraform Apply / Terraform Apply (push) Failing after 7m0s
Reviewed-on: #125
2026-03-08 04:47:02 +00:00
7bc861b3e8 ci: speed up Terraform destroy plan by skipping refresh
All checks were successful
Terraform Plan / Terraform Plan (push) Successful in 16s
Use terraform plan -refresh=false for destroy workflows so manual NUKE runs do not spend minutes refreshing Proxmox VM state before building the destroy plan.
2026-03-08 04:37:52 +00:00
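The destroy-plan command as it now appears in the workflow for the "all" target; applying the saved plan afterwards is assumed to be the unchanged follow-up step.

```bash
# -refresh=false skips the slow Proxmox state refresh before building the destroy plan.
terraform plan -refresh=false -parallelism=1 -destroy -out=tfdestroy
# Assumed follow-up: apply the saved destroy plan.
terraform apply tfdestroy
```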
6ca189b32c Merge pull request 'fix: vendor Flannel manifest and harden CNI bootstrap timing' (#124) from stage into master
All checks were successful
Terraform Apply / Terraform Apply (push) Successful in 15m11s
Reviewed-on: #124
2026-03-08 04:10:47 +00:00
b7b364a112 fix: vendor Flannel manifest and harden CNI bootstrap timing
All checks were successful
Terraform Plan / Terraform Plan (push) Successful in 17s
Stop depending on GitHub during cluster bring-up by shipping the Flannel manifest in-repo, ensure required host paths exist on NixOS nodes, and wait/retry against a stable API before applying the CNI. This removes the TLS handshake timeout failure mode and makes early network bootstrap deterministic.
2026-03-08 03:24:16 +00:00
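A condensed bash equivalent of the controller's new ordering for this change: wait for a responsive API server, then retry the vendored-manifest apply. The readiness probe shown (`get nodes`) stands in for the controller's cluster_ready() check, and unlike the controller this sketch does not fail hard if the API never comes up.

```bash
KC="--kubeconfig /etc/kubernetes/admin.conf"

# Wait for a stable API server before touching the CNI.
for _ in $(seq 1 30); do
  sudo kubectl $KC get nodes >/dev/null 2>&1 && break
  sleep 10
done

# Apply the in-repo Flannel manifest with retries, removing the dependency on
# GitHub during cluster bring-up.
for attempt in $(seq 1 5); do
  sudo kubectl $KC apply -f /var/lib/terrahome/kube-flannel.yml && break
  echo "Flannel apply attempt ${attempt}/5 failed; retrying in 15s"
  sleep 15
done
```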
2aa9950f59 Merge pull request 'fix: add mount utility to kubelet service PATH' (#123) from stage into master
Some checks failed
Terraform Apply / Terraform Apply (push) Failing after 11m10s
Reviewed-on: #123
2026-03-08 02:16:23 +00:00
bd866f7dac fix: add mount utility to kubelet service PATH
All checks were successful
Terraform Plan / Terraform Plan (push) Successful in 16s
Flannel pods were stuck because kubelet could not execute mount for projected service account volumes on NixOS. Add util-linux to the kubelet systemd PATH so mount is available during volume setup.
2026-03-07 14:18:20 +00:00
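A hypothetical verification on a NixOS node after this change; it only assumes a systemd-managed kubelet unit whose PATH environment now includes util-linux.

```bash
# The kubelet unit's PATH should now resolve `mount` for projected volume setup.
systemctl cat kubelet.service | grep -E 'Environment=.*PATH='
systemctl show kubelet.service --property=Environment | tr ' ' '\n' | grep -i util-linux
```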
c1f86483ad Merge pull request 'debug: print detailed Flannel pod diagnostics on rollout timeout' (#122) from stage into master
Some checks failed
Terraform Apply / Terraform Apply (push) Failing after 23m50s
Reviewed-on: #122
2026-03-07 12:31:43 +00:00
0cce4bcf72 Merge branch 'master' into stage
All checks were successful
Terraform Plan / Terraform Plan (push) Successful in 16s
2026-03-07 12:22:01 +00:00
065567210e debug: print detailed Flannel pod diagnostics on rollout timeout
All checks were successful
Terraform Plan / Terraform Plan (push) Successful in 18s
When kube-flannel daemonset rollout stalls, print pod descriptions and per-container logs for the init containers and main flannel container so the next failure shows the actual cause instead of only Init:0/2.
2026-03-07 12:19:21 +00:00
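A one-loop condensation of the diagnostics this commit makes the controller print on rollout timeout; it is a sketch of the same kubectl calls shown in the diff below, not a separate tool.

```bash
KC="--kubeconfig /etc/kubernetes/admin.conf -n kube-flannel"
sudo kubectl $KC get ds -o wide || true
sudo kubectl $KC get pods -o wide || true
for p in $(sudo kubectl $KC get pods -o name 2>/dev/null); do
  echo "--- describe $p ---"
  sudo kubectl $KC describe "$p" || true
  # Per-container logs: init containers first, then the main flannel container.
  for c in install-cni-plugin install-cni kube-flannel; do
    echo "--- logs $p $c ---"
    sudo kubectl $KC logs "$p" -c "$c" --tail=120 || true
  done
done
```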
c5f0b1ac37 Merge pull request 'stage' (#121) from stage into master
Some checks failed
Terraform Apply / Terraform Apply (push) Failing after 30m28s
Reviewed-on: #121
2026-03-07 01:01:38 +00:00
e740d47011 Merge branch 'master' into stage
All checks were successful
Terraform Plan / Terraform Plan (push) Successful in 16s
2026-03-07 00:57:47 +00:00
d9d3976c4c fix: use self-contained Terraform variable validations
All checks were successful
Terraform Plan / Terraform Plan (push) Successful in 17s
Terraform variable validation blocks can only reference the variable under validation. Replace count-based checks with fixed-length validations for the current 3 control planes and 3 workers.
2026-03-07 00:54:51 +00:00
a0b07816b9 refactor: simplify homelab bootstrap around static IPs and fresh runs
Some checks failed
Terraform Plan / Terraform Plan (push) Failing after 10s
Make Terraform the source of truth for node IPs, remove guest-agent/SSH discovery from the normal workflow path, simplify the bootstrap controller to a fresh-run flow, and swap the initial CNI to Flannel so cluster readiness is easier to prove before reintroducing more complex reconcile behavior.
2026-03-07 00:52:35 +00:00
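With static IPs as the source of truth, the inventory step collapses to a single pass over Terraform outputs; this is the pipeline the workflows below now use, with no guest-agent or SSH discovery fallback.

```bash
# Render the kubeadm inventory directly from static Terraform outputs.
terraform -chdir=terraform output -json \
  | ./nixos/kubeadm/scripts/render-inventory-from-tf-output.py \
  > nixos/kubeadm/scripts/inventory.env
```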
14 changed files with 372 additions and 199 deletions

View File

@@ -27,7 +27,7 @@ jobs:
fi
- name: Checkout repository
uses: https://gitea.com/actions/checkout@v4
uses: actions/checkout@v4
- name: Create SSH key
run: |
@@ -103,25 +103,9 @@ jobs:
- name: Create kubeadm inventory
env:
KUBEADM_SSH_USER: ${{ secrets.KUBEADM_SSH_USER }}
KUBEADM_SUBNET_PREFIX: ${{ secrets.KUBEADM_SUBNET_PREFIX }}
run: |
set -euo pipefail
TF_OUTPUT_JSON=""
for attempt in 1 2 3 4 5 6; do
echo "Inventory render attempt $attempt/6"
TF_OUTPUT_JSON="$(terraform -chdir=terraform output -json)"
if printf '%s' "$TF_OUTPUT_JSON" | ./nixos/kubeadm/scripts/render-inventory-from-tf-output.py > nixos/kubeadm/scripts/inventory.env; then
exit 0
fi
if [ "$attempt" -lt 6 ]; then
echo "VM IPv4s not available yet; waiting 30s before retry"
sleep 30
fi
done
echo "Falling back to SSH-based inventory discovery"
printf '%s' "$TF_OUTPUT_JSON" | ./nixos/kubeadm/scripts/discover-inventory-from-ssh.py > nixos/kubeadm/scripts/inventory.env
terraform -chdir=terraform output -json | ./nixos/kubeadm/scripts/render-inventory-from-tf-output.py > nixos/kubeadm/scripts/inventory.env
- name: Validate nix installation
run: |

View File

@@ -27,7 +27,7 @@ jobs:
fi
- name: Checkout repository
uses: https://gitea.com/actions/checkout@v4
uses: actions/checkout@v4
- name: Create SSH key
run: |
@@ -103,25 +103,9 @@ jobs:
- name: Create kubeadm inventory
env:
KUBEADM_SSH_USER: ${{ secrets.KUBEADM_SSH_USER }}
KUBEADM_SUBNET_PREFIX: ${{ secrets.KUBEADM_SUBNET_PREFIX }}
run: |
set -euo pipefail
TF_OUTPUT_JSON=""
for attempt in 1 2 3 4 5 6; do
echo "Inventory render attempt $attempt/6"
TF_OUTPUT_JSON="$(terraform -chdir=terraform output -json)"
if printf '%s' "$TF_OUTPUT_JSON" | ./nixos/kubeadm/scripts/render-inventory-from-tf-output.py > nixos/kubeadm/scripts/inventory.env; then
exit 0
fi
if [ "$attempt" -lt 6 ]; then
echo "VM IPv4s not available yet; waiting 30s before retry"
sleep 30
fi
done
echo "Falling back to SSH-based inventory discovery"
printf '%s' "$TF_OUTPUT_JSON" | ./nixos/kubeadm/scripts/discover-inventory-from-ssh.py > nixos/kubeadm/scripts/inventory.env
terraform -chdir=terraform output -json | ./nixos/kubeadm/scripts/render-inventory-from-tf-output.py > nixos/kubeadm/scripts/inventory.env
- name: Run cluster reset
run: |

View File

@@ -16,7 +16,7 @@ jobs:
steps:
- name: Checkout repository
uses: https://gitea.com/actions/checkout@v4
uses: actions/checkout@v4
- name: Create secrets.tfvars
working-directory: terraform
@@ -151,25 +151,9 @@ jobs:
- name: Create kubeadm inventory from Terraform outputs
env:
KUBEADM_SSH_USER: ${{ secrets.KUBEADM_SSH_USER }}
KUBEADM_SUBNET_PREFIX: ${{ secrets.KUBEADM_SUBNET_PREFIX }}
run: |
set -euo pipefail
TF_OUTPUT_JSON=""
for attempt in 1 2 3 4 5 6; do
echo "Inventory render attempt $attempt/6"
TF_OUTPUT_JSON="$(terraform -chdir=terraform output -json)"
if printf '%s' "$TF_OUTPUT_JSON" | ./nixos/kubeadm/scripts/render-inventory-from-tf-output.py > nixos/kubeadm/scripts/inventory.env; then
exit 0
fi
if [ "$attempt" -lt 6 ]; then
echo "VM IPv4s not available yet; waiting 30s before retry"
sleep 30
fi
done
echo "Falling back to SSH-based inventory discovery"
printf '%s' "$TF_OUTPUT_JSON" | ./nixos/kubeadm/scripts/discover-inventory-from-ssh.py > nixos/kubeadm/scripts/inventory.env
terraform -chdir=terraform output -json | ./nixos/kubeadm/scripts/render-inventory-from-tf-output.py > nixos/kubeadm/scripts/inventory.env
- name: Ensure nix and nixos-rebuild
env:

View File

@@ -36,7 +36,7 @@ jobs:
fi
- name: Checkout repository
uses: https://gitea.com/actions/checkout@v4
uses: actions/checkout@v4
- name: Create Terraform secret files
working-directory: terraform
@@ -77,13 +77,13 @@ jobs:
set -euo pipefail
case "${{ inputs.target }}" in
all)
TF_PLAN_CMD="terraform plan -parallelism=1 -destroy -out=tfdestroy"
TF_PLAN_CMD="terraform plan -refresh=false -parallelism=1 -destroy -out=tfdestroy"
;;
control-planes)
TF_PLAN_CMD="terraform plan -parallelism=1 -destroy -target=proxmox_vm_qemu.control_planes -out=tfdestroy"
TF_PLAN_CMD="terraform plan -refresh=false -parallelism=1 -destroy -target=proxmox_vm_qemu.control_planes -out=tfdestroy"
;;
workers)
TF_PLAN_CMD="terraform plan -parallelism=1 -destroy -target=proxmox_vm_qemu.workers -out=tfdestroy"
TF_PLAN_CMD="terraform plan -refresh=false -parallelism=1 -destroy -target=proxmox_vm_qemu.workers -out=tfdestroy"
;;
*)
echo "Invalid destroy target: ${{ inputs.target }}"

View File

@@ -17,7 +17,7 @@ jobs:
steps:
- name: Checkout repository
uses: https://gitea.com/actions/checkout@v4
uses: actions/checkout@v4
- name: Create secrets.tfvars
working-directory: terraform

View File

@@ -50,7 +50,7 @@ sudo nixos-rebuild switch --flake .#cp-1
For remote target-host workflows, use your preferred deploy wrapper later
(`nixos-rebuild --target-host ...` or deploy-rs/colmena).
## Bootstrap runbook (kubeadm + kube-vip + Cilium)
## Bootstrap runbook (kubeadm + kube-vip + Flannel)
1. Apply Nix config on all nodes (`cp-*`, then `wk-*`).
2. On `cp-1`, run:
@@ -62,14 +62,10 @@ sudo th-kubeadm-init
This infers the control-plane VIP as `<node-subnet>.250` on `eth0`, creates the
kube-vip static pod manifest, and runs `kubeadm init`.
3. Install Cilium from `cp-1`:
3. Install Flannel from `cp-1`:
```bash
helm repo add cilium https://helm.cilium.io
helm repo update
helm upgrade --install cilium cilium/cilium \
--namespace kube-system \
--set kubeProxyReplacement=true
kubectl apply -f https://raw.githubusercontent.com/flannel-io/flannel/v0.25.5/Documentation/kube-flannel.yml
```
4. Generate join commands on `cp-1`:
@@ -98,7 +94,7 @@ kubectl get nodes -o wide
kubectl -n kube-system get pods -o wide
```
## Repeatable rebuild flow (recommended)
## Fresh bootstrap flow (recommended)
1. Copy and edit inventory:
@@ -107,7 +103,7 @@ cp ./scripts/inventory.example.env ./scripts/inventory.env
$EDITOR ./scripts/inventory.env
```
2. Rebuild all nodes and bootstrap/reconcile cluster:
2. Rebuild all nodes and bootstrap a fresh cluster:
```bash
./scripts/rebuild-and-bootstrap.sh
@@ -141,15 +137,15 @@ For a full nuke/recreate lifecycle:
- run Terraform destroy/apply for VMs first,
- then run `./scripts/rebuild-and-bootstrap.sh` again.
Node lists are discovered from Terraform outputs, so adding new workers/control
planes in Terraform is picked up automatically by the bootstrap/reconcile flow.
Node lists now come directly from static Terraform outputs, so bootstrap no longer
depends on Proxmox guest-agent IP discovery or SSH subnet scanning.
## Optional Gitea workflow automation
Primary flow:
- Push to `master` triggers `.gitea/workflows/terraform-apply.yml`
- That workflow now does Terraform apply and then runs kubeadm rebuild/bootstrap reconciliation automatically
- That workflow now does Terraform apply and then runs a fresh kubeadm bootstrap automatically
Manual dispatch workflows are available:
@@ -164,9 +160,7 @@ Required repository secrets:
Optional secrets:
- `KUBEADM_SSH_USER` (defaults to `micqdf`)
- `KUBEADM_SUBNET_PREFIX` (optional, e.g. `10.27.27`; used for SSH-based IP discovery fallback)
Node IPs are auto-discovered from Terraform state outputs (`control_plane_vm_ipv4`, `worker_vm_ipv4`), so you do not need per-node IP secrets.
Node IPs are rendered directly from static Terraform outputs (`control_plane_vm_ipv4`, `worker_vm_ipv4`), so you do not need per-node IP secrets or SSH discovery fallbacks.
## Notes

View File

@@ -11,9 +11,6 @@ from concurrent.futures import ThreadPoolExecutor, as_completed
from pathlib import Path
REMOTE_STATE_PATH = "/var/lib/terrahome/bootstrap-state.json"
def run_local(cmd, check=True, capture=False):
if isinstance(cmd, str):
shell = True
@@ -102,7 +99,6 @@ class Controller:
self.script_dir = Path(__file__).resolve().parent
self.flake_dir = Path(self.env.get("FLAKE_DIR") or (self.script_dir.parent)).resolve()
self.local_state_path = self.script_dir / "bootstrap-state-last.json"
self.ssh_user = self.env.get("SSH_USER", "micqdf")
self.ssh_candidates = self.env.get("SSH_USER_CANDIDATES", f"root {self.ssh_user}").split()
@@ -114,7 +110,9 @@ class Controller:
"-o",
"IdentitiesOnly=yes",
"-o",
"StrictHostKeyChecking=accept-new",
"StrictHostKeyChecking=no",
"-o",
"UserKnownHostsFile=/dev/null",
"-i",
self.ssh_key,
]
@@ -124,8 +122,9 @@ class Controller:
self.worker_parallelism = int(self.env.get("WORKER_PARALLELISM", "3"))
self.fast_mode = self.env.get("FAST_MODE", "1")
self.skip_rebuild = self.env.get("SKIP_REBUILD", "0") == "1"
self.force_reinit = False
self.cilium_kpr = self.env.get("CILIUM_KUBE_PROXY_REPLACEMENT", "false")
self.force_reinit = True
self.ssh_ready_retries = int(self.env.get("SSH_READY_RETRIES", "20"))
self.ssh_ready_delay = int(self.env.get("SSH_READY_DELAY_SEC", "15"))
def log(self, msg):
print(f"==> {msg}")
@@ -135,13 +134,26 @@ class Controller:
return run_local(full, check=check, capture=True)
def detect_user(self, ip):
for user in self.ssh_candidates:
proc = self._ssh(user, ip, "true", check=False)
if proc.returncode == 0:
self.active_ssh_user = user
self.log(f"Using SSH user '{user}' for {ip}")
return
raise RuntimeError(f"Unable to authenticate to {ip} with users: {', '.join(self.ssh_candidates)}")
for attempt in range(1, self.ssh_ready_retries + 1):
for user in self.ssh_candidates:
proc = self._ssh(user, ip, "true", check=False)
if proc.returncode == 0:
self.active_ssh_user = user
self.log(f"Using SSH user '{user}' for {ip}")
return
if attempt < self.ssh_ready_retries:
self.log(
f"SSH not ready on {ip} yet; retrying in {self.ssh_ready_delay}s "
f"({attempt}/{self.ssh_ready_retries})"
)
time.sleep(self.ssh_ready_delay)
raise RuntimeError(
"Unable to authenticate to "
f"{ip} with users: {', '.join(self.ssh_candidates)}. "
"If this is a freshly cloned VM, the Proxmox source template likely does not yet include the "
"current cloud-init-capable NixOS template configuration from nixos/template-base. "
"Terraform can only clone what exists in Proxmox; it cannot retrofit cloud-init support into an old template."
)
def remote(self, ip, cmd, check=True):
ordered = [self.active_ssh_user] + [u for u in self.ssh_candidates if u != self.active_ssh_user]
@@ -162,53 +174,7 @@ class Controller:
return last
def prepare_known_hosts(self):
ssh_dir = Path.home() / ".ssh"
ssh_dir.mkdir(parents=True, exist_ok=True)
(ssh_dir / "known_hosts").touch()
run_local(["chmod", "700", str(ssh_dir)])
run_local(["chmod", "600", str(ssh_dir / "known_hosts")])
for ip in self.node_ips.values():
run_local(["ssh-keygen", "-R", ip], check=False)
run_local(f"ssh-keyscan -H {shlex.quote(ip)} >> {shlex.quote(str(ssh_dir / 'known_hosts'))}", check=False)
def get_state(self):
proc = self.remote(
self.primary_ip,
"sudo test -f /var/lib/terrahome/bootstrap-state.json && sudo cat /var/lib/terrahome/bootstrap-state.json || echo '{}'",
)
try:
state = json.loads(proc.stdout.strip() or "{}")
except Exception:
state = {}
return state
def set_state(self, state):
payload = json.dumps(state, sort_keys=True)
b64 = base64.b64encode(payload.encode()).decode()
self.remote(
self.primary_ip,
(
"sudo mkdir -p /var/lib/terrahome && "
f"echo {shlex.quote(b64)} | base64 -d | sudo tee {REMOTE_STATE_PATH} >/dev/null"
),
)
self.local_state_path.write_text(payload + "\n", encoding="utf-8")
def mark_done(self, key):
state = self.get_state()
state[key] = True
state["updated_at"] = int(time.time())
self.set_state(state)
def clear_done(self, keys):
state = self.get_state()
for key in keys:
state.pop(key, None)
state["updated_at"] = int(time.time())
self.set_state(state)
def stage_done(self, key):
return bool(self.get_state().get(key))
pass
def prepare_remote_nix(self, ip):
self.remote(ip, "sudo mkdir -p /etc/nix")
@@ -258,15 +224,11 @@ class Controller:
raise RuntimeError(f"Rebuild failed permanently for {name}")
def stage_preflight(self):
if self.stage_done("preflight_done"):
self.log("Preflight already complete")
return
self.prepare_known_hosts()
self.detect_user(self.primary_ip)
self.mark_done("preflight_done")
def stage_rebuild(self):
if self.skip_rebuild and self.stage_done("nodes_rebuilt"):
if self.skip_rebuild:
self.log("Node rebuild already complete")
return
@@ -300,17 +262,6 @@ class Controller:
if failures:
raise RuntimeError(f"Worker rebuild failures: {failures}")
# Rebuild can invalidate prior bootstrap stages; force reconciliation.
self.force_reinit = True
self.clear_done([
"primary_initialized",
"cni_installed",
"control_planes_joined",
"workers_joined",
"verified",
])
self.mark_done("nodes_rebuilt")
def has_admin_conf(self):
return self.remote(self.primary_ip, "sudo test -f /etc/kubernetes/admin.conf", check=False).returncode == 0
@@ -319,44 +270,52 @@ class Controller:
return self.remote(self.primary_ip, cmd, check=False).returncode == 0
def stage_init_primary(self):
if (not self.force_reinit) and self.stage_done("primary_initialized") and self.has_admin_conf() and self.cluster_ready():
self.log("Primary control plane init already complete")
return
if (not self.force_reinit) and self.has_admin_conf() and self.cluster_ready():
self.log("Existing cluster detected on primary control plane")
else:
self.log(f"Initializing primary control plane on {self.primary_cp}")
self.remote(self.primary_ip, "sudo th-kubeadm-init")
self.mark_done("primary_initialized")
self.log(f"Initializing primary control plane on {self.primary_cp}")
self.remote(self.primary_ip, "sudo th-kubeadm-init")
def stage_install_cni(self):
if self.stage_done("cni_installed") and self.cluster_ready():
self.log("CNI install already complete")
return
self.log("Installing or upgrading Cilium")
self.remote(self.primary_ip, "sudo helm repo add cilium https://helm.cilium.io >/dev/null 2>&1 || true")
self.remote(self.primary_ip, "sudo helm repo update >/dev/null")
self.remote(self.primary_ip, "sudo kubectl --kubeconfig /etc/kubernetes/admin.conf create namespace kube-system >/dev/null 2>&1 || true")
self.log("Installing Flannel")
manifest_path = self.script_dir.parent / "manifests" / "kube-flannel.yml"
manifest_b64 = base64.b64encode(manifest_path.read_bytes()).decode()
self.remote(
self.primary_ip,
(
"sudo KUBECONFIG=/etc/kubernetes/admin.conf "
"helm upgrade --install cilium cilium/cilium "
"--namespace kube-system "
f"--set k8sServiceHost={shlex.quote(self.primary_ip)} "
"--set k8sServicePort=6443 "
f"--set kubeProxyReplacement={shlex.quote(self.cilium_kpr)}"
"sudo mkdir -p /var/lib/terrahome && "
f"echo {shlex.quote(manifest_b64)} | base64 -d | sudo tee /var/lib/terrahome/kube-flannel.yml >/dev/null"
),
)
self.mark_done("cni_installed")
self.log("Waiting for API readiness before applying Flannel")
ready = False
for _ in range(30):
if self.cluster_ready():
ready = True
break
time.sleep(10)
if not ready:
raise RuntimeError("API server did not become ready before Flannel install")
last_error = None
for attempt in range(1, 6):
proc = self.remote(
self.primary_ip,
"sudo kubectl --kubeconfig /etc/kubernetes/admin.conf apply -f /var/lib/terrahome/kube-flannel.yml",
check=False,
)
if proc.returncode == 0:
return
last_error = (proc.stdout or "") + ("\n" if proc.stdout and proc.stderr else "") + (proc.stderr or "")
self.log(f"Flannel apply attempt {attempt}/5 failed; retrying in 15s")
time.sleep(15)
raise RuntimeError(f"Flannel apply failed after retries\n{last_error or ''}")
def cluster_has_node(self, name):
cmd = f"sudo kubectl --kubeconfig /etc/kubernetes/admin.conf get node {shlex.quote(name)} >/dev/null 2>&1"
return self.remote(self.primary_ip, cmd, check=False).returncode == 0
def build_join_cmds(self):
if not self.has_admin_conf():
self.remote(self.primary_ip, "sudo th-kubeadm-init")
join_cmd = self.remote(
self.primary_ip,
"sudo KUBECONFIG=/etc/kubernetes/admin.conf kubeadm token create --print-join-command",
@@ -369,9 +328,6 @@ class Controller:
return join_cmd, cp_join
def stage_join_control_planes(self):
if self.stage_done("control_planes_joined"):
self.log("Control-plane join already complete")
return
_, cp_join = self.build_join_cmds()
for node in self.cp_names:
if node == self.primary_cp:
@@ -383,12 +339,8 @@ class Controller:
ip = self.node_ips[node]
node_join = f"{cp_join} --node-name {node} --ignore-preflight-errors=NumCPU,HTTPProxyCIDR"
self.remote(ip, f"sudo th-kubeadm-join-control-plane {shlex.quote(node_join)}")
self.mark_done("control_planes_joined")
def stage_join_workers(self):
if self.stage_done("workers_joined"):
self.log("Worker join already complete")
return
join_cmd, _ = self.build_join_cmds()
for node in self.wk_names:
if self.cluster_has_node(node):
@@ -398,35 +350,43 @@ class Controller:
ip = self.node_ips[node]
node_join = f"{join_cmd} --node-name {node} --ignore-preflight-errors=HTTPProxyCIDR"
self.remote(ip, f"sudo th-kubeadm-join-worker {shlex.quote(node_join)}")
self.mark_done("workers_joined")
def stage_verify(self):
if self.stage_done("verified"):
self.log("Verification already complete")
return
self.log("Final node verification")
try:
self.remote(
self.primary_ip,
"sudo kubectl --kubeconfig /etc/kubernetes/admin.conf -n kube-system rollout status ds/cilium --timeout=10m",
"sudo kubectl --kubeconfig /etc/kubernetes/admin.conf -n kube-flannel rollout status ds/kube-flannel-ds --timeout=10m",
)
except Exception:
self.log("Cilium rollout failed; collecting diagnostics")
self.log("Flannel rollout failed; collecting diagnostics")
proc = self.remote(
self.primary_ip,
"sudo kubectl --kubeconfig /etc/kubernetes/admin.conf -n kube-system get ds cilium -o wide || true",
"sudo kubectl --kubeconfig /etc/kubernetes/admin.conf -n kube-flannel get ds -o wide || true",
check=False,
)
print(proc.stdout)
proc = self.remote(
self.primary_ip,
"sudo kubectl --kubeconfig /etc/kubernetes/admin.conf -n kube-system get pods -l k8s-app=cilium -o wide || true",
"sudo kubectl --kubeconfig /etc/kubernetes/admin.conf -n kube-flannel get pods -o wide || true",
check=False,
)
print(proc.stdout)
proc = self.remote(
self.primary_ip,
"for p in $(sudo kubectl --kubeconfig /etc/kubernetes/admin.conf -n kube-system get pods -l k8s-app=cilium -o name 2>/dev/null); do sudo kubectl --kubeconfig /etc/kubernetes/admin.conf -n kube-system logs --tail=120 $p || true; done",
"for p in $(sudo kubectl --kubeconfig /etc/kubernetes/admin.conf -n kube-flannel get pods -o name 2>/dev/null); do echo \"--- describe $p ---\"; sudo kubectl --kubeconfig /etc/kubernetes/admin.conf -n kube-flannel describe $p || true; done",
check=False,
)
print(proc.stdout)
proc = self.remote(
self.primary_ip,
"for p in $(sudo kubectl --kubeconfig /etc/kubernetes/admin.conf -n kube-flannel get pods -o name 2>/dev/null); do echo \"--- logs $p kube-flannel ---\"; sudo kubectl --kubeconfig /etc/kubernetes/admin.conf -n kube-flannel logs $p -c kube-flannel --tail=120 || true; echo \"--- logs $p install-cni-plugin ---\"; sudo kubectl --kubeconfig /etc/kubernetes/admin.conf -n kube-flannel logs $p -c install-cni-plugin --tail=120 || true; echo \"--- logs $p install-cni ---\"; sudo kubectl --kubeconfig /etc/kubernetes/admin.conf -n kube-flannel logs $p -c install-cni --tail=120 || true; done",
check=False,
)
print(proc.stdout)
proc = self.remote(
self.primary_ip,
"for p in $(sudo kubectl --kubeconfig /etc/kubernetes/admin.conf -n kube-flannel get pods -o name 2>/dev/null); do sudo kubectl --kubeconfig /etc/kubernetes/admin.conf -n kube-flannel logs --tail=120 $p || true; done",
check=False,
)
print(proc.stdout)
@@ -437,7 +397,6 @@ class Controller:
)
proc = self.remote(self.primary_ip, "sudo kubectl --kubeconfig /etc/kubernetes/admin.conf get nodes -o wide")
print(proc.stdout)
self.mark_done("verified")
def reconcile(self):
self.stage_preflight()

View File

@@ -0,0 +1,212 @@
---
kind: Namespace
apiVersion: v1
metadata:
name: kube-flannel
labels:
k8s-app: flannel
pod-security.kubernetes.io/enforce: privileged
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
labels:
k8s-app: flannel
name: flannel
rules:
- apiGroups:
- ""
resources:
- pods
verbs:
- get
- apiGroups:
- ""
resources:
- nodes
verbs:
- get
- list
- watch
- apiGroups:
- ""
resources:
- nodes/status
verbs:
- patch
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
labels:
k8s-app: flannel
name: flannel
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: flannel
subjects:
- kind: ServiceAccount
name: flannel
namespace: kube-flannel
---
apiVersion: v1
kind: ServiceAccount
metadata:
labels:
k8s-app: flannel
name: flannel
namespace: kube-flannel
---
kind: ConfigMap
apiVersion: v1
metadata:
name: kube-flannel-cfg
namespace: kube-flannel
labels:
tier: node
k8s-app: flannel
app: flannel
data:
cni-conf.json: |
{
"name": "cbr0",
"cniVersion": "0.3.1",
"plugins": [
{
"type": "flannel",
"delegate": {
"hairpinMode": true,
"isDefaultGateway": true
}
},
{
"type": "portmap",
"capabilities": {
"portMappings": true
}
}
]
}
net-conf.json: |
{
"Network": "10.244.0.0/16",
"EnableNFTables": false,
"Backend": {
"Type": "vxlan"
}
}
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: kube-flannel-ds
namespace: kube-flannel
labels:
tier: node
app: flannel
k8s-app: flannel
spec:
selector:
matchLabels:
app: flannel
template:
metadata:
labels:
tier: node
app: flannel
spec:
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: kubernetes.io/os
operator: In
values:
- linux
hostNetwork: true
priorityClassName: system-node-critical
tolerations:
- operator: Exists
effect: NoSchedule
serviceAccountName: flannel
initContainers:
- name: install-cni-plugin
image: docker.io/flannel/flannel-cni-plugin:v1.5.1-flannel1
command:
- cp
args:
- -f
- /flannel
- /opt/cni/bin/flannel
volumeMounts:
- name: cni-plugin
mountPath: /opt/cni/bin
- name: install-cni
image: docker.io/flannel/flannel:v0.25.5
command:
- cp
args:
- -f
- /etc/kube-flannel/cni-conf.json
- /etc/cni/net.d/10-flannel.conflist
volumeMounts:
- name: cni
mountPath: /etc/cni/net.d
- name: flannel-cfg
mountPath: /etc/kube-flannel/
containers:
- name: kube-flannel
image: docker.io/flannel/flannel:v0.25.5
command:
- /opt/bin/flanneld
args:
- --ip-masq
- --kube-subnet-mgr
resources:
requests:
cpu: "100m"
memory: "50Mi"
securityContext:
privileged: false
capabilities:
add: ["NET_ADMIN", "NET_RAW"]
env:
- name: POD_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: POD_NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
- name: EVENT_QUEUE_DEPTH
value: "5000"
volumeMounts:
- name: run
mountPath: /run/flannel
- name: flannel-cfg
mountPath: /etc/kube-flannel/
- name: xtables-lock
mountPath: /run/xtables.lock
volumes:
- name: run
hostPath:
path: /run/flannel
type: DirectoryOrCreate
- name: cni-plugin
hostPath:
path: /opt/cni/bin
type: DirectoryOrCreate
- name: cni
hostPath:
path: /etc/cni/net.d
type: DirectoryOrCreate
- name: flannel-cfg
configMap:
name: kube-flannel-cfg
- name: xtables-lock
hostPath:
path: /run/xtables.lock
type: FileOrCreate

View File

@@ -384,6 +384,7 @@ in
systemd.services.kubelet = {
description = "Kubernetes Kubelet";
wantedBy = [ "multi-user.target" ];
path = [ pkgs.util-linux ];
wants = [ "network-online.target" ];
after = [ "containerd.service" "network-online.target" ];
serviceConfig = {
@@ -409,6 +410,9 @@ in
systemd.tmpfiles.rules = [
"d /etc/kubernetes 0755 root root -"
"d /etc/kubernetes/manifests 0755 root root -"
"d /etc/cni/net.d 0755 root root -"
"d /opt/cni/bin 0755 root root -"
"d /run/flannel 0755 root root -"
"d /var/lib/kubelet 0755 root root -"
"d /var/lib/kubelet/pki 0755 root root -"
];

View File

@@ -11,6 +11,7 @@ in
networking.hostName = "k8s-base-template";
networking.useDHCP = lib.mkDefault true;
networking.useNetworkd = true;
networking.nameservers = [ "1.1.1.1" "8.8.8.8" ];
boot.loader.systemd-boot.enable = lib.mkForce false;
@@ -20,6 +21,8 @@ in
};
services.qemuGuest.enable = true;
services.cloud-init.enable = true;
services.cloud-init.network.enable = true;
services.openssh.enable = true;
services.openssh.settings = {
PasswordAuthentication = false;

View File

@@ -9,6 +9,15 @@ terraform {
}
}
locals {
control_plane_ipconfig = [
for ip in var.control_plane_ips : "ip=${ip}/${var.network_prefix_length},gw=${var.network_gateway}"
]
worker_ipconfig = [
for ip in var.worker_ips : "ip=${ip}/${var.network_prefix_length},gw=${var.network_gateway}"
]
}
provider "proxmox" {
pm_api_url = var.pm_api_url
pm_api_token_id = var.pm_api_token_id
@@ -35,7 +44,7 @@ resource "proxmox_vm_qemu" "control_planes" {
scsihw = "virtio-scsi-pci"
boot = "order=scsi0"
bootdisk = "scsi0"
ipconfig0 = "ip=dhcp"
ipconfig0 = local.control_plane_ipconfig[count.index]
ciuser = "micqdf"
sshkeys = var.SSH_KEY_PUBLIC
@@ -90,7 +99,7 @@ resource "proxmox_vm_qemu" "workers" {
scsihw = "virtio-scsi-pci"
boot = "order=scsi0"
bootdisk = "scsi0"
ipconfig0 = "ip=dhcp"
ipconfig0 = local.worker_ipconfig[count.index]
ciuser = "micqdf"
sshkeys = var.SSH_KEY_PUBLIC

View File

@@ -11,8 +11,8 @@ output "control_plane_vm_names" {
output "control_plane_vm_ipv4" {
value = {
for vm in proxmox_vm_qemu.control_planes :
vm.name => vm.default_ipv4_address
for i in range(var.control_plane_count) :
proxmox_vm_qemu.control_planes[i].name => var.control_plane_ips[i]
}
}
@@ -29,7 +29,7 @@ output "worker_vm_names" {
output "worker_vm_ipv4" {
value = {
for vm in proxmox_vm_qemu.workers :
vm.name => vm.default_ipv4_address
for i in range(var.worker_count) :
proxmox_vm_qemu.workers[i].name => var.worker_ips[i]
}
}

View File

@@ -17,3 +17,9 @@ control_plane_disk_size = "80G"
worker_cores = [4, 4, 4]
worker_memory_mb = [12288, 12288, 12288]
worker_disk_size = "120G"
network_prefix_length = 10
network_gateway = "10.27.27.1"
control_plane_ips = ["10.27.27.50", "10.27.27.51", "10.27.27.49"]
worker_ips = ["10.27.27.47", "10.27.27.46", "10.27.27.48"]

View File

@@ -87,6 +87,40 @@ variable "worker_disk_size" {
description = "Disk size for worker VMs"
}
variable "network_prefix_length" {
type = number
default = 10
description = "CIDR prefix length for static VM addresses"
}
variable "network_gateway" {
type = string
default = "10.27.27.1"
description = "Gateway for static VM addresses"
}
variable "control_plane_ips" {
type = list(string)
default = ["10.27.27.50", "10.27.27.51", "10.27.27.49"]
description = "Static IPv4 addresses for control plane VMs"
validation {
condition = length(var.control_plane_ips) == 3
error_message = "control_plane_ips must contain exactly 3 IPs."
}
}
variable "worker_ips" {
type = list(string)
default = ["10.27.27.47", "10.27.27.46", "10.27.27.48"]
description = "Static IPv4 addresses for worker VMs"
validation {
condition = length(var.worker_ips) == 3
error_message = "worker_ips must contain exactly 3 IPs."
}
}
variable "bridge" {
type = string
}