refactor: simplify homelab bootstrap around static IPs and fresh runs
Some checks failed
Terraform Plan / Terraform Plan (push) Failing after 10s
Make Terraform the source of truth for node IPs, remove guest-agent/SSH discovery from the normal workflow path, simplify the bootstrap controller to a fresh-run flow, and swap the initial CNI to Flannel so cluster readiness is easier to prove before reintroducing more complex reconcile behavior.
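
The readiness claim in the message above can be checked mechanically once Flannel is applied. The following is a minimal sketch, not code from this repository: it assumes the upstream `kube-flannel.yml` defaults (namespace `kube-flannel`, DaemonSet `kube-flannel-ds`) and a working kubeconfig on `cp-1`. The function name `wait_for_cluster_ready` and the default timeout are illustrative.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper (not part of this repo): block until the Flannel
# DaemonSet has rolled out and every node reports Ready.
wait_for_cluster_ready() {
  local timeout="${1:-300s}"
  # Assumes the namespace and DaemonSet names used by the upstream manifest.
  kubectl -n kube-flannel rollout status daemonset/kube-flannel-ds --timeout="$timeout"
  # Nodes only become Ready once a CNI plugin is installed and working.
  kubectl wait --for=condition=Ready nodes --all --timeout="$timeout"
}
```

Calling `wait_for_cluster_ready 120s` right after the `kubectl apply` step would make the bootstrap fail fast if the CNI never comes up, instead of surfacing the problem later during worker joins.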
@@ -50,7 +50,7 @@ sudo nixos-rebuild switch --flake .#cp-1
 For remote target-host workflows, use your preferred deploy wrapper later
 (`nixos-rebuild --target-host ...` or deploy-rs/colmena).
 
-## Bootstrap runbook (kubeadm + kube-vip + Cilium)
+## Bootstrap runbook (kubeadm + kube-vip + Flannel)
 
 1. Apply Nix config on all nodes (`cp-*`, then `wk-*`).
 2. On `cp-1`, run:
@@ -62,14 +62,10 @@ sudo th-kubeadm-init
 This infers the control-plane VIP as `<node-subnet>.250` on `eth0`, creates the
 kube-vip static pod manifest, and runs `kubeadm init`.
 
-3. Install Cilium from `cp-1`:
+3. Install Flannel from `cp-1`:
 
 ```bash
-helm repo add cilium https://helm.cilium.io
-helm repo update
-helm upgrade --install cilium cilium/cilium \
---namespace kube-system \
---set kubeProxyReplacement=true
+kubectl apply -f https://raw.githubusercontent.com/flannel-io/flannel/v0.25.5/Documentation/kube-flannel.yml
 ```
 
 4. Generate join commands on `cp-1`:
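
The `<node-subnet>.250` rule quoted in the hunk above can be sketched in a few lines of shell. This is an illustration only, since the actual `th-kubeadm-init` implementation is not shown in this diff. It assumes a /24-style layout where the first three octets identify the node subnet. The helper names `infer_vip` and `node_addr` are hypothetical.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: map a node address such as 10.27.27.11/24 to
# <node-subnet>.250. Assumes the first three octets identify the subnet.
infer_vip() {
  local addr="${1%%/*}"   # drop an optional CIDR suffix
  if [[ ! "$addr" =~ ^([0-9]{1,3}\.){3}[0-9]{1,3}$ ]]; then
    echo "not an IPv4 address: $1" >&2
    return 1
  fi
  echo "${addr%.*}.250"   # replace the last octet with 250
}

# Print the first IPv4 address (with prefix length) on an interface.
node_addr() {
  ip -4 -o addr show dev "${1:-eth0}" | awk '{print $4; exit}'
}
```

On a node, `infer_vip "$(node_addr eth0)"` would print the candidate VIP. Because the rule is pure string manipulation, it only holds while every control-plane node sits in the same /24 and `.250` is kept out of the static IP range.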
@@ -98,7 +94,7 @@ kubectl get nodes -o wide
 kubectl -n kube-system get pods -o wide
 ```
 
-## Repeatable rebuild flow (recommended)
+## Fresh bootstrap flow (recommended)
 
 1. Copy and edit inventory:
 
@@ -107,7 +103,7 @@ cp ./scripts/inventory.example.env ./scripts/inventory.env
 $EDITOR ./scripts/inventory.env
 ```
 
-2. Rebuild all nodes and bootstrap/reconcile cluster:
+2. Rebuild all nodes and bootstrap a fresh cluster:
 
 ```bash
 ./scripts/rebuild-and-bootstrap.sh
@@ -141,15 +137,15 @@ For a full nuke/recreate lifecycle:
 - run Terraform destroy/apply for VMs first,
 - then run `./scripts/rebuild-and-bootstrap.sh` again.
 
-Node lists are discovered from Terraform outputs, so adding new workers/control
-planes in Terraform is picked up automatically by the bootstrap/reconcile flow.
+Node lists now come directly from static Terraform outputs, so bootstrap no longer
+depends on Proxmox guest-agent IP discovery or SSH subnet scanning.
 
 ## Optional Gitea workflow automation
 
 Primary flow:
 
 - Push to `master` triggers `.gitea/workflows/terraform-apply.yml`
-- That workflow now does Terraform apply and then runs kubeadm rebuild/bootstrap reconciliation automatically
+- That workflow now does Terraform apply and then runs a fresh kubeadm bootstrap automatically
 
 Manual dispatch workflows are available:
 
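
As a rough illustration of what "node lists come directly from static Terraform outputs" can look like in a script, the sketch below reads the outputs named in this README (`control_plane_vm_ipv4`, `worker_vm_ipv4`) and renders numbered entries. The output shape (a JSON list or a name-to-IP map), the use of `jq`, the `PREFIX_N=ip` format, and the helper names `tf_ips` and `render_nodes` are all assumptions. The real `inventory.env` keys may differ.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Read one Terraform output and print its IPs, one per line.
# Assumption: the output is a JSON list or a name->IP map; jq's `.[]`
# yields the values of either shape.
tf_ips() {
  terraform output -json "$1" | jq -r '.[]'
}

# Hypothetical inventory format: PREFIX_1=ip, PREFIX_2=ip, ...
render_nodes() {
  local prefix="$1" i=1 ip
  shift
  for ip in "$@"; do
    printf '%s_%d=%s\n' "$prefix" "$i" "$ip"
    i=$((i + 1))
  done
}

# Example wiring (requires Terraform state and jq, so left commented out):
#   mapfile -t cps < <(tf_ips control_plane_vm_ipv4)
#   mapfile -t wks < <(tf_ips worker_vm_ipv4)
#   render_nodes CONTROL_PLANE "${cps[@]}"
#   render_nodes WORKER "${wks[@]}"
```

Keeping the rendering step free of network calls is the point of the change: the node list is fully determined by Terraform state, so a bootstrap run cannot pick up a stale or guessed address.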
@@ -164,9 +160,7 @@ Required repository secrets:
 Optional secrets:
 
 - `KUBEADM_SSH_USER` (defaults to `micqdf`)
-- `KUBEADM_SUBNET_PREFIX` (optional, e.g. `10.27.27`; used for SSH-based IP discovery fallback)
-
-Node IPs are auto-discovered from Terraform state outputs (`control_plane_vm_ipv4`, `worker_vm_ipv4`), so you do not need per-node IP secrets.
+Node IPs are rendered directly from static Terraform outputs (`control_plane_vm_ipv4`, `worker_vm_ipv4`), so you do not need per-node IP secrets or SSH discovery fallbacks.
 
 ## Notes
 
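
With `KUBEADM_SUBNET_PREFIX` gone, `KUBEADM_SSH_USER` is the only optional secret left in this hunk. A minimal sketch of how its documented default (`micqdf`) could be resolved in a workflow script follows. The helper name `ssh_target` is hypothetical and the repository's scripts may handle this differently.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: build an SSH target for a node IP, falling back to
# the documented default user when the secret is unset or empty.
ssh_target() {
  local ip="$1"
  # ${VAR:-default} covers both "unset" and "set but empty"; an undefined
  # CI secret often arrives as an empty string rather than being absent.
  echo "${KUBEADM_SSH_USER:-micqdf}@${ip}"
}
```

For example, `ssh "$(ssh_target 10.27.27.11)" true` would connect as `micqdf` unless the secret overrides it.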