feat: migrate cluster baseline from Hetzner to Proxmox

Replace Hetzner infrastructure and cloud-provider assumptions with Proxmox VM clones, kube-vip API HA, and NFS-backed storage. Update bootstrap, Flux addons, CI workflows, and docs to target the new private Proxmox baseline while preserving the existing Tailscale, Doppler, Flux, Rancher, and B2 backup flows.
+12 -14 (12 additions, 14 deletions)
@@ -5,9 +5,9 @@ This document defines the current engineering target for this repository.
 ## Topology
 
 - 3 control planes (HA etcd cluster)
-- 3 workers
-- Hetzner Load Balancer for Kubernetes API
-- private Hetzner network
+- 5 workers
+- kube-vip API VIP (`10.27.27.40`)
+- private Proxmox/LAN network (`10.27.27.0/24`)
 - Tailscale operator access and service exposure
 - Rancher exposed through Tailscale (`rancher.silverside-gopher.ts.net`)
 - Grafana exposed through Tailscale (`grafana.silverside-gopher.ts.net`)
@@ -17,11 +17,10 @@ This document defines the current engineering target for this repository.
 ## In Scope
 
 - Terraform infrastructure bootstrap
-- Ansible k3s bootstrap with external cloud provider
+- Ansible k3s bootstrap on Ubuntu cloud-init VMs
 - **HA control plane (3 nodes with etcd quorum)**
-- **Hetzner Load Balancer for Kubernetes API**
-- **Hetzner CCM deployed via Ansible (before workers join)**
-- **Hetzner CSI for persistent volumes (via Flux)**
+- **kube-vip for Kubernetes API HA**
+- **NFS-backed persistent volumes via `nfs-subdir-external-provisioner`**
 - Flux core reconciliation
 - External Secrets Operator with Doppler
 - Tailscale private access and smoke-check validation
@@ -45,15 +44,14 @@ This document defines the current engineering target for this repository.
 
 ## Phase Gates
 
-1. Terraform apply completes for HA topology (3 CP, 3 workers, 1 LB).
-2. Load Balancer is healthy with all 3 control plane targets.
-3. Primary control plane bootstraps with `--cluster-init`.
-4. Secondary control planes join via Load Balancer endpoint.
-5. **CCM deployed via Ansible before workers join** (fixes uninitialized taint issue).
-6. Workers join successfully via Load Balancer and all nodes show proper `providerID`.
+1. Terraform apply completes for HA topology (3 CP, 5 workers, 1 VIP).
+2. Primary control plane bootstraps with `--cluster-init`.
+3. kube-vip advertises `10.27.27.40:6443` from the control-plane set.
+4. Secondary control planes join via the kube-vip endpoint.
+5. Workers join successfully via the kube-vip endpoint.
 7. etcd reports 3 healthy members.
 8. Flux source and infrastructure reconciliation are healthy.
-9. **CSI deploys and creates `hcloud-volumes` StorageClass**.
+9. **NFS provisioner deploys and creates `flash-nfs` StorageClass**.
 10. **PVC provisioning tested and working**.
 11. External Secrets sync required secrets.
 12. Tailscale private access works for Rancher, Grafana, and Prometheus.
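A few of the new phase gates can be sanity-checked before any node is touched. The sketch below is only an illustration and is not part of this commit: it assumes the VIP, subnet, and port quoted in the diff, and the helper names (`vip_in_subnet`, `etcd_quorum`, `api_reachable`) are hypothetical. It checks that the kube-vip address sits inside the LAN subnet, computes the etcd quorum for the 3-member control plane, and offers a plain TCP probe for the API endpoint.

```python
import ipaddress
import socket

API_VIP = "10.27.27.40"      # kube-vip API VIP from the Topology section
LAN_CIDR = "10.27.27.0/24"   # private Proxmox/LAN network
API_PORT = 6443              # port named in phase gate 3


def vip_in_subnet(vip: str, cidr: str) -> bool:
    """Return True if the VIP is an address inside the given network."""
    return ipaddress.ip_address(vip) in ipaddress.ip_network(cidr)


def etcd_quorum(members: int) -> int:
    """Minimum number of healthy members etcd needs to keep quorum."""
    return members // 2 + 1


def api_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Hypothetical helper: True if a TCP connection to host:port succeeds.

    This only proves that something accepts connections on the VIP; it does
    not verify that the Kubernetes API itself is healthy.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling `vip_in_subnet(API_VIP, LAN_CIDR)` guards against a VIP that was mistyped outside the LAN range, and `etcd_quorum(3)` makes explicit that the cluster tolerates the loss of one control plane. `api_reachable(API_VIP, API_PORT)` is only meaningful from a host on the private network or with a route to it.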