# Stable Private-Only Baseline

This document defines the current engineering target for this repository.

## Topology

- 3 control planes (HA etcd cluster)
- 3 workers
- Hetzner Load Balancer for Kubernetes API
- private Hetzner network
- Tailscale operator access
- Rancher UI exposed only through Tailscale (`rancher.silverside-gopher.ts.net`)

## In Scope

- Terraform infrastructure bootstrap
- Ansible k3s bootstrap with external cloud provider
- **HA control plane (3 nodes with etcd quorum)**
- **Hetzner Load Balancer for Kubernetes API**
- **Hetzner CCM deployed via Ansible (before workers join)**
- **Hetzner CSI for persistent volumes (via Flux)**
- Flux core reconciliation
- External Secrets Operator with Doppler
- Tailscale private access
- Persistent volume provisioning validated

## Deferred for Later Phases

- Observability stack (deferred - complex helm release needs separate debugging)

## Out of Scope

- public ingress or DNS
- public TLS
- app workloads
- DR / backup strategy
- upgrade strategy

## Phase Gates

1. Terraform apply completes for HA topology (3 CP, 3 workers, 1 LB).
2. Load Balancer is healthy with all 3 control plane targets.
3. Primary control plane bootstraps with `--cluster-init`.
4. Secondary control planes join via Load Balancer endpoint.
5. **CCM deployed via Ansible before workers join** (fixes uninitialized taint issue).
6. Workers join successfully via Load Balancer and all nodes show proper `providerID`.
7. etcd reports 3 healthy members.
8. Flux source and infrastructure reconciliation are healthy.
9. **CSI deploys and creates `hcloud-volumes` StorageClass**.
10. **PVC provisioning tested and working**.
11. External Secrets sync required secrets.
12. Tailscale private access works, including Rancher UI access.
13. Terraform destroy succeeds cleanly or via workflow retry.

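
Gates 5 and 6 can be checked mechanically from the node list. The following is a minimal sketch, not part of this repository: it assumes the input is the parsed output of `kubectl get nodes -o json`, and the function name `node_gate_failures` and the default of 6 expected nodes (3 control planes plus 3 workers) are illustrative choices. The taint key is the one Kubernetes applies to nodes that are waiting for an external cloud controller manager.

```python
# Taint set on nodes started with an external cloud provider; the cloud
# controller manager removes it once it has initialized the node.
UNINITIALIZED_TAINT = "node.cloudprovider.kubernetes.io/uninitialized"


def node_gate_failures(node_list: dict, expected_nodes: int = 6) -> list[str]:
    """Return failure messages for gates 5 and 6; an empty list means pass.

    `node_list` is assumed to be the parsed JSON from
    `kubectl get nodes -o json`. The name and signature are hypothetical.
    """
    failures = []
    items = node_list.get("items", [])
    if len(items) != expected_nodes:
        failures.append(f"expected {expected_nodes} nodes, found {len(items)}")
    for node in items:
        name = node["metadata"]["name"]
        spec = node.get("spec", {})
        # Gate 6: every node must have a providerID assigned.
        if not spec.get("providerID"):
            failures.append(f"{name}: providerID is not set")
        # Gate 5: the uninitialized taint must be gone on every node.
        taint_keys = {t.get("key") for t in spec.get("taints") or []}
        if UNINITIALIZED_TAINT in taint_keys:
            failures.append(f"{name}: still carries the uninitialized taint")
    return failures


# Illustrative data only; node names and providerID values are made up.
sample = {
    "items": [
        {"metadata": {"name": "cp-1"}, "spec": {"providerID": "hcloud://1"}},
        {
            "metadata": {"name": "worker-1"},
            "spec": {"taints": [{"key": UNINITIALIZED_TAINT, "effect": "NoSchedule"}]},
        },
    ]
}
for line in node_gate_failures(sample, expected_nodes=2):
    print(line)
```

With the sample above, only `worker-1` is reported, once for the missing `providerID` and once for the taint. In practice the dictionary could be built with `json.load` from the piped `kubectl` output.
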
## Success Criteria

✅ **ACHIEVED** - HA Cluster with CCM/CSI:

- Build 1: Initial CCM/CSI deployment and validation (2026-03-23)
- Build 2: Full destroy/rebuild cycle successful (2026-03-23)

🔄 **IN PROGRESS** - HA Control Plane Validation:

- Build 3: Deploy 3-3 topology with Load Balancer
- Build 4: Destroy/rebuild to validate HA configuration

Success requires two consecutive HA rebuilds passing all phase gates with no manual fixes.
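
The success rule can be stated precisely as a small check over build records. This is a sketch under assumptions: the `BuildResult` fields, the function names, and the choice that any non-HA or failed build resets the streak are all hypothetical and not taken from this repository. The example records are placeholders, not real build outcomes.

```python
from dataclasses import dataclass

TOTAL_GATES = 13  # number of phase gates listed in this document


@dataclass(frozen=True)
class BuildResult:
    """Hypothetical record of one build; field names are illustrative."""
    name: str
    ha_topology: bool   # True for the 3 control plane / 3 worker layout
    gates_passed: int   # how many of the phase gates passed
    manual_fixes: int   # manual interventions needed during the build


def is_clean_ha_build(build: BuildResult) -> bool:
    """A build counts only if it is HA, passes every gate, and needed no fixes."""
    return (
        build.ha_topology
        and build.gates_passed == TOTAL_GATES
        and build.manual_fixes == 0
    )


def success_reached(builds: list[BuildResult], required: int = 2) -> bool:
    """True once `required` clean HA builds occur back to back."""
    streak = 0
    for build in builds:
        # Assumption: any build that is not a clean HA build resets the streak.
        streak = streak + 1 if is_clean_ha_build(build) else 0
        if streak >= required:
            return True
    return False


# Placeholder history, not actual results from this repository.
history = [
    BuildResult("example-a", ha_topology=True, gates_passed=13, manual_fixes=0),
    BuildResult("example-b", ha_topology=True, gates_passed=13, manual_fixes=1),
    BuildResult("example-c", ha_topology=True, gates_passed=13, manual_fixes=0),
]
print(success_reached(history))
```

The placeholder history does not satisfy the rule, because the manual fix in the middle build breaks the streak of consecutive clean rebuilds.
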